keras: How do I normalize the data in a Python array column-wise using SKLearn?

8ulbf1ek  posted on 2023-03-08  in  Python

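I'm writing a machine learning algorithm with Keras, and I need to normalize my data before feeding it in. I have 3 inputs organized into a 2D array, with each column making up one input.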

    import tensorflow as tf
    import keras
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense, Activation, Dropout
    import matplotlib.pyplot as plt
    from sklearn.preprocessing import MinMaxScaler
    #Importing all the required modules

    raw_data = np.array([]) #Defining numpy array for training data
    val_data = np.array([]) #Defining numpy array for validation data
    test = np.array([]) #Defining numpy array for test data
    rawfilepath = r'C:\Users\***\Desktop\***\Unprocessed_Data_For_Training.txt'
    valfilepath = r'C:\Users\***\Desktop\***\Unprocessed_Data_For_Validation.txt'
    testfilepath = r'C:\Users\***\Desktop\***\h4t6usedforprediction.txt' #Filepaths 
    raw_data = np.loadtxt(rawfilepath)
    val_data = np.loadtxt(valfilepath)
    test = np.loadtxt(testfilepath) #Loading contents of text files into their respective arrays
    X = raw_data[:, 1:4] #Splitting the data, X contains the coordinate position, initial shear and initial  
    Y = raw_data[:, 0] #Splitting the data, Y contains the measured height
    X_Val = val_data[:, 1:4]
    Y_Val = val_data[:, 0]
    X_test = test[:, 1:4]
    Y_test = test[:, 0]
    scalar = MinMaxScaler()
    #print(X_test)
    #print(Y_test)
    print(X)
    print(Y)

    scaler = MinMaxScaler()
    Xnorm = scaler.fit_transform(X) 
    Ynorm = scaler.fit_transform(Y.reshape(-1,1))
    Xvalnorm = scaler.fit_transform(X_Val)
    Yvalnorm = scaler.fit_transform(Y_Val.reshape(-1,1))
    Xtestnorm = scaler.fit_transform(X_test)
    Ytestnorm = scaler.fit_transform(Y_test.reshape(-1,1))

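The Y variable normalizes fine, but I think the X variables are being normalized over the whole array rather than column by column.
These are the inputs the model uses to make its predictions.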

X=[0.94941569 0.         0.        ], Predicted=[0.02409407]
X=[0.95664225 0.         0.        ], Predicted=[0.02374389]
X=[0.93496738 0.         0.        ], Predicted=[0.02480936]
X=[0.94219233 0.         0.        ], Predicted=[0.02444912]
X=[0.92774402 0.         0.        ], Predicted=[0.02517468]
X=[0.92052067 0.         0.        ], Predicted=[0.02554525]
X=[0.91329892 0.         0.        ], Predicted=[0.02592104]
X=[0.90607877 0.         0.        ], Predicted=[0.02630214]
X=[0.89885863 0.         0.        ], Predicted=[0.02668868]
X=[0.89163848 0.         0.        ], Predicted=[0.02708073]
X=[0.88441994 0.         0.        ], Predicted=[0.0274783]
X=[0.87720299 0.         0.        ], Predicted=[0.02788144]
qc6wkl3g  1#

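Let's take this in parts:
1 - If X and Y are your train set, then calling fit_transform on them is correct. But you must not call fit_transform again on your validation and test sets. Instead, call transform on them with the scaler that was fitted on the training data: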

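# Use one scaler per array: refitting a single scaler on Y would overwrite what it learned from X.
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
Xnorm = x_scaler.fit_transform(X)  # fit on the training data only
Ynorm = y_scaler.fit_transform(Y.reshape(-1, 1))
Xvalnorm = x_scaler.transform(X_Val)  # reuse the training min/max
Yvalnorm = y_scaler.transform(Y_Val.reshape(-1, 1))
Xtestnorm = x_scaler.transform(X_test)
Ytestnorm = y_scaler.transform(Y_test.reshape(-1, 1))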
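To illustrate point 1, here is a minimal sketch of what changes when a scaler is refitted on the test set. The arrays are made-up stand-ins, not your data, and I'm assuming (I can't see your files) that the last two columns of your test set are constant. If so, that would be one possible explanation for the zeros in the inputs you posted, since a refitted MinMaxScaler maps a constant column to 0.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in data: 3 feature columns, like X in the question.
X_train = np.array([[0.0, 10.0, 100.0],
                    [5.0, 20.0, 300.0],
                    [10.0, 30.0, 500.0]])
# Assumption: the last two columns are constant in the test set.
X_test = np.array([[2.0, 20.0, 300.0],
                   [4.0, 20.0, 300.0],
                   [8.0, 20.0, 300.0]])

x_scaler = MinMaxScaler()
x_scaler.fit(X_train)  # learn per-column min and max from the training data only

# Reusing the training statistics keeps the constant columns meaningful.
test_from_train_fit = x_scaler.transform(X_test)

# Refitting on the test set rescales each column by the test set's own range;
# a constant column has zero range and ends up as 0.
test_refit = MinMaxScaler().fit_transform(X_test)

print(test_from_train_fit)
print(test_refit)
```

With the training-fitted scaler, the constant columns keep a value that reflects where they sit in the training range. With the refitted scaler, they collapse to 0 and the model sees inputs on a different scale from the one it was trained on.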

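2 - I'm assuming the X values you posted at the end are already the values you got from the normalization, so I created my_X just to show how to normalize some data with sklearn: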

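import numpy as np
from sklearn.preprocessing import MinMaxScaler

my_X = np.array([[-3, 2, 4], [-6, 4, 1], [0, 10, 15], [12, 18, 31]])
scaler = MinMaxScaler()
print(scaler.fit_transform(my_X))  # each column is scaled to [0, 1] on its own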

Just replace the values in my_X with the values in your X.
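If you want to convince yourself that MinMaxScaler works column by column rather than over the whole array, a quick check along these lines should do it. This is only a sketch: it compares the scaler's output against the min-max formula applied by hand to each column, and the names col_min, col_max and manual are mine.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

my_X = np.array([[-3, 2, 4], [-6, 4, 1], [0, 10, 15], [12, 18, 31]], dtype=float)

scaler = MinMaxScaler()
scaled = scaler.fit_transform(my_X)

# Manual per-column min-max scaling: axis=0 takes the min/max down each column.
col_min = my_X.min(axis=0)
col_max = my_X.max(axis=0)
manual = (my_X - col_min) / (col_max - col_min)

print(scaler.data_min_)  # per-column minimums learned by the scaler
print(scaler.data_max_)  # per-column maximums learned by the scaler
print(np.allclose(scaled, manual))  # True if the scaler matches the per-column formula
```

Here data_min_ and data_max_ each hold one value per column, which is what you would expect from column-wise scaling.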
