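I'm writing a machine learning algorithm in Keras, and I need to normalize my data before feeding it in. I have 3 inputs organized as a 2D array, with each column making up one input.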
import tensorflow as tf
import keras
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
#Importing all the required modules
raw_data = np.array([]) #Defining numpy array for training data
val_data = np.array([]) #Defining numpy array for validation data
test = np.array([]) #Defining numpy array for test data
rawfilepath = r'C:\Users\***\Desktop\***\Unprocessed_Data_For_Training.txt'
valfilepath = r'C:\Users\***\Desktop\***\Unprocessed_Data_For_Validation.txt'
testfilepath = r'C:\Users\***\Desktop\***\h4t6usedforprediction.txt' #Filepaths
raw_data = np.loadtxt(rawfilepath)
val_data = np.loadtxt(valfilepath)
test = np.loadtxt(testfilepath) #Loading contents of text files into their respective arrays
X = raw_data[:, 1:4] #Splitting the data, X contains the coordinate position, initial shear and initial
Y = raw_data[:, 0] #Splitting the data, Y contains the measured height
X_Val = val_data[:, 1:4]
Y_Val = val_data[:, 0]
X_test = test[:, 1:4]
Y_test = test[:, 0]
scalar = MinMaxScaler()
#print(X_test)
#print(Y_test)
print(X)
print(Y)
scaler = MinMaxScaler()
Xnorm = scaler.fit_transform(X)
Ynorm = scaler.fit_transform(Y.reshape(-1,1))
Xvalnorm = scaler.fit_transform(X_Val)
Yvalnorm = scaler.fit_transform(Y_Val.reshape(-1,1))
Xtestnorm = scaler.fit_transform(X_test)
Ytestnorm = scaler.fit_transform(Y_test.reshape(-1,1))
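The Y variable normalizes fine, but I think the X variables are being normalized over the whole array rather than column by column.
These are the inputs the model used to make its predictions: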
X=[0.94941569 0. 0. ], Predicted=[0.02409407]
X=[0.95664225 0. 0. ], Predicted=[0.02374389]
X=[0.93496738 0. 0. ], Predicted=[0.02480936]
X=[0.94219233 0. 0. ], Predicted=[0.02444912]
X=[0.92774402 0. 0. ], Predicted=[0.02517468]
X=[0.92052067 0. 0. ], Predicted=[0.02554525]
X=[0.91329892 0. 0. ], Predicted=[0.02592104]
X=[0.90607877 0. 0. ], Predicted=[0.02630214]
X=[0.89885863 0. 0. ], Predicted=[0.02668868]
X=[0.89163848 0. 0. ], Predicted=[0.02708073]
X=[0.88441994 0. 0. ], Predicted=[0.0274783]
X=[0.87720299 0. 0. ], Predicted=[0.02788144]
1 Answer
qc6wkl3g1#
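Let's take this in parts:
1 - If X and Y are your training set, then calling fit_transform on them is correct. But you must not call fit_transform again on your validation and test sets. You have to call transform on them using the scaler you already fitted on the training data.
2 - I assume the X values you posted at the end are already the values you got out of the normalization, so I created my_X just to illustrate how to normalize some data with sklearn. Just replace the values of my_X with the values in X.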
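Both points can be sketched in one short example. This is only an illustration under stated assumptions: the numbers in my_X, my_Y, my_X_test and my_Y_test are made up, and the names x_scaler and y_scaler are my own choice (separate scalers for inputs and targets, so that fitting one does not overwrite the other). MinMaxScaler scales each column independently. If the second and third columns happen to be constant within the test file (an assumption, I cannot see the data), refitting the scaler on the test set maps them to 0, which would match the zeros in the output above.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 3 input columns on very different scales.
my_X = np.array([[1.0, 100.0, 0.5],
                 [2.0, 300.0, 0.1],
                 [3.0, 200.0, 0.9]])
my_Y = np.array([10.0, 20.0, 30.0])

# Hypothetical test data: columns 2 and 3 are constant within this set.
my_X_test = np.array([[1.5, 200.0, 0.5],
                      [2.5, 200.0, 0.5]])
my_Y_test = np.array([15.0, 25.0])

# Separate scalers for inputs and targets (an illustrative choice).
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

# Fit on the training set only; each column is scaled to [0, 1] on its own.
Xnorm = x_scaler.fit_transform(my_X)
Ynorm = y_scaler.fit_transform(my_Y.reshape(-1, 1))

# Reuse the fitted scalers on the test set: transform, not fit_transform.
Xtestnorm = x_scaler.transform(my_X_test)
Ytestnorm = y_scaler.transform(my_Y_test.reshape(-1, 1))

# For contrast: refitting on the test set collapses the constant columns to 0.
Xtest_refit = MinMaxScaler().fit_transform(my_X_test)

# Scaled values can be mapped back to the original units with the same scaler.
Y_back = y_scaler.inverse_transform(Ytestnorm)

print(Xnorm)
print(Xtestnorm)
print(Xtest_refit)
```

The validation set would be handled the same way as the test set here: x_scaler.transform(X_Val) and y_scaler.transform(Y_Val.reshape(-1, 1)).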