tensorflow: How can I use the relationship in one data series to predict the outcome of a second data series?

Posted by qvtsj1bj on 2023-08-06 in Other


I have two datasets, series 1 and series 2. Series 1 is complete and series 2 is incomplete. I want to use series 1 to predict the missing values in series 2.

import numpy as np

data = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
data2 = [100, 200, 400, 800, 1600, 3200, 6400, 12800, np.nan, np.nan]

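Note that there is no relationship between series 1 and series 2.
I have tried using TensorFlow and Scikit-learn to predict the missing values in series 2, but I have not been able to get good results. We can use an LSTM and build sequences of 8 inputs and 2 outputs from series 1, but when I apply the same approach to the second series, the learned relationship holds only for the first series, so the predicted values are incorrect. Scikit-learn needs both series in order to fit a regression.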

import numpy as np
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

# define the input time series
data= [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

# define the lookback window and output window
lookback_window = 8
output_window = 2

# create a StandardScaler object to scale the data
scaler = StandardScaler()

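# preprocess_data was not included in the original post; the helper below is an
# assumed sliding-window implementation that pairs each input window with the
# values that immediately follow it
def preprocess_data(series, lookback, horizon):
    series = np.asarray(series).ravel()
    X, y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback:i + lookback + horizon])
    return np.array(X), np.array(y)

# preprocess and scale the data
scaled_data = scaler.fit_transform(np.array(data).reshape(-1, 1))
X, y = preprocess_data(scaled_data, lookback_window, output_window)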

# define and train the LSTM model
model = Sequential()
model.add(LSTM(32, input_shape=(lookback_window, 1)))
model.add(Dense(output_window))
model.compile(optimizer='adam', loss='mse')
X = X.reshape((X.shape[0], X.shape[1], 1))
model.fit(X, y, epochs=100, verbose=0)

# define the new time series
data1 = [100, 200, 400, 800, 1600, 3200, 6400, 12800]

# preprocess and scale the new data
scaled_new_data = scaler.transform(np.array(data1).reshape(-1, 1))
new_X = np.array(scaled_new_data[-lookback_window:]).reshape((1, lookback_window, 1))

# make predictions for the next two values (of the new series)
prediction = scaler.inverse_transform(model.predict(new_X))
print(prediction)

Unsurprisingly, the predicted output is far off: 544.34814 1022.2846
How can I use machine learning to solve this problem?

tensorflow

Source: https://stackoverflow.com/questions/76676332/how-can-i-use-the-relationship-one-data-series-to-predict-the-outcome-of-a-secon

1 Answer


gstyhher #1

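You provide a single training example (x = [2, 4, 8, 16, 32, 64, 128, 256], y = [512, 1024]) and repeat it for 100 epochs, so the network will probably learn to always output the same answer.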
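To make the point about the single training example concrete, here is a minimal sketch that counts the windows. The helper name `sliding_windows` is hypothetical (the question's own `preprocess_data` was never shown), and it assumes a plain sliding-window split.

```python
import numpy as np

def sliding_windows(series, lookback, horizon):
    """Split a 1-D series into (input window, target window) pairs.

    Hypothetical helper; assumes each target directly follows its input window.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lookback - horizon + 1
    X = np.array([series[i:i + lookback] for i in range(n_samples)])
    y = np.array([series[i + lookback:i + lookback + horizon]
                  for i in range(n_samples)])
    return X, y

data = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
X, y = sliding_windows(data, lookback=8, horizon=2)

# 10 values with an 8-step input and a 2-step target leave room for one window
print(X.shape, y.shape)  # (1, 8) (1, 2)
```

With only ten values, a lookback of 8 plus an output of 2 uses up the whole series, so every epoch sees the same pair.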
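To keep your model from simply memorizing, you must increase the number of samples in the training set or reduce the number of parameters in the model (why 32 LSTM units?). Probably both.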
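For a rough sense of scale, the sketch below counts the trainable parameters of the question's architecture. It assumes the standard LSTM layout with bias terms (four gates, each with input weights, recurrent weights, and a bias), which is what the Keras defaults use; the function names are illustrative only.

```python
def lstm_param_count(units, input_dim):
    # four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(inputs, outputs):
    # one weight per input/output pair plus one bias per output
    return inputs * outputs + outputs

# compare the question's 32 units with a much smaller layer
for units in (32, 2):
    total = lstm_param_count(units, input_dim=1) + dense_param_count(units, 2)
    print(f"LSTM({units}) + Dense(2): {total} parameters")
```

Under these assumptions the question's model has several thousand parameters being fitted to one training sample, which is why memorization is the likely outcome.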
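If you can assume that each value in the series is always the result of a linear operation on the previous value, you may be better off dropping the LSTM and using a simple linear autoregressive model.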
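A minimal sketch of that idea, assuming a first-order model x[t] = a * x[t-1] + b and assuming that the rule fitted on series 1 also governs series 2 (an assumption the data here happens to satisfy, not something guaranteed in general). The helper names `fit_ar1` and `forecast` are hypothetical.

```python
import numpy as np

def fit_ar1(series):
    """Least-squares fit of x[t] = a * x[t-1] + b."""
    x = np.asarray(series, dtype=float)
    # design matrix: previous value and a constant column for the intercept
    A = np.column_stack([x[:-1], np.ones(len(x) - 1)])
    (a, b), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
    return a, b

def forecast(last_value, a, b, steps):
    """Apply the fitted rule repeatedly, starting from the last known value."""
    out = []
    value = last_value
    for _ in range(steps):
        value = a * value + b
        out.append(value)
    return out

data = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]            # series 1
data2_known = [100, 200, 400, 800, 1600, 3200, 6400, 12800]  # series 2, known part

a, b = fit_ar1(data)                                # learn the rule from series 1
pred = forecast(data2_known[-1], a, b, steps=2)     # apply it to series 2
print(a, b, pred)
```

On this data the fit comes out at roughly a = 2 and b = 0, so the two missing values are predicted as approximately 25600 and 51200. The model has two parameters, so nine training pairs are more than enough to determine it.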
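If you cannot make any assumption about the pattern of the series, then it is impossible to train a decent machine learning model, even if you give it many examples (training and test data must come from the same probability distribution, and the training data must be representative enough for the model to learn that distribution).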

2023-08-06
