Masking in Keras LSTM with variable-length input does not work

Posted by cld4siwp on 2022-11-13 in Other


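I am building an LSTM model whose inputs are arrays of variable length. Many references suggest padding, i.e. inserting 0s until all input arrays have the same length, and then applying masking so that the model ignores the 0s.
However, after many training runs, I have the impression that Masking does not work as expected and that the padded 0s still affect the model's predictive ability.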
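
As a minimal sketch of the padding-and-masking idea described above, the plain-Python snippet below pre-pads sequences with 0 and then derives which timesteps a mask keyed on 0 would skip. The helpers `pad_pre` and `timestep_mask` are hypothetical names introduced for illustration. The masking rule only emulates what the Keras documentation describes for `Masking(mask_value=0.0)` when each timestep holds a single feature; it is not the library's code.

```python
# Plain-Python sketch; pad_pre and timestep_mask are hypothetical helpers,
# not part of Keras.

def pad_pre(seq, maxlen, value=0):
    """Left-pad seq with `value` up to maxlen, dropping leading items if too long."""
    kept = list(seq)[-maxlen:]  # keep the last maxlen items ('pre' truncation)
    return [value] * (maxlen - len(kept)) + kept


def timestep_mask(padded, mask_value=0):
    """True where a timestep carries data, False where it equals mask_value."""
    return [step != mask_value for step in padded]


raw = [[1, 2, 3, 4, 5, 6], [5, 6, 7], [11, 12, 13, 14]]
padded = [pad_pre(seq, maxlen=6) for seq in raw]
masks = [timestep_mask(seq) for seq in padded]

for seq, mask in zip(padded, masks):
    print(seq, mask)
```

Note that under this rule a genuine 0 in the data would be skipped as well, since the mask cannot tell it apart from padding.
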
After concatenating all sequences into one array, my training data looks like this, **without padding**:

X           y
[1 2 3]     4
[2 3 4]     5
[3 4 5]     6
...
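
The X/y pairs above can be reproduced by sliding a window of three values over one unpadded sequence and taking the next value as the target. The sketch below does this in plain Python; `make_windows` is a hypothetical helper written for illustration only.

```python
def make_windows(seq, timesteps):
    """Return (window, next_value) pairs for a single sequence."""
    return [
        (seq[i:i + timesteps], seq[i + timesteps])
        for i in range(len(seq) - timesteps)
    ]


pairs = make_windows([1, 2, 3, 4, 5, 6], timesteps=3)
for x, y in pairs:
    print(x, "=>", y)
```

A sequence with exactly three values, such as [5, 6, 7], yields no pair with this scheme because there is no next value to use as the target.
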

My Python implementation:

import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Masking
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

""" Raw Training Input """
arr = np.array([
    [1, 2, 3, 4, 5, 6],
    [5, 6, 7],
    [11, 12, 13, 14]
], dtype=object)

timesteps = 3
n_features = 1
maxlen = 6

""" Padding """
padded_arr = pad_sequences(arr, maxlen=maxlen, padding='pre', truncating='pre')
""" Concatenate all sequences into one array """
sequence = np.concatenate(padded_arr)
sequence = sequence.reshape((len(sequence), n_features))
# print(sequence)

""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps, batch_size=1)
""" Check Generator """
for i in range(len(generator)):
    x, y = generator[i]
    print('%s => %s' % (x, y))

""" Build Model """    
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(timesteps, n_features))) # masking to ignore padded 0
model.add(LSTM(1024, activation='relu', input_shape=(timesteps, n_features), return_sequences=False))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(generator, steps_per_epoch=1, epochs=1000, verbose=10)

""" Prediction """
x_input = np.array([2,3,4]).reshape((1, timesteps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat) # here I'm expecting 5 because the input is [2, 3, 4]

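For the prediction I feed in [2,3,4], and most of the time the value I get is far from the expected value (= 5).
I wonder whether I have missed something, or whether the model architecture is simply not tuned properly.
I would like to understand why the model does not predict correctly. Is it a masking problem or something else?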
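
One way to investigate is to list the samples the model is actually trained on. The sketch below rebuilds, in plain Python, the padded and concatenated sequence from the code above and enumerates the (input, target) windows of length 3. It assumes consecutive windows with a stride of 1 and the value right after each window as the target, which is assumed here to match `TimeseriesGenerator` with its default settings. It is an inspection aid, not a diagnosis.

```python
# Padded rows as produced by pre-padding the three raw sequences to length 6.
padded = [
    [1, 2, 3, 4, 5, 6],
    [0, 0, 0, 5, 6, 7],
    [0, 0, 11, 12, 13, 14],
]
timesteps = 3

# Concatenate all rows into one long sequence, as the question's code does.
sequence = [value for row in padded for value in row]

# Assumed windowing: consecutive windows, target is the value that follows.
samples = [
    (sequence[i:i + timesteps], sequence[i + timesteps])
    for i in range(len(sequence) - timesteps)
]

with_padding_in_x = [s for s in samples if 0 in s[0]]
with_padding_as_y = [s for s in samples if s[1] == 0]
clean = [s for s in samples if 0 not in s[0] and s[1] != 0]

print(len(samples), "samples in total")
print(len(with_padding_in_x), "have a padded 0 in the input window")
print(len(with_padding_as_y), "have a padded 0 as the target")
print(len(clean), "contain no padding at all:", clean)
```

Under that assumption, some windows span the boundary between two sequences and some targets are padded zeros. A mask on the inputs would not change the targets.
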

keras

Source: https://stackoverflow.com/questions/73762984/masking-in-lstm-with-variable-length-input-does-not-work

1 Answer


2jcobegt #1

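The problem is that the batch size is 1 and there is only one step per epoch, so each epoch updates the weights from a single sample and no meaningful gradient can be computed. Put all of the training data into one batch and you should get good results: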

""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps,
                            batch_size=15)

[Alternatively, you can keep the batch size at 1 and change steps_per_epoch to len(generator). This seems to work with the adam optimizer but not with SGD, and it is much slower.]
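
The arithmetic behind both options can be checked without Keras. The sketch below assumes 15 training samples (18 concatenated values minus a window length of 3, which matches batch_size=15 above) and counts how many samples contribute to the weight updates in one epoch. The helpers `batches_available` and `samples_seen_per_epoch` are hypothetical and simplify what a real training loop does.

```python
import math

n_samples = 18 - 3  # concatenated length minus window length (assumed)


def batches_available(n_samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(n_samples / batch_size)


def samples_seen_per_epoch(n_samples, batch_size, steps_per_epoch):
    """Samples used in one epoch, capped at the size of the data set."""
    return min(n_samples, batch_size * steps_per_epoch)


# Settings from the question: a single sample per epoch.
question = samples_seen_per_epoch(n_samples, batch_size=1, steps_per_epoch=1)

# Fix from the answer: one batch that holds all samples.
one_batch = samples_seen_per_epoch(n_samples, batch_size=15, steps_per_epoch=1)

# Alternative: batch size 1, with as many steps as there are batches.
steps = batches_available(n_samples, batch_size=1)
alternative = samples_seen_per_epoch(n_samples, batch_size=1, steps_per_epoch=steps)

print(question, one_batch, alternative, steps)
```

Both fixes cover all samples in each epoch. The alternative does so in 15 separate weight updates per epoch rather than one, which is consistent with the remark that it is slower.
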

Answered on 2022-11-13
