I have a bidirectional LSTM model, and I don't understand why, after the second model2.add(Bidirectional(LSTM(10, recurrent_dropout=0.2))), the result is two-dimensional, (None, 20), while the first bidirectional LSTM gives (None, 409, 20). Can anyone help me? Also, how can I add a self-attention layer to the model?
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.layers import SpatialDropout1D
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing.text import Tokenizer
from keras_self_attention import SeqSelfAttention  # from the keras-self-attention package

embedding_vector_length = 100
model2 = Sequential()
model2.add(Embedding(len(tokenizer.word_index) + 1, embedding_vector_length,
                     input_length=409))
model2.add(Bidirectional(LSTM(10, return_sequences=True, recurrent_dropout=0.2)))
model2.add(Dropout(0.4))
model2.add(Bidirectional(LSTM(10, recurrent_dropout=0.2)))
model2.add(SeqSelfAttention())  # the self-attention layer I am trying to add
#model.add(Dropout(dropout))
#model2.add(Dense(256, activation='relu'))
#model.add(Dropout(0.2))
model2.add(Dense(3, activation='softmax'))
model2.compile(loss='binary_crossentropy', optimizer='adam',
               metrics=['accuracy'])
print(model2.summary())
And the output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_23 (Embedding) (None, 409, 100) 1766600
_________________________________________________________________
bidirectional_12 (Bidirectio (None, 409, 20) 8880
_________________________________________________________________
dropout_8 (Dropout) (None, 409, 20) 0
_________________________________________________________________
bidirectional_13 (Bidirectio (None, 20) 2480
_________________________________________________________________
dense_15 (Dense) (None, 3) 63
=================================================================
Total params: 1,778,023
Trainable params: 1,778,023
Non-trainable params: 0
_________________________________________________________________
None
2 Answers
ruoxqz4g 1#
For the second Bidirectional LSTM, return_sequences is set to False by default, so the layer behaves many-to-one and returns only its final output, which is why the shape is (None, 20). If you want the output at every time step, just use model2.add(Bidirectional(LSTM(10, return_sequences=True, recurrent_dropout=0.2))).
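A minimal functional-API sketch (my own illustration, reusing the shapes from the model above) makes the difference visible:

from tensorflow.keras.layers import Input, LSTM, Bidirectional

x = Input(shape=(409, 100))  # like the Embedding output: (batch, time_steps, features)
seq = Bidirectional(LSTM(10, return_sequences=True))(x)  # one 20-dim vector per time step: (None, 409, 20)
last = Bidirectional(LSTM(10))(seq)                      # only the final state: (None, 20)
print(seq.shape, last.shape)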
For the attention mechanism in LSTMs, you can refer to this link and this link.
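As a concrete starting point, here is a hedged sketch of one way to wire in SeqSelfAttention from the keras-self-attention package (the attention_activation setting, the pooling layer, and the switch to categorical cross-entropy are my assumptions, not part of this answer). The key point is that self-attention needs a 3-D sequence, so the second Bi-LSTM must also keep return_sequences=True, and the time axis is pooled away before the classifier:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (LSTM, Dense, Dropout, Bidirectional,
                                     Embedding, GlobalAveragePooling1D)
from keras_self_attention import SeqSelfAttention  # pip install keras-self-attention

model2 = Sequential()
model2.add(Embedding(len(tokenizer.word_index) + 1, 100, input_length=409))
model2.add(Bidirectional(LSTM(10, return_sequences=True, recurrent_dropout=0.2)))
model2.add(Dropout(0.4))
# return_sequences=True here too, so the attention layer receives (None, 409, 20)
model2.add(Bidirectional(LSTM(10, return_sequences=True, recurrent_dropout=0.2)))
model2.add(SeqSelfAttention(attention_activation='sigmoid'))  # output stays (None, 409, 20)
model2.add(GlobalAveragePooling1D())  # collapse the time axis to (None, 20)
model2.add(Dense(3, activation='softmax'))
# a 3-class softmax pairs with categorical (not binary) cross-entropy
model2.compile(loss='categorical_crossentropy', optimizer='adam',
               metrics=['accuracy'])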
3wabscal 2#
Try this: look at the merge_mode argument, in particular merge_mode=None, in the Bidirectional documentation: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Bidirectional
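For illustration, a minimal sketch of what merge_mode controls (this example is mine, not from the answer): with merge_mode=None the forward and backward outputs come back as a list of two tensors instead of being concatenated into one.

from tensorflow.keras.layers import Input, LSTM, Bidirectional

x = Input(shape=(409, 100))
merged = Bidirectional(LSTM(10, return_sequences=True), merge_mode='concat')(x)
# 'concat' is the default: a single (None, 409, 20) tensor
fwd, bwd = Bidirectional(LSTM(10, return_sequences=True), merge_mode=None)(x)
# merge_mode=None: two separate (None, 409, 10) tensors, forward and backward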