I am building a Siamese network with Keras (TensorFlow), where the target is a binary column, i.e. match or no match (1 or 0). But the model's fit method throws an error saying that the y_pred shape is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:
Here is the code I am using:
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(), X_train_entity_2.todense()],
                    np.array(y_train),
                    epochs=2,
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
My input data has the following shapes:
Inputs:
X_train_entity_1.shape is (700,2822)
X_train_entity_2.shape is (700,2822)
Target:
y_train.shape is (700,1)
In the error it throws, y_pred is a variable created internally. Even though my target is binary, y_pred has a dimension of 2822. That 2822 actually matches the input size, but how should I make sense of this?
Here is the model I created:
in_layers = []
out_layers = []
for i in range(2):
    input_layer = Input(shape=(1,))
    embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)
    lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
    lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
    in_layers.append(input_layer)
    out_layers.append(lstm_layer_2)

merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
Because my data is very sparse, I used todense. The types involved are as follows:
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
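As an aside, `todense()` returns the legacy `numpy.matrix` type, while Keras generally expects plain `numpy.ndarray` inputs; `toarray()` avoids the legacy type. A small sketch with a toy sparse matrix (hypothetical data, not the question's):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy sparse matrix standing in for X_train_entity_1 (illustration only).
X = csr_matrix(np.eye(3))

dense_matrix = X.todense()   # numpy.matrix, a legacy 2-D type
dense_array = X.toarray()    # numpy.ndarray, generally safer to feed to Keras

print(type(dense_matrix).__name__)  # matrix
print(type(dense_array).__name__)   # ndarray
```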
The last few layers are summarized as follows:
2 Answers

Answer 1 (lsmepo6l):
The shapes in the Input layer do not match. The input shape needs to match the shape of a single element passed as x, i.e. dataset.shape[1:]. Since the dataset size is (700, 2822), that is 700 samples of size 2822, the input shape should be 2822. Change `Input(shape=(1,))` to `Input(shape=(2822,))`.
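Concretely, that change can be sketched as follows (the embedding sizes are hypothetical, since `embed_input_size`/`embed_output_size` are not given in the question):

```python
from tensorflow.keras.layers import Input, Embedding

# Hypothetical sizes for illustration only.
embed_input_size = 5000
embed_output_size = 64

# Before: Input(shape=(1,)) treated each sample as a single token.
# After: the shape matches one row of the (700, 2822) input matrix.
input_layer = Input(shape=(2822,))
embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)

# The embedding now emits one vector per token: (batch, 2822, 64).
print(embedding_layer.shape)
```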
Answer 2 (kkih6yb8):
You need to set return_sequences to False in lstm_layer_2; otherwise you keep the input's time steps, which is why you end up with a shape of (None, 2822, 1). You could also add a Flatten layer before the output layer, but I would recommend setting return_sequences=False. Note that a Dense layer computes the dot product between the input and the kernel along the last axis of the input.
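A minimal sketch of that fix, using toy sizes instead of the question's 2822-long inputs and LSTM(1024)/LSTM(512) so it builds quickly; the key change is `return_sequences=False` on the second LSTM:

```python
from tensorflow.keras.layers import (Input, Embedding, Bidirectional,
                                     LSTM, Dense, concatenate)
from tensorflow.keras.models import Model

# Toy sizes for illustration only (not the author's exact model).
seq_len, vocab, embed_dim = 10, 50, 8

in_layers, out_layers = [], []
for _ in range(2):
    inp = Input(shape=(seq_len,))
    emb = Embedding(vocab + 1, embed_dim)(inp)
    x = Bidirectional(LSTM(16, return_sequences=True))(emb)
    # return_sequences=False collapses the time axis to one vector per sample.
    x = Bidirectional(LSTM(8, return_sequences=False))(x)
    in_layers.append(inp)
    out_layers.append(x)

merged = concatenate(out_layers)
output = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=in_layers, outputs=output)

# Output shape is now (None, 1): compatible with a (n_samples, 1) binary target.
print(model.output_shape)
```

With return_sequences=True everywhere, the final Dense(1) would instead be applied per time step, giving the (None, 2822, 1) shape described above.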