Keras model fit throws a shape mismatch error

jrcvhitl | asked on 2022-11-13

I am building a Siamese network with Keras (TensorFlow), where the target is a binary column, i.e. match or no match (1 or 0). However, the model's fit method throws an error saying that y_pred is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:

Here is the code I am using:

```python
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(), X_train_entity_2.todense()],
                    np.array(y_train),
                    epochs=2,
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
```
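For reference, binary_crossentropy expects y_true and y_pred to have matching shapes, typically (batch, 1) for a single sigmoid output. A minimal standalone sketch of that shape contract (not the asker's code):

```python
import numpy as np
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
y_true = np.array([[1.0], [0.0], [1.0]])  # shape (3, 1)
y_pred = np.array([[0.9], [0.2], [0.7]])  # shape (3, 1), matches y_true
print(bce(y_true, y_pred).numpy())        # fine: shapes are compatible
```

If the model instead emits per-time-step predictions of shape (batch, 2822, 1), the loss cannot reconcile that with a (batch, 1) target, which is the error reported here.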

My input data shapes are as follows:

```
Inputs:
X_train_entity_1.shape is (700, 2822)
X_train_entity_2.shape is (700, 2822)
Target:
y_train.shape is (700, 1)
```

In the error it throws, y_pred is a variable created internally. Why does y_pred end up with a dimension of 2822 when my target is binary? That 2822 actually matches the input size, but how should I make sense of it? (See the shape-inspection sketch after the model code below.)
Here is the model I created:

```python
in_layers = []
out_layers = []
for i in range(2):
    input_layer = Input(shape=(1,))
    embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)
    lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
    lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
    in_layers.append(input_layer)
    out_layers.append(lstm_layer_2)
merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
```
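A quick way to see where the extra dimension comes from is to print the model's output shapes (a minimal check against the model defined above; exact sizes depend on embed_output_size):

```python
model.summary()            # each Bidirectional(LSTM(..., return_sequences=True)) keeps a time axis
print(model.output_shape)  # 3-D, with a trailing time axis, instead of the expected (None, 1)
```

Because Embedding and LSTM accept variable sequence lengths, feeding (700, 2822) arrays through a graph built with Input(shape=(1,)) ends up treating the 2822 columns as time steps, which is why the reported y_pred dimension matches the input size.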

Since my data is very sparse, I used todense. The types involved are as follows:

```
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
```
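As a side note, .todense() returns a numpy.matrix, whereas Keras generally expects plain numpy.ndarray inputs; .toarray() (or np.asarray(...)) gives you that directly. A minimal sketch, assuming X_train_entity_1 / X_train_entity_2 are the csr matrices above:

```python
import numpy as np

X1_dense = X_train_entity_1.toarray()              # ndarray, shape (700, 2822)
X2_dense = np.asarray(X_train_entity_2.todense())  # equivalent conversion
print(type(X1_dense), X1_dense.shape)
```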

The last few layers are summarized as follows:

Answer 1 (lsmepo6l):

The shape mismatch is in the Input layer. The input shape needs to match the shape of a single sample, i.e. xdataset.shape[1:]. Since your dataset has shape (700, 2822), that is 700 samples of size 2822, the input shape should be 2822.
Change:

```python
input_layer = Input(shape=(1,))
```

to:

```python
input_layer = Input(shape=(2822,))
```
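As a small illustration of this convention (a standalone sketch, not the poster's model): shape excludes the batch dimension, so Input(shape=(2822,)) expects batches of shape (batch_size, 2822).

```python
from tensorflow.keras import Input

inp = Input(shape=(2822,))
print(inp.shape)  # (None, 2822) -- None is the batch dimension, left open
```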
Answer 2 (kkih6yb8):

You need to set return_sequences to False in lstm_layer_2:

```python
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
```

Otherwise you keep the time steps of your input, which is why you end up with a shape of (None, 2822, 1). You could also add a Flatten layer before the output layer, but I would recommend setting return_sequences=False.
Note that the Dense layer computes the dot product between the inputs and the kernel along the last axis of the inputs.
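To see the difference (a minimal standalone sketch, not the poster's model): with return_sequences=True an LSTM returns one output per time step, while with return_sequences=False it returns only the output of the last time step.

```python
import tensorflow as tf

x = tf.random.normal((4, 10, 8))  # (batch, time_steps, features)

seq = tf.keras.layers.LSTM(16, return_sequences=True)(x)
last = tf.keras.layers.LSTM(16, return_sequences=False)(x)

print(seq.shape)   # (4, 10, 16) -- one 16-dim output per time step
print(last.shape)  # (4, 16)     -- only the final time step's output
```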
