Keras model.fit throws a shape-mismatch error

jrcvhitl · Posted 2022-11-13 · in Other
Follow (0) | Answers (2) | Views (196)

I am building a Siamese network with Keras (TensorFlow), where the target is a binary column, i.e. match or no match (1 or 0). But the model's fit method throws an error saying that the y_pred shape is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:

Here is the code I am using:

model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(),X_train_entity_2.todense()],np.array(y_train),
                    epochs=2, 
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
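For context on why the shapes must line up: binary_crossentropy compares y_true and y_pred elementwise, so the model's output must have the same per-sample shape as y_train (here, (700, 1)). A minimal NumPy sketch of that elementwise computation (the function and toy data below are illustrative, not from the original post):

```python
import numpy as np

# Toy re-implementation of binary cross-entropy to show that y_true and
# y_pred are compared element by element, so their shapes must match.
def binary_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([[1.0], [0.0]])     # shape (2, 1) -- like y_train
y_pred = np.array([[0.9], [0.2]])     # shape (2, 1) -- compatible
print(binary_crossentropy(y_true, y_pred))
```

If the model instead emits one value per input feature (shape (batch, 2822, 1)), there is no sensible elementwise pairing with a (batch, 1) target, which is exactly the error reported.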

My input data shapes are as follows:

Inputs:
X_train_entity_1.shape is (700,2822)
X_train_entity_2.shape is (700,2822)

Target:
y_train.shape is (700,1)

In the error it throws, y_pred is a variable created internally. Why does y_pred have a dimension of 2822 when my target is binary? That 2822 actually matches the input size, but how should I interpret this?
Here is the model I built:

in_layers = []
out_layers = []
for i in range(2):
  input_layer = Input(shape=(1,))
  embedding_layer = Embedding(embed_input_size+1, embed_output_size)(input_layer)
  lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True,recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
  lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True,recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)

  in_layers.append(input_layer)
  out_layers.append(lstm_layer_2)

merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()

Because my data is very sparse, I used todense. The types here are as follows:

type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
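As a side note, .todense() returns numpy.matrix, which Keras handles less predictably than a plain ndarray; .toarray() (or np.asarray(...)) yields an ndarray instead. A small sketch with toy data (the toy shape below is illustrative; the real matrix is 700 × 2822):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for the sparse feature matrix.
X_sparse = csr_matrix(np.eye(4, 6, dtype=np.float32))

X_matrix = X_sparse.todense()   # numpy.matrix
X_array = X_sparse.toarray()    # numpy.ndarray -- safer to feed to Keras

print(type(X_matrix).__name__)  # matrix
print(type(X_array).__name__)   # ndarray
print(X_array.shape)            # (4, 6)
```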

The last few layers of the summary are as follows:

[model summary screenshot]

lsmepo6l · Answer 1

The shape in the Input layer does not match. The input shape must match the shape of a single sample, i.e. x_dataset.shape[1:]. Since the dataset shape is (700, 2822), that is 700 samples of size 2822, so the input shape should be (2822,).
Change:

input_layer = Input(shape=(1,))

to:

input_layer = Input(shape=(2822,))
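The rule is that Input(shape=...) describes one sample without the batch axis, i.e. X.shape[1:]. A quick check using the shapes from the question (dummy arrays, plain NumPy, no TensorFlow needed):

```python
import numpy as np

# Dummy arrays with the shapes reported in the question.
X_train_entity_1 = np.zeros((700, 2822))
y_train = np.zeros((700, 1))

# Input(shape=...) takes the per-sample shape: everything after the batch axis.
input_shape = X_train_entity_1.shape[1:]
print(input_shape)  # (2822,)
```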
kkih6yb8 · Answer 2

You need to set return_sequences to False in lstm_layer_2:

lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)

Otherwise you keep the time-step dimension of the input, which is why you end up with a shape of (None, 2822, 1). You could also add a Flatten layer before the output layer, but I recommend setting return_sequences=False.
Note that the Dense layer computes the dot product between the input and the kernel along the last axis of the input.
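The shape bookkeeping can be sketched without running TensorFlow. A Bidirectional LSTM maps (batch, timesteps, features) to (batch, timesteps, 2*units) when return_sequences=True and to (batch, 2*units) when it is False; Dense then acts only on the last axis. A hand-rolled shape calculator (illustrative, not a Keras API):

```python
def bilstm_out_shape(in_shape, units, return_sequences):
    """Output shape of a Bidirectional(LSTM(units, ...)) layer."""
    batch, timesteps, _ = in_shape
    width = 2 * units  # forward and backward states are concatenated
    if return_sequences:
        return (batch, timesteps, width)
    return (batch, width)

# With the fixed Input(shape=(2822,)) plus an Embedding, lstm_layer_1 sees
# (None, 2822, embed_output_size). With return_sequences=True the time axis
# survives every layer, so the final Dense(1) still emits (None, 2822, 1):
print(bilstm_out_shape((None, 2822, 64), 512, True))     # (None, 2822, 1024)
# With return_sequences=False the time axis is dropped, as the answer suggests:
print(bilstm_out_shape((None, 2822, 2048), 512, False))  # (None, 1024)
```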
