I want to take a BERT-based model that was already fine-tuned for 7-class classification and fine-tune it again on a 16-class dataset:
import tensorflow as tf
from transformers import TFBertForSequenceClassification

MODEL_NAME_OR_PATH = 'some pretrained model for 7 class classification on huggingface repo'

def build_model(model_name, learning_rate=3e-5):
    # Loads the checkpoint together with its existing 7-class classification head.
    model = TFBertForSequenceClassification.from_pretrained(model_name)
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    # The model returns raw logits, hence from_logits=True.
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
    model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
    return model

# LEARNING_RATE, EPOCHS, the datasets and the step counts are assumed to be defined elsewhere.
model = build_model(MODEL_NAME_OR_PATH, learning_rate=LEARNING_RATE)

r = model.fit(
    train_dataset,
    validation_data=valid_dataset,
    steps_per_epoch=train_steps,
    validation_steps=valid_steps,
    epochs=EPOCHS,
    verbose=1)
As expected, the model's last layer expects 7 classes, so training fails with the following error:
Node:
'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
Received a label value of 9 which is outside the valid range of [0, 8). Label values: 6 2 0 6 0 9 6 6 0 6 6 0 7 2 2 2
[[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_43224]
How should I edit the model's structure?
1 Answer
For future reference: you need to replace the last layer. In my case, I used TensorFlow:
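One way to swap the classification head, given as a sketch rather than the answerer's own code: reload the checkpoint with a new `num_labels` and let the old head weights be skipped. This assumes a `transformers` release whose TensorFlow `from_pretrained` accepts `ignore_mismatched_sizes`. The `infer_num_labels` helper and the extra `num_labels` parameter on `build_model` are hypothetical additions for illustration, not part of the question's code.

```python
def infer_num_labels(labels):
    """Return the head size needed for integer class ids 0..max(labels).

    Hypothetical helper: a sample of labels only gives a lower bound,
    so pass the known class count directly when you have it.
    """
    labels = [int(label) for label in labels]
    if not labels:
        raise ValueError("labels must not be empty")
    if min(labels) < 0:
        raise ValueError("class ids must be non-negative")
    return max(labels) + 1


def build_model(model_name, num_labels, learning_rate=3e-5):
    # Imported here so infer_num_labels stays usable without TensorFlow installed.
    import tensorflow as tf
    from transformers import TFBertForSequenceClassification

    # num_labels sizes the new classification head. ignore_mismatched_sizes
    # (assumed to be available in the installed transformers version) skips the
    # checkpoint's old head weights instead of failing on the shape mismatch,
    # so the new head starts from a fresh initialization.
    model = TFBertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,
        ignore_mismatched_sizes=True,
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy("accuracy")],
    )
    return model
```

For the 16-class dataset in the question, the call would be `build_model(MODEL_NAME_OR_PATH, num_labels=16, learning_rate=LEARNING_RATE)`, with labels encoded as integers from 0 to 15. The BERT encoder keeps its fine-tuned weights; only the final dense layer is trained from scratch.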