Python transformers: fine-tuning with a different number of classes

iyfjxgzm  posted on 2023-02-28  in Python

I have a BERT-based model that was already fine-tuned for 7-class classification, and I want to fine-tune it again on a 16-class dataset:

import tensorflow as tf
from transformers import TFBertForSequenceClassification

MODEL_NAME_OR_PATH = 'some pretrained model for 7 class classification on huggingface repo'

def build_model(model_name, learning_rate=3e-5):
    # Loads the checkpoint together with its existing 7-class classification head
    model = TFBertForSequenceClassification.from_pretrained(model_name)

    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
    model.compile(optimizer=optimizer, loss=loss, metrics=[metric])

    return model

# LEARNING_RATE, EPOCHS, the datasets and the step counts are defined elsewhere (not shown)
model = build_model(MODEL_NAME_OR_PATH, learning_rate=LEARNING_RATE)

r = model.fit(
    train_dataset,
    validation_data=valid_dataset,
    steps_per_epoch=train_steps,
    validation_steps=valid_steps,
    epochs=EPOCHS,
    verbose=1)

As expected, the last layer of the model expects 7 classes, and training fails with the following error:

Node: 
'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
Received a label value of 9 which is outside the valid range of [0, 8).  Label values: 6 2 0 6 0 9 6 6 0 6 6 0 7 2 2 2
     [[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_43224]
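
To see which labels trigger this, the batch printed in the error can be checked against the width of the output layer. This is only a minimal sketch in plain Python: the helper name `labels_out_of_range` is made up for illustration, and the value 8 is taken from the `[0, 8)` range in the message above.

```python
def labels_out_of_range(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    return sorted({y for y in labels if not 0 <= y < num_classes})


# Label batch copied from the error message above.
batch = [6, 2, 0, 6, 0, 9, 6, 6, 0, 6, 6, 0, 7, 2, 2, 2]

# The error reports a valid range of [0, 8), so the loss sees 8 logits.
print(labels_out_of_range(batch, num_classes=8))   # [9]

# With a 16-unit output layer every label in this batch would be valid.
print(labels_out_of_range(batch, num_classes=16))  # []
```

Running the same check over the whole training set is a cheap way to confirm the labels really span 0 to 15 before changing the model.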

How should I edit the structure of the model?

xuo3flqw  #1

For future reference: you need to replace the last layer. In my case, I used TensorFlow:

model.classifier = tf.keras.layers.Dense(nunits)
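
The line above swaps the head by hand, and it has to run before `model.compile(...)`. A different route, not the one used in this answer, is to let `from_pretrained` build the new head itself. The sketch below assumes a transformers release that still ships the TensorFlow classes and supports `ignore_mismatched_sizes`; the function name `load_with_new_head` and the constant `NUM_CLASSES` are made up for illustration.

```python
NUM_CLASSES = 16  # assumption: size of the new label set, taken from the question


def load_with_new_head(model_name, num_labels=NUM_CLASSES):
    """Load a fine-tuned BERT checkpoint but with a fresh classification head."""
    # Imported lazily so the sketch can be imported without TensorFlow installed.
    from transformers import TFBertForSequenceClassification

    # num_labels sizes the new classifier layer. ignore_mismatched_sizes=True
    # makes from_pretrained skip checkpoint weights whose shapes no longer
    # match (here the old classifier) and initialise them randomly instead.
    return TFBertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,
        ignore_mismatched_sizes=True,
    )
```

In `build_model` from the question this would stand in for the plain `from_pretrained(model_name)` call. Either way the encoder weights are kept and only the new head starts from random values, so it still needs training on the 16-class data.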
