I have a pre-trained BERT model on top of which I am adding my own head.
I am using a fine-tuned RoBERTa model, unbiased-toxic-roberta, which was trained on the Jigsaw data:
https://huggingface.co/unitary/unbiased-toxic-roberta
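For reference, a minimal sketch of loading this checkpoint straight from the Hub (the local path /home/pc/unbiased_toxic_roberta used below is assumed to be a downloaded copy of it):

import transformers as tr

# Load the published checkpoint by name; the model card on the Hub describes its own label set.
hub_tokenizer = tr.AutoTokenizer.from_pretrained("unitary/unbiased-toxic-roberta")
hub_model = tr.AutoModelForSequenceClassification.from_pretrained("unitary/unbiased-toxic-roberta")
print(hub_model.config.num_labels)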
Create the data with a PyTorch Dataset:
import numpy as np
import torch
import transformers as tr

# train_texts / train_labels (and the validation split) are prepared earlier in the pipeline.
tokenizer = tr.RobertaTokenizer.from_pretrained("/home/pc/unbiased_toxic_roberta")
train_encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=512, return_tensors="pt")
class SEDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)
train_data = SEDataset(train_encodings, train_labels)
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}
A model that adds a few layers on top of the pre-trained model:
import torch.nn as nn
from transformers import AutoModel

class PosModel(nn.Module):
    def __init__(self):
        super(PosModel, self).__init__()
        self.base_model = tr.RobertaForSequenceClassification.from_pretrained('/home/pc/unbiased_toxic_roberta')
        self.dropout = nn.Dropout(0.5)
        self.linear = nn.Linear(768, 2)  # the hidden size of the base model is 768 and 2 is the number of labels

    def forward(self, input_ids, attn_mask):
        outputs = self.base_model(input_ids, attention_mask=attn_mask)
        # the new head goes here
        outputs = self.dropout(outputs[0])
        outputs = self.linear(outputs)
        return outputs
model = PosModel()
print(model)
Training step:
Pass some parameters to the model via TrainingArguments:
training_args = tr.TrainingArguments(
    # report_to='wandb',
    output_dir='/home/pc/1_Proj_hate_speech/results_roberta',  # output directory
    overwrite_output_dir=True,
    num_train_epochs=20,              # total number of training epochs
    per_device_train_batch_size=16,   # batch size per device during training
    per_device_eval_batch_size=32,    # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                # number of warmup steps for the learning rate scheduler
    weight_decay=0.01,                # strength of weight decay
    logging_dir='./logs3',            # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True
)
trainer = tr.Trainer(
    model=model,                      # the instantiated 🤗 Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,            # evaluation dataset
    compute_metrics=compute_metrics
)
Run the model:
trainer.train()
Error:
TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/pc/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/pc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'input_ids'
2 Answers

ct3nt3jp1#
It looks like your tokenizer adds the "input_ids" entry when it encodes the data, but the model does not expect this tensor in its input. Maybe you can try removing it from train_encodings and running again.
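A minimal sketch of what this answer suggests, assuming train_encodings behaves like a dict (a BatchEncoding does):

# Drop the "input_ids" entry before building the dataset, as this answer proposes.
filtered_encodings = {k: v for k, v in train_encodings.items() if k != "input_ids"}
train_data = SEDataset(filtered_encodings, train_labels)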
xzlaal3s2#

I had the same problem: I had written a function named "model" and was calling that function instead of the model instance. I think you may have done the same thing at the end; please check.
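A small sanity check along the lines of this answer, assuming the definitions from the question: the object handed to Trainer should be the instantiated nn.Module, not a function that happens to be called model.

model = PosModel()                           # an nn.Module instance, not a function
print(isinstance(model, torch.nn.Module))    # should print True
trainer = tr.Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=val_data,
    compute_metrics=compute_metrics,
)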