I am training a simple binary classification model with PyTorch and a Hugging Face model (BERT, PyTorch, Hugging Face).
The code is as follows:
import numpy as np
import torch
import transformers as tr
from sklearn import metrics
from transformers import AutoTokenizer, AdamW, get_linear_schedule_with_warmup
from transformers import BertTokenizerFast as BertTokenizer, BertModel, BertConfig
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc,
            "precision": metrics.precision_score(labels, predictions),
            "recall": metrics.recall_score(labels, predictions),
            "f1": metrics.f1_score(labels, predictions)}
training_args = tr.TrainingArguments(
    # report_to='wandb',
    output_dir='/home/pc/proj/Exp2_conv_stampy_data/results_exp0',  # output directory
    overwrite_output_dir=True,
    num_train_epochs=2,              # total number of training epochs
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=200,                # number of warmup steps for the learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs_exp0',       # directory for storing logs
    logging_steps=137,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    fp16=True,
    run_name="final_model0",
)
from transformers import TrainerCallback
from copy import deepcopy
model = tr.XLMRobertaForSequenceClassification.from_pretrained(
    "/home/pc/multilingual_toxic_xlm_roberta",
    problem_type="single_label_classification",
    num_labels=2,
    ignore_mismatched_sizes=True,
    id2label={0: 'negative', 1: 'positive'},
)
# tokenizer, SEDataset, train_texts/train_labels, val_texts/val_labels and
# device are assumed to be defined elsewhere in the script.
train_encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=512, return_tensors="pt")
val_encodings = tokenizer(val_texts, truncation=True, padding=True, max_length=512, return_tensors="pt")
train_data = SEDataset(train_encodings, train_labels)
val_data = SEDataset(val_encodings, val_labels)
model.to(device)
class CustomCallback(TrainerCallback):
    def __init__(self, trainer) -> None:
        super().__init__()
        self._trainer = trainer

    def on_epoch_end(self, args, state, control, **kwargs):
        # Also evaluate on the training set at each epoch, with a "train" metric prefix.
        if control.should_evaluate:
            control_copy = deepcopy(control)
            self._trainer.evaluate(eval_dataset=self._trainer.train_dataset, metric_key_prefix="train")
            return control_copy
trainer = tr.Trainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,            # evaluation dataset
    compute_metrics=compute_metrics,  # the callback that computes metrics of interest
)
trainer.add_callback(CustomCallback(trainer))
train = trainer.train()
trainer.save_model("/home/pc/proj/Exp2_conv_stampy_data/result_toxic_model_exp0")
I see that an `mlruns` directory is created by default. What is `0`, and what are the two folders inside `0`? How can I rename them to something useful and understandable? If I do multiple runs, how can I log each run's model under the same experiment, e.g. `run1`, `run2`? I also see that the artifacts folder is empty. How can I log the final model?
1 Answer
By default, MLflow stores run metadata and artifacts in a local `mlruns` folder. `0` is the ID of the default experiment, and each folder inside it holds the metadata of a single run. To inspect the runs, you can start the tracking-server UI with the `mlflow ui` command. Runs are grouped into experiments: you can create an experiment, and every run you start afterwards is recorded under it. To do this, set the tracking server URI and the experiment in your code:
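A minimal sketch of what that can look like (the tracking URI, experiment name, and run name below are placeholders, not values from the question):

import mlflow

# Point MLflow at your tracking server (or leave unset to use the local mlruns folder).
mlflow.set_tracking_uri("http://localhost:5000")  # placeholder URI

# Runs started after this call are logged under this experiment instead of "0".
mlflow.set_experiment("xlm_roberta_toxic_classification")  # placeholder name

# Optionally give each run a readable name, e.g. run1, run2, ...
with mlflow.start_run(run_name="run1"):
    trainer.train()

Note that the Hugging Face integration described below also takes the run name from `TrainingArguments.run_name` (e.g. `final_model0` in the question), so you can name successive runs `run1`, `run2`, and so on there instead.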
If you do not have a remote tracking server, run the MLflow UI locally as described above.
Hugging Face has an MLflow integration that logs runs automatically; you need to set a few environment variables: https://huggingface.co/docs/transformers/v4.33.0/en/main_classes/callback#transformers.integrations.MLflowCallback
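For example, something like the following, set before the `Trainer` is constructed (the experiment name and URI are again placeholders; `HF_MLFLOW_LOG_ARTIFACTS` is the variable that makes the integration upload checkpoints, so the artifacts folder is no longer empty):

import os

# Must be set before the Trainer is created so the MLflowCallback picks them up.
os.environ["MLFLOW_EXPERIMENT_NAME"] = "xlm_roberta_toxic_classification"  # placeholder
os.environ["MLFLOW_TRACKING_URI"] = "http://localhost:5000"                # placeholder
# Log checkpoints and the saved model as MLflow artifacts.
os.environ["HF_MLFLOW_LOG_ARTIFACTS"] = "TRUE"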