ludwig: Token-level probabilities are always 0.0 when fine-tuning the Llama2-7b model on a single GPU

Posted by 56lgkhnf 2 months ago in Other

Describe the bug

When fine-tuning the Llama2-7b model with the notebook "Ludwig + DeepLearning.ai: Efficient Fine-Tuning for Llama2-7b on a Single GPU.ipynb", the token-level probabilities always show as 0.0.
https://colab.research.google.com/drive/1Ly01S--kUwkKQalE-75skalp-ftwl0fE?usp=sharing
The part of my code that shows the problem is here:
https://colab.research.google.com/drive/1OmbCKlPzlxm4__iThYqB9PSLUWZZVptz?usp=sharing

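To reproduce

Steps to reproduce the behavior:

Fine-tune the Llama2-7b model using the provided notebook.
Run the model's predictions with modified parameters, including setting predict to False and providing a specific output_directory.
Despite these modifications, the token-level probabilities remain 0.0.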

ludwig.predict(
  dataset=None,
  data_format=None,
  split='full',
  batch_size=128,
  skip_save_unprocessed_output=True,
  skip_save_predictions=True,
  output_directory='results',
  return_type=<class 'pandas.core.frame.DataFrame'>,
  debug=False
)

Expected behavior

The token-level probabilities should reflect the model's confidence in each predicted output token.
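
As a point of reference for what a non-zero token-level probability looks like, here is a minimal sketch that derives one from raw logits with a softmax. It is not Ludwig's implementation; the function name `token_probabilities` and the example logits are hypothetical and serve only to illustrate the expected behavior.

```python
import math


def token_probabilities(logits_per_step, chosen_ids):
    """Return the softmax probability of the chosen token at each decoding step."""
    probs = []
    for logits, token_id in zip(logits_per_step, chosen_ids):
        max_logit = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - max_logit) for x in logits]
        probs.append(exps[token_id] / sum(exps))
    return probs


# Hypothetical logits for two decoding steps over a 3-token vocabulary.
example_logits = [[2.0, 0.5, -1.0], [0.1, 0.1, 3.0]]
example_ids = [0, 2]
example_probs = token_probabilities(example_logits, example_ids)
```

With finite logits like these, every probability is strictly between 0 and 1, so a column that is exactly 0.0 for every token suggests the values are not being populated rather than that the model has no confidence.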

Screenshots

None

Environment:

OS: Ubuntu 20.04
Python version: 3.8.10
Ludwig version: 0.3.3

Additional context

The logger inside the predict function does not seem to work as expected.
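
One way to check whether log output is simply being filtered is to raise the verbosity with Python's standard `logging` module. The sketch below assumes that Ludwig's modules create their loggers under the `ludwig` namespace; that is an assumption, not something the post confirms.

```python
import logging

# Send log records to the console with a readable format.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: Ludwig's loggers live under the "ludwig" namespace,
# so lowering the threshold here also applies to its submodules.
ludwig_logger = logging.getLogger("ludwig")
ludwig_logger.setLevel(logging.DEBUG)
```

If messages still do not appear after this, the cause probably lies somewhere other than the log level.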

Source: https://github.com/ludwig-ai/ludwig/issues/3979

1 answer

e4eetjau1#

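Hi @MoOo2mini, thank you for using Ludwig's LLM fine-tuning capabilities and for reporting your issue. We could not reproduce your error because we do not have access to your model: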

FileNotFoundError: [Errno 2] No such file or directory: '/content/test/model_hyperparameters.json'

Could you make your model available (for example, on HuggingFace)? I would be happy to help you resolve the issue.
Thank you very much.
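
Before sharing or loading a saved model, it can help to confirm that the directory actually contains the file named in the error above. The following is a minimal sketch; the helper `missing_model_files` is hypothetical, and the only file name it checks is the one taken from the traceback.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("model_hyperparameters.json",)):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


# Example with the path from the traceback; adjust it to the real model location.
missing = missing_model_files("/content/test")
if missing:
    print(f"Model directory is incomplete, missing: {missing}")
```

An empty list means the file is present, so a FileNotFoundError at load time would then point to a wrong path rather than an incomplete model directory.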

Answered 2 months ago
