Describe the bug
I enabled 4-bit quantization for fine-tuning mistralai/Mistral-7B-v0.1. Ludwig 0.10.1 appears to pin bitsandbytes < 0.41.0, but when I run the trainer I get the following warning:
You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it.
If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.
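The warning originates in transformers' `save_pretrained`, which gates saving of 4-bit checkpoints on the installed bitsandbytes version. A minimal sketch of that check (the helper names and the simplified dotted-numeric version parsing are my own, not the library's actual code):

```python
# Simplified sketch of the version gate behind the warning: transformers
# compares the installed bitsandbytes version against 0.41.3 before saving
# a 4-bit quantized model.

MIN_4BIT_SAVE = (0, 41, 3)

def parse_version(v: str) -> tuple:
    """Parse a dotted numeric version string like '0.40.2' into an int tuple."""
    return tuple(int(part) for part in v.split("."))

def supports_4bit_save(bnb_version: str) -> bool:
    """True if this bitsandbytes version can save 4-bit checkpoints."""
    return parse_version(bnb_version) >= MIN_4BIT_SAVE

# Ludwig 0.10.1 pins bitsandbytes < 0.41.0, so the check always fails:
print(supports_4bit_save("0.40.2"))  # False
print(supports_4bit_save("0.41.3"))  # True
```

Because of the `< 0.41.0` pin, any 4-bit fine-tuning run with this Ludwig release will hit the warning at checkpoint time.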
To Reproduce
Steps to reproduce the behavior:
- Install Ludwig:
```shell
pip install ludwig[full]
```
- Config file (`model.yaml`):
```yaml
model_type: llm
base_model: mistralai/Mistral-7B-v0.1
quantization:
  bits: 4
adapter:
  type: lora
prompt:
  template: |
    ### Instruction:
    {instruction}
    ### Input:
    {input}
    ### Response:
input_features:
  - name: prompt
    type: text
output_features:
  - name: output
    type: text
generation:
  temperature: 0.1
trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true
preprocessing:
  sample_ratio: 0.1
```
- Train the model:
```shell
ludwig train --config model.yaml --dataset "ludwig://alpaca"
```
Expected behavior
The warning should not be shown on bitsandbytes versions that do not support saving 4-bit models.
Environment (please complete the following information):
- OS: Linux
- Version: 6.7.6-arch1-1
- Python: 3.10.8
- Ludwig: v0.10.1
2 Answers

lsmd5eda 1#
Here is a notebook showing the run: https://colab.research.google.com/drive/1kmZhQKBzpHBJRJvvp9PEdPEUMfMu6dh7?usp=sharing . The first run required a restart; the linked output is from after all cells finished running. FYI... by the way, the model's output is "","", but that is most likely an issue with the base model!
@shripadk @alexsherstinsky
4smxwvx5 2#

With more iterations, Gemma fine-tuning seems to work well. https://console.cloud.google.com/vertex-ai/colab/notebooks?project=document-ai-374204&activeNb=projects%2Fdocument-ai-374204%2Flocations%2Fus-central1%2Frepositories%2F87000216-df46-4358-8bb1-6bc933f4c82b
@shripadk @alexsherstinsky