Ludwig throws 'IndexError: Dimension specified as 0 but tensor has no dimensions' during training

Posted by exdqitrt 5 months ago in Other

Describe the bug

I am trying to run the LLM_few-shot example (https://github.com/ludwig-ai/ludwig/blob/master/examples/llm_few_shot_learning/simple_model_training.py) on Google Colab and hit the following error during the training stage.
==== Log ====

INFO:ludwig.models.llm:Done.
INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of facebook/opt-350m tokenizer
INFO:ludwig.trainers.trainer:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Exploring batch_size=1
INFO:ludwig.utils.checkpoint_utils:Successfully loaded model weights from /tmp/tmpnqj9shge/latest.ckpt.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
[<ipython-input-15-cbbfc30da30b>](https://localhost:8080/#) in <cell line: 6>()
      4     preprocessed_data,  # tuple Ludwig Dataset objects of pre-processed training data
      5     output_directory,  # location of training results stored on disk
----> 6 ) = model.train(
      7     dataset=df,experiment_name="simple_experiment", model_name="simple_model", skip_save_processed_input=False)
      8 

10 frames
[/usr/local/lib/python3.10/dist-packages/ludwig/models/llm.py](https://localhost:8080/#) in _remove_left_padding(self, input_ids_sample)
    629         else:
    630             pad_idx = 0
--> 631         input_ids_sample_no_padding = input_ids_sample[pad_idx + 1 :]
    632 
    633         # Start from the first BOS token

IndexError: Dimension specified as 0 but tensor has no dimensions

==== End of log ====

The config is as follows:

config = yaml.unsafe_load(
    """
model_type: llm
model_name: facebook/opt-350m
generation:
    temperature: 0.1
    top_p: 0.75
    top_k: 40
    num_beams: 4
    max_new_tokens: 64
prompt:
    task: "Classify the sample input as either negative, neutral, or positive."
    retrieval:
        type: semantic
        k: 3
        model_name: paraphrase-MiniLM-L3-v2
input_features:
-
    name: review
    type: text
output_features:
-
    name: label
    type: category
    preprocessing:
        fallback_label: "neutral"
    decoder:
        type: category_extractor
        match:
            "negative":
                type: contains
                value: "positive"
            "neural":
                type: contains
                value: "neutral"
            "positive":
                type: contains
                value: "positive"
preprocessing:
    split:
        type: fixed
    
trainer:
    type: finetune
    epochs: 2
    
    """
)
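
Separately from the IndexError, the `match` block in this config has a few entries that may deserve a second look: the "negative" rule matches on the value "positive", and one label key is spelled "neural". Whether this is intentional is not stated in the thread, and nothing here suggests it causes the crash. The sketch below is a minimal consistency check over a hand-copied version of those rules; `find_suspect_rules` is a hypothetical helper, not part of Ludwig.

```python
# Hypothetical helper (not a Ludwig API): flags `match` rules whose label key
# and `contains` value look inconsistent with each other.

# The `match` block, copied by hand from the config above.
match_rules = {
    "negative": {"type": "contains", "value": "positive"},
    "neural": {"type": "contains", "value": "neutral"},
    "positive": {"type": "contains", "value": "positive"},
}


def find_suspect_rules(rules):
    """Return (label, value, reason) tuples for rules worth reviewing."""
    suspects = []
    seen_values = {}  # matched value -> first label that used it
    for label, rule in rules.items():
        value = rule["value"]
        if label != value:
            suspects.append((label, value, "label differs from matched value"))
        if value in seen_values:
            reason = f"value also used by {seen_values[value]!r}"
            suspects.append((label, value, reason))
        else:
            seen_values[value] = label
    return suspects


suspects = find_suspect_rules(match_rules)
for label, value, reason in suspects:
    print(f"{label!r} -> {value!r}: {reason}")
```

A label that differs from its matched value is not necessarily wrong, so the output is only a list of places to double-check by hand.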

Environment (please complete the following information):

OS: Colab
Python version: 3.10
Ludwig version: 0.8

Additional context

kt06eoxx1#

Hey @chayanray, thanks for flagging this! This is a known issue. I'll create a fix for it by the end of the day, and you should be able to test it out.

ajsxfq5m2#

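Hi @chayanray, I've created a fix here and tested it on the same example notebook you're trying to run: #3432
Things seem to work fine for me. Could you pull this branch and see whether it resolves the issue?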

rryofs0p3#

@arnavgarg1, no, still the same error.

`INFO:ludwig.models.llm:Done.

INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of facebook/opt-350m tokenizer
INFO:ludwig.trainers.trainer:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Exploring batch_size=1
INFO:ludwig.utils.checkpoint_utils:Successfully loaded model weights from /tmp/tmphojxmpa7/latest.ckpt.
IndexError Traceback (most recent call last)
in <cell line: 6>()
4 preprocessed_data, # tuple Ludwig Dataset objects of pre-processed training data
5 output_directory, # location of training results stored on disk
----> 6 ) = model.train(
7 dataset=df,experiment_name="simple_experiment", model_name="simple_model", skip_save_processed_input=False)
8
10 frames
/usr/local/lib/python3.10/dist-packages/ludwig/utils/llm_utils.py in remove_left_padding(input_ids_sample, tokenizer)
35 bos_idx = 0
36
---> 37 input_ids_no_bos = input_ids_no_padding[bos_idx:].unsqueeze(0)
38 return input_ids_no_bos
IndexError: Dimension specified as 0 but tensor has no dimensions`
I am also attaching the notebook for your reference.
Ludwig_Few_Shot_Training (1).ipynb.zip

rdrgkggo4#

Hi @chayanray, I had a chance to take a look at your notebook and have a few questions:

  1. Are you intentionally setting the trainer type to finetune? I just wanted to make sure this change was deliberate. By doing this, you're no longer running few-shot learning, but are instead running LLM fine-tuning. If you want to do few-shot learning, you can just remove the entire trainer section from your config and things will work correctly. I was able to run it this way successfully using your notebook.
  2. If you are trying to fine-tune using the few-shot example, then I would actually make a slightly different change. Right now, fine-tuning is only supported for text-to-text models. So, I would change the config to the following:
model_type: llm
model_name: facebook/opt-350m
generation:
    temperature: 0.1
    top_p: 0.75
    top_k: 40
    num_beams: 4
    max_new_tokens: 64
prompt:
    task: "Classify the sample input as either negative, neutral, or positive."
    retrieval:
        type: semantic
        k: 3
        model_name: paraphrase-MiniLM-L3-v2
input_features:
-
    name: review
    type: text
output_features:
-
    name: label
    type: text
preprocessing:
    split:
        type: fixed
trainer:
    type: finetune
    epochs: 2

With this config and my latest branch, I was able to run fine-tuning successfully.
I will call out that our current LLM fine-tuning implementation has a few known issues, and I am going to be landing a fix for all of them by the end of the week. I'd keep a lookout for that, and I'm also happy to follow up on this thread once that PR lands.
Let me know if this helps and if you're able to confirm that things work!
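
The two options above can be sketched as plain dictionary edits on an already-parsed config. This is a minimal illustration, assuming the config has been loaded into a Python dict; `to_few_shot` and `to_finetune` are hypothetical helper names, not Ludwig functions, and the dict is a trimmed copy of the original config.

```python
import copy

# Trimmed copy of the original config as a plain dict (only the keys that
# matter for this comparison are kept).
original_config = {
    "model_type": "llm",
    "model_name": "facebook/opt-350m",
    "input_features": [{"name": "review", "type": "text"}],
    "output_features": [
        {
            "name": "label",
            "type": "category",
            "preprocessing": {"fallback_label": "neutral"},
            "decoder": {"type": "category_extractor"},
        }
    ],
    "trainer": {"type": "finetune", "epochs": 2},
}


def to_few_shot(config):
    """Option 1: drop the trainer section so no fine-tuning is requested."""
    new_config = copy.deepcopy(config)
    new_config.pop("trainer", None)
    return new_config


def to_finetune(config, epochs=2):
    """Option 2: use text output features together with a finetune trainer."""
    new_config = copy.deepcopy(config)
    # Keep only name and type, dropping the category-specific decoder and
    # preprocessing, as in the suggested config above.
    new_config["output_features"] = [
        {"name": feature["name"], "type": "text"}
        for feature in new_config["output_features"]
    ]
    new_config["trainer"] = {"type": "finetune", "epochs": epochs}
    return new_config


few_shot_config = to_few_shot(original_config)
finetune_config = to_finetune(original_config)
print(few_shot_config.keys())
print(finetune_config["output_features"])
```

Both helpers work on a deep copy, so the original dict stays available for comparison.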

mf98qq945#

Whenever you're able to confirm, please let us know so we can close this issue!

vwoqyblh6#

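@arnavgarg1: Changing the output type from category to text makes the error go away and training runs. However, the prediction step no longer works as expected. This may need a deeper look.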
