PaddleNLP [Bug]: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token`

np8igboo posted 2 months ago in Other

Software environment

- paddlepaddle:develop
- paddlepaddle-gpu: develop 11.8
- paddlenlp: latest 4609d07a54ab97974b962b536dde7164ab15db93

Duplicate issues

  • I have searched the existing issues

Bug description

Inference with meta-llama/Meta-Llama-3-8B-Instruct fails with the following error:

Downloading shards: 100%|█████████████████████████| 4/4 [03:24<00:00, 51.14s/it]
W0613 23:29:27.245162 141364 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0613 23:29:27.246907 141364 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Loading checkpoint shards: 100%|██████████████████| 4/4 [03:39<00:00, 54.87s/it]
[2024-06-13 23:33:27,358] [    INFO] - All model checkpoint weights were used when initializing LlamaForCausalLM.

[2024-06-13 23:33:27,359] [    INFO] - All the weights of LlamaForCausalLM were initialized from the model checkpoint at meta-llama/Meta-Llama-3-8B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
(…)ama-3-8B-Instruct/generation_config.json: 100%|█| 126/126 [00:00<00:00, 489kB
[2024-06-13 23:33:27,486] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/generation_config.json
[2024-06-13 23:33:27,487] [    INFO] - We are using <class 'paddlenlp.transformers.llama.configuration.LlamaConfig'> to load 'meta-llama/Meta-Llama-3-8B-Instruct'.
[2024-06-13 23:33:27,487] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/config.json
[2024-06-13 23:33:27,488] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/generation_config.json
[2024-06-13 23:33:27,490] [    INFO] - Start predict
[2024-06-13 23:33:27,491] [   ERROR] - Using pad_token, but it is not set yet.
Traceback (most recent call last):
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1651, in <module>
    predict()
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1596, in predict
    outputs = predictor.predict(batch_source_text)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 251, in predict
    tokenized_source = self._preprocess(input_texts)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 226, in _preprocess
    tokenized_source = self.tokenizer(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2248, in __call__
    return self.batch_encode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2523, in batch_encode
    padding_strategy, truncation_strategy, max_length, kwargs = self._get_padding_truncation_strategies(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2004, in _get_padding_truncation_strategies
    raise ValueError(
ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`.
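
The error message itself suggests reusing `eos_token` as the pad token. A minimal sketch of that workaround follows. The helper name `ensure_pad_token` is hypothetical and not part of PaddleNLP; it only assumes the tokenizer exposes `pad_token` and `eos_token` attributes. It reads the value back after assigning, because an assignment may not take effect on every tokenizer.

```python
def ensure_pad_token(tokenizer):
    """Give `tokenizer` a pad token if it lacks one, reusing eos_token.

    Hypothetical helper (not a PaddleNLP API). Works on any object that
    exposes `pad_token` and `eos_token` attributes.
    """
    if getattr(tokenizer, "pad_token", None):
        return tokenizer.pad_token  # already configured, nothing to do

    eos = getattr(tokenizer, "eos_token", None)
    if not eos:
        raise ValueError("tokenizer has neither pad_token nor eos_token")

    tokenizer.pad_token = eos
    # Read the value back: the assignment may silently fail on some tokenizers.
    if tokenizer.pad_token != eos:
        raise RuntimeError("pad_token assignment did not take effect")
    return tokenizer.pad_token
```

Called once right after the tokenizer is loaded and before any batch is encoded with padding, this would either configure the pad token or fail early with a clearer message.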

### Steps to reproduce & code

!pip install tiktoken
!python predictor.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --dtype=float16

kr98yfug #1

`Using unk_token, but it is not set yet.` The same kind of problem also occurs in Qwen2 inference:

[2024-06-13 23:21:55,506] [ WARNING] - if you run ring_flash_attention.py, please ensure you install the paddlenlp_ops by following the instructions provided at https://github.com/PaddlePaddle/PaddleNLP/blob/develop/csrc/README.md
[2024-06-13 23:21:56,948] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.tokenizer.Qwen2Tokenizer'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,310] [   ERROR] - Using unk_token, but it is not set yet.
[previous line logged 8 times in total]
[2024-06-13 23:21:57,311] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.configuration.Qwen2Config'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,311] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:21:57,312] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.modeling.Qwen2ForCausalLM'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,312] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:21:57,313] [    INFO] - Loading weights file from cache at /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/model.safetensors.index.json
Downloading shards: 100%|██████████████████████| 4/4 [00:00<00:00, 26255.42it/s]
W0613 23:21:57.318940 134458 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0613 23:21:57.320240 134458 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Loading checkpoint shards: 100%|██████████████████| 4/4 [03:18<00:00, 49.74s/it]
[2024-06-13 23:25:34,697] [    INFO] - All model checkpoint weights were used when initializing Qwen2ForCausalLM.

[2024-06-13 23:25:34,697] [    INFO] - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen2-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[2024-06-13 23:25:34,700] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/generation_config.json
[2024-06-13 23:25:34,700] [    INFO] - Generation config file not found, using a generation config created from the model config.
[2024-06-13 23:25:34,701] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.configuration.Qwen2Config'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:25:34,701] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:25:34,701] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/generation_config.json
[2024-06-13 23:25:34,702] [ WARNING] - Can't find generation config, so it will not use generation_config field in the model config
[2024-06-13 23:25:34,703] [    INFO] - Start predict
[2024-06-13 23:25:48,343] [   ERROR] - Using unk_token, but it is not set yet.
[the same error is logged 198 times in total, with timestamps from 23:25:48,343 to 23:25:48,357]
Traceback (most recent call last):
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1651, in <module>
    predict()
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1596, in predict
    outputs = predictor.predict(batch_source_text)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 253, in predict
    decoded_predictions = self._postprocess(predictions)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 245, in _postprocess
    decoded_predictions = self.tokenizer.batch_decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3200, in batch_decode
    return [
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3201, in <listcomp>
    self.decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3239, in decode
    return self._decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/qwen2/tokenizer.py", line 294, in _decode
    return super()._decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils.py", line 1842, in _decode
    sub_texts.append(self.convert_tokens_to_string(current_sub_text))
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/qwen2/tokenizer.py", line 280, in convert_tokens_to_string
    text = "".join(tokens)
TypeError: sequence item 196: expected str instance, NoneType found
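
The final `TypeError` comes from `"".join(tokens)` receiving a `None` entry. One plausible reading, given the repeated `unk_token` errors above, is that some generated ids could not be mapped to a token string and fell back to the unset `unk_token`. The sketch below reproduces the failure with plain Python and shows a defensive join; `join_tokens` is a hypothetical helper, not PaddleNLP code, and dropping `None` entries only hides the missing tokens rather than fixing the tokenizer.

```python
def join_tokens(tokens):
    """Concatenate token strings, skipping entries that are None."""
    return "".join(tok for tok in tokens if tok is not None)


tokens = ["Hello", None, " world"]

try:
    "".join(tokens)  # same failure mode as in the traceback
except TypeError as exc:
    error_message = str(exc)

cleaned = join_tokens(tokens)
```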

qij5mzcb #2

Qwen fix PR: #8601
LLaMA fix PR: #8630

atmip9wb #3

> Qwen fix PR: #8601
> LLaMA fix PR: #8630

#8630 does not actually fix the LLaMA problem, and I don't know how to fix it.

uajslkp6 #4

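> Qwen fix PR: #8601 LLaMA fix PR: #8630
> #8630 does not actually fix the LLaMA problem, and I don't know how to fix it.

The Llama 3 tokenizer is missing a pad_token. The fix is the code below, and as far as I can see it has already been added:

if (isinstance(tokenizer, LlamaTokenizer) or isinstance(tokenizer, Llama3Tokenizer)) and not tokenizer.pad_token:
    tokenizer.pad_token = tokenizer.unk_token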

qgzx9mmu #5

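> Qwen fix PR: #8601 LLaMA fix PR: #8630
> #8630 does not actually fix the LLaMA problem, and I don't know how to fix it.
> The Llama 3 tokenizer is missing a pad_token. The fix is the code below, and as far as I can see it has already been added:
>
> if (isinstance(tokenizer, LlamaTokenizer) or isinstance(tokenizer, Llama3Tokenizer)) and not tokenizer.pad_token:
>     tokenizer.pad_token = tokenizer.unk_token

That does not help: with Llama 3 even unk_token cannot be set, so the code change here can be ignored for now. Fundamentally, we need to fix setting both pad_token and unk_token. My guess is that other special tokens may be impossible to set as well.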
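
To check the guess that other special tokens are affected too, one could probe each attribute by assigning a value and reading it back. The following is a rough diagnostic sketch, not PaddleNLP code: `audit_special_tokens`, the attribute list, and the probe string are all assumptions made for illustration, and the function only relies on the tokenizer exposing the named attributes.

```python
# Attribute names assumed to exist on the tokenizer (an assumption).
SPECIAL_TOKEN_ATTRS = ("bos_token", "eos_token", "unk_token", "pad_token")


def audit_special_tokens(tokenizer, probe="<|probe|>"):
    """Return {attr: (current_value, settable)} for each special token.

    `settable` is True when assigning `probe` and reading it back works.
    The original value is written back afterwards when possible.
    """
    report = {}
    for attr in SPECIAL_TOKEN_ATTRS:
        current = getattr(tokenizer, attr, None)
        try:
            setattr(tokenizer, attr, probe)
            settable = getattr(tokenizer, attr, None) == probe
        except Exception:
            settable = False
        try:
            setattr(tokenizer, attr, current)  # restore the original value
        except Exception:
            pass  # restoring is best effort only
        report[attr] = (current, settable)
    return report
```

An entry such as `(None, False)` would indicate a token that is both unset and cannot be set, which is the situation described above for `unk_token` on Llama 3.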
