PaddleNLP [Bug]: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token`

np8igboo posted 10 months ago in Other

Software environment

  - paddlepaddle: develop
  - paddlepaddle-gpu: develop 11.8
  - paddlenlp: latest 4609d07a54ab97974b962b536dde7164ab15db93

Duplicate issues

  • I have searched the existing issues

Bug description

meta-llama/Meta-Llama-3-8B-Instruct inference error:
(…)nstruct/model-00004-of-00004.safetensors: 94%|▉| 1.10G/1.17G [00:13<00:00, 8
(…)nstruct/model-00004-of-00004.safetensors: 95%|▉| 1.11G/1.17G [00:13<00:00, 7
(…)nstruct/model-00004-of-00004.safetensors: 96%|▉| 1.12G/1.17G [00:13<00:00, 8
(…)nstruct/model-00004-of-00004.safetensors: 97%|▉| 1.13G/1.17G [00:13<00:00, 7
(…)nstruct/model-00004-of-00004.safetensors: 98%|▉| 1.14G/1.17G [00:14<00:00, 6
(…)nstruct/model-00004-of-00004.safetensors: 100%|█| 1.17G/1.17G [00:14<00:00, 8
Downloading shards: 100%|█████████████████████████| 4/4 [03:24<00:00, 51.14s/it]
W0613 23:29:27.245162 141364 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0613 23:29:27.246907 141364 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Loading checkpoint shards: 100%|██████████████████| 4/4 [03:39<00:00, 54.87s/it]
[2024-06-13 23:33:27,358] [ INFO] - All model checkpoint weights were used when initializing LlamaForCausalLM.
[2024-06-13 23:33:27,359] [ INFO] - All the weights of LlamaForCausalLM were initialized from the model checkpoint at meta-llama/Meta-Llama-3-8B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
(…)ama-3-8B-Instruct/generation_config.json: 100%|█| 126/126 [00:00<00:00, 489kB
[2024-06-13 23:33:27,486] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/generation_config.json
[2024-06-13 23:33:27,487] [ INFO] - We are using <class 'paddlenlp.transformers.llama.configuration.LlamaConfig'> to load 'meta-llama/Meta-Llama-3-8B-Instruct'.
[2024-06-13 23:33:27,487] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/config.json
[2024-06-13 23:33:27,488] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/meta-llama/Meta-Llama-3-8B-Instruct/generation_config.json
[2024-06-13 23:33:27,490] [ INFO] - Start predict
[2024-06-13 23:33:27,491] [ ERROR] - Using pad_token, but it is not set yet.
Traceback (most recent call last):
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1651, in <module>
    predict()
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1596, in predict
    outputs = predictor.predict(batch_source_text)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 251, in predict
    tokenized_source = self._preprocess(input_texts)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 226, in _preprocess
    tokenized_source = self.tokenizer(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2248, in __call__
    return self.batch_encode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2523, in batch_encode
    padding_strategy, truncation_strategy, max_length, kwargs = self._get_padding_truncation_strategies(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 2004, in _get_padding_truncation_strategies
    raise ValueError(
ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`.
Steps to reproduce & code

!pip install tiktoken
!python predictor.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --dtype=float16

kr98yfug #1

`Using unk_token, but it is not set yet.` The same kind of problem also occurs in qwen2 inference:

[2024-06-13 23:21:55,506] [ WARNING] - if you run ring_flash_attention.py, please ensure you install the paddlenlp_ops by following the instructions provided at https://github.com/PaddlePaddle/PaddleNLP/blob/develop/csrc/README.md
[2024-06-13 23:21:56,948] [ INFO] - We are using <class 'paddlenlp.transformers.qwen2.tokenizer.Qwen2Tokenizer'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,310] [ ERROR] - Using unk_token, but it is not set yet.
[2024-06-13 23:21:57,311] [ INFO] - We are using <class 'paddlenlp.transformers.qwen2.configuration.Qwen2Config'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,311] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:21:57,312] [ INFO] - We are using <class 'paddlenlp.transformers.qwen2.modeling.Qwen2ForCausalLM'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:21:57,312] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:21:57,313] [ INFO] - Loading weights file from cache at /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/model.safetensors.index.json
Downloading shards: 100%|██████████████████████| 4/4 [00:00<00:00, 26255.42it/s]
W0613 23:21:57.318940 134458 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0613 23:21:57.320240 134458 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Loading checkpoint shards: 100%|██████████████████| 4/4 [03:18<00:00, 49.74s/it]
[2024-06-13 23:25:34,697] [ INFO] - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
[2024-06-13 23:25:34,697] [ INFO] - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen2-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[2024-06-13 23:25:34,700] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/generation_config.json
[2024-06-13 23:25:34,700] [ INFO] - Generation config file not found, using a generation config created from the model config.
[2024-06-13 23:25:34,701] [ INFO] - We are using <class 'paddlenlp.transformers.qwen2.configuration.Qwen2Config'> to load 'Qwen/Qwen2-7B-Instruct'.
[2024-06-13 23:25:34,701] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/config.json
[2024-06-13 23:25:34,701] [ INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/Qwen/Qwen2-7B-Instruct/generation_config.json
[2024-06-13 23:25:34,702] [ WARNING] - Can't find generation config, so it will not use generation_config field in the model config
[2024-06-13 23:25:34,703] [ INFO] - Start predict
[2024-06-13 23:25:48,343] [ ERROR] - Using unk_token, but it is not set yet.
[... same ERROR line repeated, 198 occurrences in total, timestamps 23:25:48,343 to 23:25:48,357 ...]
Traceback (most recent call last):
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1651, in <module>
    predict()
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 1596, in predict
    outputs = predictor.predict(batch_source_text)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 253, in predict
    decoded_predictions = self._postprocess(predictions)
  File "/home/aistudio/work/PaddleNLP/llm/predictor.py", line 245, in _postprocess
    decoded_predictions = self.tokenizer.batch_decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3200, in batch_decode
    return [
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3201, in <listcomp>
    self.decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 3239, in decode
    return self._decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/qwen2/tokenizer.py", line 294, in _decode
    return super()._decode(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/tokenizer_utils.py", line 1842, in _decode
    sub_texts.append(self.convert_tokens_to_string(current_sub_text))
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/qwen2/tokenizer.py", line 280, in convert_tokens_to_string
    text = "".join(tokens)
TypeError: sequence item 196: expected str instance, NoneType found
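
The final `TypeError` is raised by `"".join(tokens)` when the token list contains a `None` entry, which would be consistent with an id being mapped to an unset `unk_token`. Below is a minimal, self-contained sketch of that failure mode and a defensive join. `join_tokens` is a hypothetical helper written for illustration, not PaddleNLP code, and the token list is made up.

```python
from typing import List, Optional


def join_tokens(tokens: List[Optional[str]]) -> str:
    """Join token strings, skipping entries that are None.

    Hypothetical helper: it only guards the join; it does not fix
    the underlying missing special token.
    """
    return "".join(tok for tok in tokens if tok is not None)


# Made-up token list with a None where a lookup returned no token.
tokens = ["Hello", None, " world"]

try:
    "".join(tokens)  # same failure mode as in the traceback
except TypeError as exc:
    print(f"plain join failed: {exc}")

print(join_tokens(tokens))
```

Skipping `None` entries hides the symptom but silently drops tokens from the decoded text, so it is best treated as a diagnostic aid rather than a fix.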
qij5mzcb #2

Qwen fix PR: #8601
LLaMA fix PR: #8630

atmip9wb #3

Qwen fix PR: #8601
LLaMA fix PR: #8630
#8630 does not fix the LLaMA problem, and I don't know how to fix it.

uajslkp6 #4

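Qwen fix PR: #8601 LLaMA fix PR: #8630
#8630 does not fix the LLaMA problem, and I don't know how to fix it.
The Llama 3 tokenizer is missing `pad_token`. The fix is shown below; as far as I can see, it has already been added.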

    if (isinstance(tokenizer, LlamaTokenizer) or isinstance(tokenizer, Llama3Tokenizer)) and not tokenizer.pad_token:
        tokenizer.pad_token = tokenizer.unk_token
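
The snippet above uses `unk_token` as the padding token, which only helps if `unk_token` is itself set. A hedged sketch of a fallback chain follows: it tries `unk_token` and then `eos_token`, the latter being what the `ValueError` message suggests. `FakeTokenizer` and `ensure_pad_token` are hypothetical stand-ins for illustration, and the token string is made up. Whether such an assignment takes effect on the real Llama 3 tokenizer is not confirmed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTokenizer:
    """Hypothetical stand-in exposing only the attributes used below."""
    pad_token: Optional[str] = None
    unk_token: Optional[str] = None
    eos_token: Optional[str] = None


def ensure_pad_token(tokenizer) -> str:
    """Set pad_token from the first available fallback and return it."""
    if tokenizer.pad_token:
        return tokenizer.pad_token
    for name in ("unk_token", "eos_token"):
        candidate = getattr(tokenizer, name, None)
        if candidate:
            tokenizer.pad_token = candidate
            return candidate
    raise ValueError("no pad_token, unk_token, or eos_token is set")


# Made-up example: unk_token is unset, so eos_token is used.
tok = FakeTokenizer(eos_token="<eos>")
print(ensure_pad_token(tok))
```

Raising when no candidate exists makes the missing-token condition explicit at setup time instead of surfacing later during padding.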
qgzx9mmu #5

Qwen fix PR: #8601 LLaMA fix PR: #8630
#8630 does not fix the LLaMA problem, and I don't know how to fix it.
The Llama 3 tokenizer is missing `pad_token`. The fix is shown below; I have already added it.

    if (isinstance(tokenizer, LlamaTokenizer) or isinstance(tokenizer, Llama3Tokenizer)) and not tokenizer.pad_token:
        tokenizer.pad_token = tokenizer.unk_token

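That doesn't help: Llama 3 can't even set `unk_token`, so the code change here can be ignored for now. Fundamentally, we need to fix setting both `pad_token` and `unk_token`. My guess is that other special tokens may be impossible to set as well.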
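
One way to check the guess that other special tokens are affected is to probe each attribute: assign a placeholder, read it back, then restore the original value. The sketch below does this against a hypothetical `StubbornTokenizer` whose setter silently drops `unk_token`. That class, `probe_special_tokens`, and the list of token names are assumptions made for illustration and do not reproduce PaddleNLP's actual behaviour.

```python
from typing import Dict

SPECIAL_TOKEN_NAMES = ("bos_token", "eos_token", "unk_token", "pad_token")


class StubbornTokenizer:
    """Hypothetical tokenizer that ignores assignments to unk_token."""

    def __init__(self):
        # Use object.__setattr__ to bypass the custom setter below.
        for name in SPECIAL_TOKEN_NAMES:
            object.__setattr__(self, name, None)

    def __setattr__(self, name, value):
        if name == "unk_token":
            return  # silently drop the assignment
        object.__setattr__(self, name, value)


def probe_special_tokens(tokenizer, placeholder: str = "<probe>") -> Dict[str, bool]:
    """Return, per token name, whether an assignment is visible on read-back.

    Original values are written back afterwards.
    """
    settable = {}
    for name in SPECIAL_TOKEN_NAMES:
        original = getattr(tokenizer, name, None)
        setattr(tokenizer, name, placeholder)
        settable[name] = getattr(tokenizer, name, None) == placeholder
        setattr(tokenizer, name, original)
    return settable


print(probe_special_tokens(StubbornTokenizer()))
```

Running the same probe against the real tokenizer object would show which special tokens actually accept assignment in a given PaddleNLP build.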
