vLLM runtime exception [step must be nonzero]

fdbelqdn · posted 10 months ago · in: Other
Follow (0) | Answers (4) | Views (124)

In this code, max_prompt_len can be 0:
vllm/vllm/worker/model_runner.py, line 232 (commit 264017a):

    start_loc_tensor = torch.arange(0,
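The traceback below suggests the zero value reaches torch.arange, which raises "RuntimeError: step must be nonzero" whenever its step argument is 0. A defensive check before building the tensor would surface a clearer error. This is a minimal sketch, assuming max_prompt_len is being used as the step; the helper name and validation logic are illustrative, not vLLM's actual code:

```python
def compute_max_prompt_len(prompt_lens):
    """Return the longest prompt length in the batch, rejecting zero.

    torch.arange(0, end, step) raises "RuntimeError: step must be nonzero"
    when step == 0, so a batch whose prompts are all empty must be caught
    before that call is reached.
    """
    max_prompt_len = max(prompt_lens, default=0)
    if max_prompt_len == 0:
        raise ValueError(
            "all prompts in the batch are empty; max_prompt_len is 0 "
            "and torch.arange would fail with 'step must be nonzero'")
    return max_prompt_len
```

With this guard, an empty batch fails fast with a descriptive ValueError instead of an opaque RuntimeError deep inside tensor construction.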

    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 29, in _raise_exception_on_finish
      task.result()
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 411, in run_engine_loop
      has_requests_in_progress = await self.engine_step()
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 390, in engine_step
      request_outputs = await self.engine.step_async()
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 189, in step_async
      all_outputs = await self._run_workers_async(
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 276, in _run_workers_async
      all_outputs = await asyncio.gather(*coros)
    File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
      result = self.fn(*self.args, **self.kwargs)
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
      return func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 225, in execute_model
      output = self.model_runner.execute_model(seq_group_metadata_list,
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
      return func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 553, in execute_model
      lora_mapping) = self.prepare_input_tensors(seq_group_metadata_list)
    File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 473, in prepare_input_tensors
      lora_requests) = self._prepare_prompt(seq_group_metadata_list)
    File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 232, in _prepare_prompt
      start_loc_tensor = torch.arange(0,
    RuntimeError: step must be nonzero
vsnjm48y #1

I ran into the same problem.
Model: qwen-72b-chat-int4
vllm: 0.3.1

zd287kbt #2

I solved it. I had accidentally passed in an empty prompt.
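Since an empty prompt is what triggered the failure here, filtering prompts on the client side is a cheap workaround. A hedged sketch (the function name is illustrative, and note that a non-empty string can still tokenize to zero tokens server-side, so this only catches the common case):

```python
def sanitize_prompts(prompts):
    """Drop empty or whitespace-only prompts before sending them to the
    completions endpoint, where an empty prompt can crash the engine.
    """
    cleaned = [p for p in prompts if p and p.strip()]
    if not cleaned:
        raise ValueError("no non-empty prompts left to send")
    return cleaned
```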

zxlwwiss #3

I can confirm this issue occurs even when the input is not empty.
Here is my payload:
@WoosukKwon

    curl -X 'POST' \
      'https://xxxxxxxxxxxxx.net/v1/completions' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "test/7b",
        "prompt": "abc",
        "max_tokens": 16,
        "temperature": 1,
        "top_p": 0.36,
        "stream": false,
        "top_k": 20,
        "ignore_eos": false,
        "use_beam_search": false,
        "stop_token_ids": [
          0
        ],
        "skip_special_tokens": true,
        "spaces_between_special_tokens": true,
        "repetition_penalty": 1,
        "min_p": 0,
        "include_stop_str_in_output": false,
        "length_penalty": 1
      }'
emeijp43 #4

+1
