vLLM RuntimeError: step must be nonzero

fdbelqdn · posted 6 months ago in Other
Follow (0) | Answers (4) | Views (101)

In this code, max_prompt_len can be 0:
vllm/vllm/worker/model_runner.py, line 232 (commit 264017a):
    start_loc_tensor = torch.arange(0,
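
To see why this fails, here is a minimal standalone reproduction. The end/step arguments below are illustrative stand-ins for what _prepare_prompt computes, not vLLM's exact call; the failing operation is torch.arange with a zero step:

import torch

# When every prompt in the batch tokenizes to zero tokens, max_prompt_len is 0.
max_prompt_len = 0
batch_size = 4

# torch.arange(start, end, step) rejects step == 0, which produces exactly the
# "RuntimeError: step must be nonzero" seen in the traceback below.
start_loc_tensor = torch.arange(0, batch_size * max_prompt_len, max_prompt_len)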

|   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 29, in _raise_exception_on_finish           [32/1990]
    |     task.result()                                                                                                                           
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 411, in run_engine_loop                              
    |     has_requests_in_progress = await self.engine_step()                                                                                     
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 390, in engine_step                                  
    |     request_outputs = await self.engine.step_async()                                                                                        
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 189, in step_async                                   
    |     all_outputs = await self._run_workers_async(                                                                                            
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 276, in _run_workers_async                           
    |     all_outputs = await asyncio.gather(*coros)                                                                                              
    |   File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run                                                                  
    |     result = self.fn(*self.args, **self.kwargs)                                                                                             
    |   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context                                  
    |     return func(*args, **kwargs)                                                                                                            
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 225, in execute_model                                          
    |     output = self.model_runner.execute_model(seq_group_metadata_list,                                                                       
    |   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context                                  
    |     return func(*args, **kwargs)                                                                                                            
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 553, in execute_model                                    
    |     lora_mapping) = self.prepare_input_tensors(seq_group_metadata_list)                                                                     
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 473, in prepare_input_tensors                            
    |     lora_requests) = self._prepare_prompt(seq_group_metadata_list)                                                                          
    |   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 232, in _prepare_prompt                                  
    |     start_loc_tensor = torch.arange(0,                                                                                                      
    | RuntimeError: step must be nonzero
vsnjm48y 1#

I ran into the same problem.
Model: qwen-72b-chat-int4
vLLM: 0.3.1

zd287kbt 2#

I solved it: I had accidentally passed in an empty prompt.
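
If your service can receive empty prompts from callers, a simple client-side guard avoids this crash. Below is a hypothetical helper, not part of vLLM's API:

# Hypothetical helper (not vLLM API): reject empty or whitespace-only
# prompts before they are submitted to the engine.
def require_nonempty_prompt(prompt: str) -> str:
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return prompt

# Example usage:
# prompt = require_nonempty_prompt(request_body["prompt"])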

zxlwwiss 3#

I can confirm this issue occurs even when the input is non-empty.
Here is my payload:
@WoosukKwon

curl -X 'POST' \
  'https://xxxxxxxxxxxxx.net/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "test/7b",
  "prompt": "abc",
  "max_tokens": 16,
  "temperature": 1,
  "top_p": 0.36,
  "stream": false,
  "top_k": 20,
  "ignore_eos": false,
  "use_beam_search": false,
  "stop_token_ids": [
    0
  ],
  "skip_special_tokens": true,
  "spaces_between_special_tokens": true,
  "repetition_penalty": 1,
  "min_p": 0,
  "include_stop_str_in_output": false,
  "length_penalty": 1
}'
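
For anyone who prefers to reproduce this from Python, the same request can be sent with the requests package (assumed installed; the host is the placeholder from the curl command above):

import requests

# Identical payload to the curl command above.
payload = {
    "model": "test/7b",
    "prompt": "abc",
    "max_tokens": 16,
    "temperature": 1,
    "top_p": 0.36,
    "stream": False,
    "top_k": 20,
    "ignore_eos": False,
    "use_beam_search": False,
    "stop_token_ids": [0],
    "skip_special_tokens": True,
    "spaces_between_special_tokens": True,
    "repetition_penalty": 1,
    "min_p": 0,
    "include_stop_str_in_output": False,
    "length_penalty": 1,
}
resp = requests.post("https://xxxxxxxxxxxxx.net/v1/completions", json=payload)
print(resp.status_code)
print(resp.text)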
