Current environment
When I call `results = await asyncio.gather(*tasks)` on the client, I get an error on the vLLM server.
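The post does not include the client code. A minimal sketch of the pattern described might look like the following, assuming a hypothetical `send_request` coroutine that stands in for the real HTTP call to the vLLM server:

```python
import asyncio


async def send_request(prompt: str) -> str:
    # Hypothetical stand-in for the real HTTP call to the vLLM server;
    # the original post does not show the actual client code.
    await asyncio.sleep(0)
    return f"response to {prompt!r}"


async def main(prompts: list[str]) -> list[str]:
    # One coroutine per prompt, all submitted concurrently.
    tasks = [send_request(p) for p in prompts]
    # gather returns results in the same order as the inputs.
    results = await asyncio.gather(*tasks)
    return results


if __name__ == "__main__":
    print(asyncio.run(main(["a", "b", "c"])))
```

With a real client, every request in `tasks` reaches the server at roughly the same time, which is the load under which the error below was reported.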
🐛 Describe the bug
Here is the error log:
ERROR 07-22 09:54:47 async_llm_engine.py:52] Engine background task failed
ERROR 07-22 09:54:47 async_llm_engine.py:52] Traceback (most recent call last):
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
ERROR 07-22 09:54:47 async_llm_engine.py:52] return_value = task.result()
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
ERROR 07-22 09:54:47 async_llm_engine.py:52] has_requests_in_progress = await asyncio.wait_for(
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
ERROR 07-22 09:54:47 async_llm_engine.py:52] return fut.result()
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 506, in engine_step
ERROR 07-22 09:54:47 async_llm_engine.py:52] request_outputs = await self.engine.step_async()
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 245, in step_async
ERROR 07-22 09:54:47 async_llm_engine.py:52] self.do_log_stats(scheduler_outputs, output)
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 804, in do_log_stats
ERROR 07-22 09:54:47 async_llm_engine.py:52] self.stat_logger.log(
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 322, in log
ERROR 07-22 09:54:47 async_llm_engine.py:52] self._log_prometheus(stats)
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 256, in _log_prometheus
ERROR 07-22 09:54:47 async_llm_engine.py:52] self._log_counter(self.metrics.counter_generation_tokens,
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 288, in _log_counter
ERROR 07-22 09:54:47 async_llm_engine.py:52] counter.labels(**self.labels).inc(data)
ERROR 07-22 09:54:47 async_llm_engine.py:52] File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/prometheus_client/metrics.py", line 313, in inc
ERROR 07-22 09:54:47 async_llm_engine.py:52] raise ValueError('Counters can only be incremented by non-negative amounts.')
ERROR 07-22 09:54:47 async_llm_engine.py:52] ValueError: Counters can only be incremented by non-negative amounts.
Exception in callback functools.partial(<function _log_task_completion at 0x7f9e82a6e7a0>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7f9e7dfea380>>)
handle: <Handle functools.partial(<function _log_task_completion at 0x7f9e82a6e7a0>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7f9e7dfea380>>)>
Traceback (most recent call last):
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
return_value = task.result()
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
has_requests_in_progress = await asyncio.wait_for(
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 506, in engine_step
request_outputs = await self.engine.step_async()
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 245, in step_async
self.do_log_stats(scheduler_outputs, output)
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 804, in do_log_stats
self.stat_logger.log(
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 322, in log
self._log_prometheus(stats)
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 256, in _log_prometheus
self._log_counter(self.metrics.counter_generation_tokens,
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/metrics.py", line 288, in _log_counter
counter.labels(**self.labels).inc(data)
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/prometheus_client/metrics.py", line 313, in inc
raise ValueError('Counters can only be incremented by non-negative amounts.')
ValueError: Counters can only be incremented by non-negative amounts.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
File "/home/a100user/miniconda3/envs/glm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 54, in _log_task_completion
raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
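The root `ValueError` in the trace is raised when the generation-token counter's `inc` is called with a negative amount. The sketch below mirrors that check with a hypothetical pure-Python `SketchCounter` (a stand-in, not the actual `prometheus_client` class) and shows one possible defensive pattern around it. It is an illustration of the failure mode only, not how vLLM handles or fixed it:

```python
class SketchCounter:
    """Hypothetical stand-in mirroring the non-negative check in the trace."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        # Same rule as the error message in the log: counters never decrease.
        if amount < 0:
            raise ValueError(
                "Counters can only be incremented by non-negative amounts."
            )
        self.value += amount


def try_inc(counter: SketchCounter, amount: float) -> bool:
    """Increment the counter; return False instead of raising on bad input."""
    try:
        counter.inc(amount)
    except ValueError:
        return False
    return True
```

In the log above the exception is not caught, so it propagates out of the stats-logging call and ends the engine's background task, which is what produces the `AsyncEngineDeadError`.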
2 answers
ctzwtxfj1#
I hit the same error while gathering tasks.
iswrvxsc2#
My full error message is: