The echo parameter and request logging appear to be broken in vLLM v0.3.0 (/v1/completions)

bogh5gae · posted 2 months ago in Other

In vLLM v0.3.0, logprobs appears to be broken. My personal guess is that this is a Pydantic version issue: under Pydantic v2, `model_dump()` does not seem to handle logprobs correctly.
I printed the interface and found that the information returned by `dict()` is correct, but it cannot be converted to JSON.
Also, due to the refactoring of the openai server, the prompt parameter in the interface log is always None.
vllm/vllm/entrypoints/openai/serving_completion.py##line 298
vllm/vllm/engine/async_llm_engine.py##line 432
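A minimal reproduction sketch for the report above. The model name and the server port are assumptions for illustration, not values from the report; the point is just a request body with both `echo` and `logprobs` set, which exercises the logging and serialization paths in question.

```python
import json

# Hypothetical reproduction payload; "Qwen/Qwen-7B" and port 8000 below are
# placeholders, not values taken from the original report.
payload = {
    "model": "Qwen/Qwen-7B",
    "prompt": "Hello, my name is",
    "max_tokens": 5,
    "temperature": 0.0,
    "echo": True,    # return the prompt together with the completion
    "logprobs": 1,   # request log probabilities for the top candidate
}
print(json.dumps(payload))
# Send it against a locally running server, e.g.:
#   curl http://localhost:8000/v1/completions \
#        -H "Content-Type: application/json" -d '<the JSON printed above>'
```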


uemypmqf1#

Regarding prompt logging, I can fix that.
Regarding logprobs, do you have a reproduction? I get correctly formatted logprobs on 0.3.0:

completion = await client.completions.create(
    model=MODEL_NAME,
    prompt="Hello, my name is",
    max_tokens=5,
    temperature=0.0,
    logprobs=True,
)
Logprobs(text_offset=[0, 6, 10, 12, 15], token_logprobs=[-3.7315573692321777, -1.002060055732727, -0.12878480553627014, -0.8721659183502197, -0.7809031009674072], tokens=['▁Sarah', '▁and', '▁I', '▁am', '▁a'], top_logprobs=[{'▁Sarah': -3.7315573692321777}, {'▁and': -1.002060055732727}, {'▁I': -0.12878480553627014}, {'▁am': -0.8721659183502197}, {'▁a': -0.7809031009674072}])

5tmbdcev2#

The echo parameter may also have some problems.

INFO 02-02 09:05:41 async_llm_engine.py:431] Received request cmpl-e3ab58bfc7834026a608eee91a74dc15-0: prompt: None, prefix_pos: None,sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=1.0, top_p=1.0, top_k=-1, min_p=0.0, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=5, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True), prompt token ids: [9707, 11, 847, 829, 374], lora_request: None.
INFO 02-02 09:05:42 async_llm_engine.py:110] Finished request cmpl-e3ab58bfc7834026a608eee91a74dc15-0.
None
None
INFO:     127.0.0.1:46374 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/aioprometheus/asgi/middleware.py", line 184, in __call__
    await self.asgi_callable(scope, receive, wrapped_send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 762, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 782, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 163, in create_completion
    generator = await openai_serving_completion.create_completion(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_completion.py", line 334, in create_completion
    response = request_output_to_completion_response(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_completion.py", line 177, in request_output_to_completion_response
    top_logprobs = prompt_logprobs + output.logprobs
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'

vllm/entrypoints/openai/serving_completion.py##line 177

top_logprobs = prompt_logprobs + output.logprobs
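A minimal sketch of the failure and one possible guard (the variable names are illustrative, not vLLM's actual locals): when neither the prompt nor the output carried logprobs, both operands are None, and `+` raises exactly the TypeError in the traceback.

```python
prompt_logprobs = None
output_logprobs = None

try:
    top_logprobs = prompt_logprobs + output_logprobs
except TypeError as exc:
    # unsupported operand type(s) for +: 'NoneType' and 'NoneType'
    print(exc)

# Defensive variant: fall back to empty lists before concatenating.
top_logprobs = (prompt_logprobs or []) + (output_logprobs or [])
print(top_logprobs)  # []
```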

lztngnrs3#

What is the logprobs feature for?


w9apscun4#

"What is the logprobs feature for?"
According to the description, logprobs returns the log probabilities of the candidate tokens.
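For instance, a logprob value from the sample output in the first answer can be turned back into a probability with `exp` (natural log is assumed, as in the OpenAI API):

```python
import math

# A logprob is the natural-log probability the model assigned to a token.
token_logprob = -0.12878480553627014  # the value for '▁I' in the sample output
probability = math.exp(token_logprob)
print(round(probability, 3))  # 0.879: the model was ~88% confident in '▁I'
```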


llmtgqce5#

That issue is resolved for me: I was using a Qwen model, and the convert_id_to_token method in huggingface's tokenization_qwen.py did not convert to a string; it returned bytes for further processing. Modifying that file makes the parameter take effect.



qltillow6#

The logprobs issue is resolved: I was using a Qwen model, and the convert_id_to_token method in tokenization_qwen.py on huggingface does not convert to a string; it returns bytes for further processing. Modify the file as follows to make the parameter take effect.
Example:

def _convert_id_to_token(self, index: int) -> Union[bytes, str]:
        """Converts an id to a token, special tokens included"""
        if index in self.decoder:
            return self.decoder[index]
        raise ValueError("unknown ids")

->

def _convert_id_to_token(self, index: int) -> Union[bytes, str]:
        """Converts an id to a token, special tokens included"""
        if index in self.decoder:
            return self.decoder[index].decode("utf-8", errors=self.errors)
        raise ValueError("unknown ids")
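A small sketch of why the patch matters: bytes returned by the unpatched tokenizer cannot be serialized into the JSON logprobs payload, while decoded strings can (the sample token bytes below are illustrative):

```python
import json

token = b"\xe4\xbd\xa0"  # UTF-8 bytes for one token, as the unpatched method returns

try:
    json.dumps({token: -0.5})  # bytes dict keys are not JSON-serializable
except TypeError as exc:
    print(exc)

# After decoding, as the patched _convert_id_to_token does, it serializes fine.
decoded = token.decode("utf-8", errors="replace")
print(json.dumps({decoded: -0.5}, ensure_ascii=False))  # {"你": -0.5}
```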

@simon-mo The logprobs parameter is resolved; it was a tokenizer issue.
"Also, due to the refactoring of the openai server, the prompt parameter in the interface log is always None.
vllm/vllm/entrypoints/openai/serving_completion.py##line 298
vllm/vllm/engine/async_llm_engine.py##line 432"
I think the echo parameter is also affected by this spot:
vllm/vllm/entrypoints/openai/serving_completion.py##line 298
For vllm/vllm/entrypoints/openai/serving_completion.py##line 177, I think some guards could be added:

# fall back to empty values so the concatenation can never hit None + None
top_logprobs = (prompt_logprobs or []) + (output.logprobs or [])
output_text = (prompt_text or "") + (output.text or "")

a1o7rhls7#

Any fix or PR would be much appreciated.


u3r8eeie8#

I ran into the same problem. I've created PR #2995, which should cover both echo and logprobs.
