text-generation-inference: the OpenAI-compatible API differs from the original OpenAI API when using tool calls

pkbketx9 · posted 2 months ago in Other

System Info
Recently, I have been trying to serve Qwen2 72B with the TGI 2.0.4 Docker image from a Jupyter Notebook. I use the LangChain langchain_openai.chat_models.ChatOpenAI client with a custom tool. The completion request succeeds, but when LangChain parses the response I get an error. After some digging, I found that TGI and OpenAI send responses in different formats when tool calls are involved.
When I send a POST request with tools to OpenAI, I get this JSON response:

...
"tool_calls": [
          {
            "id": "0",
            "type": "function",
            "function": {
              "description": null,
              "name": "retrieve_payment_status",
              "arguments": "{\"transaction_id\": \"T1001\"}"
            }
...

And this is what I get from my TGI service:

...
"tool_calls": [
          {
            "id": "0",
            "type": "function",
            "function": {
              "description": null,
              "name": "retrieve_payment_status",
              "arguments": {
                "transaction_id": "T1001"
              }
            }
...

Evidently, OpenAI formats the tool_calls[*]["function"]["arguments"] field as a JSON string, whereas TGI returns it as an already-parsed JSON object. I know this may well be OpenAI's own mistake, but the discrepancy makes the LangChain client (which is also OpenAI-compatible) raise an error when parsing TGI responses.
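
In practice this means any OpenAI-compatible client has to re-serialize the parsed object before handing the response over. A minimal sketch of that coercion, purely for illustration (the helper name is made up; the hack in the first answer below does essentially the same thing):

import json

def ensure_arguments_string(tool_call: dict) -> dict:
    """Coerce function.arguments back into the JSON string the OpenAI schema expects."""
    args = tool_call["function"]["arguments"]
    if not isinstance(args, str):
        # TGI returns a parsed object here; OpenAI returns a JSON-encoded string.
        tool_call["function"]["arguments"] = json.dumps(args)
    return tool_call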

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

The source code that triggers the error:

from typing import Type

from langchain.tools import BaseTool
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field  # or langchain_core.pydantic_v1 on older LangChain versions


class RetrievePaymentStatusSchema(BaseModel):
    transaction_id: str = Field(description='The transaction id.')
    

class RetrievePaymentStatus(BaseTool):
    name: str = 'retrieve_payment_status'
    description: str = 'Get payment status of a transaction'
    args_schema: Type[BaseModel] = RetrievePaymentStatusSchema

    def _run(self, transaction_id: str) -> str:
        return 'PENDING'

retrieve_payment_status_tool = RetrievePaymentStatus()

llm = ChatOpenAI(
    model='MIXTRAL',
    base_url='http://10.244.3.28:8080/v1',
    api_key='aboba',
    max_tokens=500,
)

llm_with_tools = llm.bind_tools([retrieve_payment_status_tool], tool_choice="auto")
llm_with_tools.invoke(input="What's the status of my transaction T1001?")

The error:

ValidationError: 1 validation error for AIMessage
invalid_tool_calls -> 0 -> args
  str type expected (type=type_error.str)

Expected behavior

I think TGI's tool_calls should look like this as well:

...
"tool_calls": [
          {
            "id": "0",
            "type": "function",
            "function": {
              "description": null,
              "name": "retrieve_payment_status",
              "arguments": "{\"transaction_id\": \"T1001\"}"
            }
...

Update: I think the same applies to streaming responses. There, OpenAI gradually (token by token) fills the "arguments" field of each tool_calls entry with the actual arguments of the tool being called, while TGI puts the entire model response into this field. Oddly enough, the original problem described above does not occur when I use streaming.
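
Collecting the chunks looks roughly like this (a sketch assuming the official openai Python client; tools stands for the same tool schema as in the reproduction above and is not defined here):

from openai import OpenAI

client = OpenAI(base_url='http://10.244.3.28:8080/v1', api_key='aboba')

stream = client.chat.completions.create(
    model='MIXTRAL',
    messages=[{'role': 'user', 'content': "What's the status of my transaction T1001?"}],
    tools=tools,  # assumed: the tool schema bound in the reproduction above
    stream=True,
)

arguments = ''
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        # Against OpenAI each delta carries the next few tokens of the JSON-string
        # arguments; against TGI it carries pieces of the entire model response.
        arguments += delta.tool_calls[0].function.arguments or ''

print(arguments)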

To illustrate, here is what I get if I concatenate all the response chunks collected for the example above:

{
  "function": {
    "_name": "retrieve_payment_status",
    "transaction_id": "T1001"
  }
}<|im_end|>

And this is what I get when using the actual OpenAI API:

{"transaction_id":"T1001"}

8mmmxcuj1#

Hi! I can confirm the same problem with Llama 3 8B.
For now I have worked around it with a hack that converts the parsed JSON arguments into a JSON string, but I would prefer to follow the OpenAI protocol.
My dirty hack:
The _generate method of the OpenAI chat model:

# An override of ChatOpenAI._generate that normalizes the TGI response before parsing
def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        if self.streaming:
            stream_iter = self._stream(
                messages, stop=stop, run_manager=run_manager, **kwargs
            )
            return generate_from_stream(stream_iter)
        message_dicts, params = self._create_message_dicts(messages, stop)
        params = {**params, **kwargs}

        last_message = message_dicts[-1]
        if last_message["role"] == 'tool':
            if 'tool_choice' in params:
                removed_value = params.pop('tool_choice', None)
                print(f"Из параметров удален параметр 'tool_choice' со значением '{removed_value}'")

        response = normalize_chat_completion(self.client.create(messages=message_dicts, **params))
        return self._create_chat_result(response)

My conversion (please don't blame me for my terrible code):

import json
from typing import Any

from openai.types.chat import ChatCompletion


def normalize_chat_completion(response: ChatCompletion) -> Any:
    """
    Normalize a HuggingFace Text Generation Inference response to the OpenAI response format:
        - tool_call.function.arguments is converted to a JSON-formatted string,
        - choice.message.content is set to " " if it came back as null.

    Args:
        response (ChatCompletion): The response from HuggingFace Text Generation Inference.

    Return (ChatCompletion):
        The normalized response from HuggingFace Text Generation Inference.
    """
    choices = []
    for choice in response.choices:
        if choice.message.tool_calls:
            tool_calls = []
            for tool_call in choice.message.tool_calls:
                tool_call.function.arguments = json.dumps(tool_call.function.arguments)
                tool_calls.append(tool_call)
            choice.message.tool_calls = tool_calls
        if not choice.message.content:
            choice.message.content = " "
        choices.append(choice)
    return ChatCompletion(
        id=response.id,
        choices=choices,
        created=response.created,
        model=response.model,
        object="chat.completion",
        system_fingerprint=response.system_fingerprint,
        usage=response.usage
        )

This part of the code eliminates the tool-call loop (once a tool result message is present, the forced tool_choice is dropped so the model can answer instead of calling the tool again):

last_message = message_dicts[-1]
if last_message["role"] == 'tool':
    if 'tool_choice' in params:
        removed_value = params.pop('tool_choice', None)
        print(f"Removed the 'tool_choice' parameter with value '{removed_value}'")

2jcobegt2#

I temporarily solved this by creating a proxy service that intercepts the TGI model's responses and reformats them to match the OpenAI API. But your solution looks neat too :)
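
Such a proxy essentially only has to rewrite the arguments field in transit. A minimal non-streaming sketch, assuming FastAPI and httpx (the actual service was not shared, so every name below is illustrative):

import json

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
TGI_URL = 'http://10.244.3.28:8080'  # assumed: the TGI endpoint from the question

@app.post('/v1/chat/completions')
async def chat_completions(request: Request) -> JSONResponse:
    # Streaming (stream=True) would need SSE handling and is not covered here.
    payload = await request.json()
    async with httpx.AsyncClient(timeout=120) as client:
        upstream = await client.post(f'{TGI_URL}/v1/chat/completions', json=payload)
    body = upstream.json()
    # Re-encode parsed tool-call arguments as the JSON string OpenAI clients expect.
    for choice in body.get('choices', []):
        for tool_call in (choice.get('message') or {}).get('tool_calls') or []:
            args = tool_call.get('function', {}).get('arguments')
            if args is not None and not isinstance(args, str):
                tool_call['function']['arguments'] = json.dumps(args)
    return JSONResponse(content=body, status_code=upstream.status_code)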
