text-generation-inference: "EOF while parsing an object..." error with tool_calls

mi7gmzs6  posted 2 months ago in Other

System Info

Hello!
Thank you very much for your product, it is very helpful!

System info:

2024-06-30T00:30:49.387947Z  INFO text_generation_launcher: Runtime environment:
Target: x86_64-unknown-linux-gnu
Cargo version: 1.79.0
Commit sha: 192d49af0bfa71e886c27856232031f3935628ff
Docker label: sha-192d49a
nvidia-smi:
Sun Jun 30 00:30:47 2024       
   +-----------------------------------------------------------------------------------------+
   | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
   |-----------------------------------------+------------------------+----------------------+
   | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
   |                                         |                        |               MIG M. |
   |=========================================+========================+======================|
   |   0  NVIDIA A100-SXM4-80GB          Off |   00000000:8B:00.0 Off |                    0 |
   | N/A   26C    P0             59W /  500W |       3MiB /  81920MiB |      0%      Default |
   |                                         |                        |             Disabled |
   +-----------------------------------------+------------------------+----------------------+
   |   1  NVIDIA A100-SXM4-80GB          Off |   00000000:8C:00.0 Off |                    0 |
   | N/A   29C    P0             62W /  500W |       3MiB /  81920MiB |      0%      Default |
   |                                         |                        |             Disabled |
   +-----------------------------------------+------------------------+----------------------+
   |   2  NVIDIA A100-SXM4-80GB          Off |   00000000:8D:00.0 Off |                    0 |
   | N/A   29C    P0             65W /  500W |       3MiB /  81920MiB |      0%      Default |
   |                                         |                        |             Disabled |
   +-----------------------------------------+------------------------+----------------------+
   |   3  NVIDIA A100-SXM4-80GB          Off |   00000000:8E:00.0 Off |                    0 |
   | N/A   28C    P0             60W /  500W |       3MiB /  81920MiB |      0%      Default |
   |                                         |                        |             Disabled |
   +-----------------------------------------+------------------------+----------------------+
                                                                                            
   +-----------------------------------------------------------------------------------------+
   | Processes:                                                                              |
   |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
   |        ID   ID                                                               Usage      |
   |=========================================================================================|
   |  No running processes found                                                             |
   +-----------------------------------------------------------------------------------------+
xpu-smi:
N/A
2024-06-30T00:30:49.387995Z  INFO text_generation_launcher: Args {
    model_id: "/meta-llama/Meta-Llama-3-8B-Instruct",
    revision: None,
    validation_workers: 2,
    sharded: None,
    num_shard: None,
    quantize: None,
    speculate: None,
    dtype: None,
    trust_remote_code: false,
    max_concurrent_requests: 128,
    max_best_of: 2,
    max_stop_sequences: 4,
    max_top_n_tokens: 50,
    max_input_tokens: Some(
        8191,
    ),
    max_input_length: None,
    max_total_tokens: Some(
        8192,
    ),
    waiting_served_ratio: 0.3,
    max_batch_prefill_tokens: Some(
        8242,
    ),
    max_batch_total_tokens: None,
    max_waiting_tokens: 20,
    max_batch_size: None,
    cuda_graphs: None,
    hostname: "48eb07d0d604",
    port: 80,
    shard_uds_path: "/tmp/text-generation-server",
    master_addr: "localhost",
    master_port: 29500,
    huggingface_hub_cache: Some(
        "/data",
    ),
    weights_cache_override: None,
    disable_custom_kernels: false,
    cuda_memory_fraction: 1.0,
    rope_scaling: None,
    rope_factor: None,
    json_output: false,
    otlp_endpoint: None,
    otlp_service_name: "text-generation-inference.router",
    cors_allow_origin: [],
    watermark_gamma: None,
    watermark_delta: None,
    ngrok: false,
    ngrok_authtoken: None,
    ngrok_edge: None,
    tokenizer_config_path: None,
    disable_grammar_support: false,
    env: true,
    max_client_batch_size: 4,
    lora_adapters: None,
}

Model info:

{
    "model_id": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "model_sha": null,
    "model_dtype": "torch.float16",
    "model_device_type": "cuda",
    "model_pipeline_tag": null,
    "max_concurrent_requests": 128,
    "max_best_of": 2,
    "max_stop_sequences": 4,
    "max_input_tokens": 8191,
    "max_total_tokens": 8192,
    "waiting_served_ratio": 0.3,
    "max_batch_total_tokens": 451520,
    "max_waiting_tokens": 20,
    "max_batch_size": null,
    "validation_workers": 2,
    "max_client_batch_size": 4,
    "router": "text-generation-router",
    "version": "2.1.0",
    "sha": "192d49af0bfa71e886c27856232031f3935628ff",
    "docker_label": "sha-192d49a"
}

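TGI version: 2.1.0

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

When I run the following query, which requires the model to call a tool: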

curl --location 'http://10.146.240.74:30000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "content": "You are an assistant who can write the user'\''s last response to a file.\nDetermine the class name from the user description and use it as the name of the txt file, for example CreateIssues.txt.\nSave the file in the raw_data folder.\nRecord the content unchanged as provided by the user and nothing else.\nReturn only the path to the file, for example /raw_data/CreateIssues.txt. Work autonomously according to your specialty, using the tools available to you. Answer briefly and only in your specialty.",
            "role": "system"
        },
        {
            "role": "user",
            "content": "Analyze the content and write to file"
        },
        {
            "role": "user",
            "name": "controller_analizer",
            "content": "Controller '\''CreateIssuesController'\''\n\nМетоды:\n\nGET /api/jira/issues/createFromExcel\n\nНазначение метода: Метод массового создания задач в Jira из Excel файла.\n\nЗаголовки запроса:\nContent-Type: multipart/form-data\n\nВходные параметры:\nПараметр: file\n- Описание: xlsx файл с задачами, которые надо создать\n- Тип: MultipartFile\n- Обязательность: Да\n- Пример значение: файл.xlsx\n\nПример запроса:\nPOST /api/jira/issues/createFromExcel HTTP/1.1\nHost: example.com\nContent-Type: multipart/form-data; boundary=---------------------------1234567890\n\n-----------------------------1234567890\nContent-Disposition: form-data; name=\"file\"; filename=\"file.xlsx\"\nContent-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\n\n... файл.xlsx...\n\n-----------------------------1234567890--\n\nВыходные параметры:\nПараметр: response\n- Описание: Список успешно созданных задач и список не созданных задач с описанием ошибок\n- Тип: JiraTaskCreateResponse\n- Обязательность: Да\n- Пример значение: {\"createdTasks\": [...], \"errors\": [...]}\n\nПример ответа:\nHTTP/1.1 201 Created\nContent-Type: application/json\n\n{\n  \"createdTasks\": [...],\n  \"errors\": [...]\n}\n\nКоды ответа:\n201 Created - успешное создание задач\n400 Bad Request - ошибка при создании задач"
        }
    ],
    "model": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "max_tokens": 1024,
    "temperature": 0.01,
    "n": 50,
    "top_p": 0.9,
    "stream": false,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "write_document",
                "description": "Create and save a text document. Return path of the saved document file.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "content": {
                            "description": "Text content to be written into the document.",
                            "type": "string"
                        },
                        "file_name": {
                            "description": "File path to save the document.",
                            "type": "string"
                        }
                    },
                    "required": [
                        "content",
                        "file_name"
                    ]
                }
            }
        }
    ],
    "tool_choice": "auto"
}'

I get this error:

{
    "error": "EOF while parsing an object at line 917 column 1",
    "error_type": "Input validation error"
}

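If you send the same request with "stream": true, the result is:
output_raw.txt
output.txt
In the output file output.txt, all the values of arguments are collected into one line, and two things look odd:

  1. My JSON Schema and the default tool were added to the text under the content parameter.
  2. The JSON Schema is missing its final closing character }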
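
One way to confirm the second point is to join the streamed arguments fragments and try to parse the result. The following is a minimal sketch, assuming the fragments have already been pulled out of the stream into a list of strings (that extraction step depends on the chunk format and is not shown). `check_arguments` is a hypothetical helper, not part of TGI.

```python
import json


def check_arguments(fragments):
    """Join streamed argument fragments and report whether they form valid JSON.

    Returns (True, parsed_value) on success, or (False, error_description)
    when the joined text is not complete JSON (for example a missing "}").
    """
    joined = "".join(fragments)
    try:
        return True, json.loads(joined)
    except json.JSONDecodeError as exc:
        return False, f"{exc.msg} at line {exc.lineno} column {exc.colno}"


# Illustrative fragments only; real fragments would come from output_raw.txt.
complete = ['{"content": "abc", ', '"file_name": "/raw_data/CreateIssues.txt"}']
truncated = ['{"content": "abc", ', '"file_name": "/raw_data/CreateIssues.txt"']

print(check_arguments(complete))
print(check_arguments(truncated))
```

If the joined text fails to parse only because the closing brace is missing, that points to the generation stopping early rather than to a client-side problem.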

Expected behavior

Expected result:

{
    "id": "",
    "object": "chat.completion",
    "created": 1719709113,
    "model": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "system_fingerprint": "2.1.0-sha-192d49a",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "0",
                        "type": "function",
                        "function": {
                            "description": null,
                            "name": "write_document",
                            "arguments": {
                                "content": "Controller 'CreateIssuesController'\n\nМетоды:\n\nGET /api/jira/issues/createFromExcel\n\nНазначение метода: Метод массового создания задач в Jira из Excel файла.\n\nЗаголовки запроса:\nContent-Type: multipart/form-data\n\nВходные параметры:\nПараметр: file\n- Описание: xlsx файл с задачами, которые надо создать\n- Тип: MultipartFile\n- Обязательность: Да\n- Пример значение: файл.xlsx\n\nПример запроса:\nPOST /api/jira/issues/createFromExcel HTTP/1.1\nHost: example.com\nContent-Type: multipart/form-data; boundary=---------------------------1234567890\n\n-----------------------------1234567890\nContent-Disposition: form-data; name=\"file\"; filename=\"file.xlsx\"\nContent-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\n\n... файл.xlsx...\n\n-----------------------------1234567890--\n\nВыходные параметры:\nПараметр: response\n- Описание: Список успешно созданных задач и список не созданных задач с описанием ошибок\n- Тип: JiraTaskCreateResponse\n- Обязательность: Да\n- Пример значение: {\"createdTasks\": [...], \"errors\": [...]}\n\nПример ответа:\nHTTP/1.1 201 Created\nContent-Type: application/json\n\n{\n  \"createdTasks\": [...],\n  \"errors\": [...]\n}\n\nКоды ответа:\n201 Created - успешное создание задач\n400 Bad Request - ошибка при создании задач",
                                "file_name": "/raw_data/CreateIssues.txt"
                            }
                        }
                    }
                ]
            },
            "logprobs": null,
            "finish_reason": "eos_token"
        }
    ],
    "usage": {
        "prompt_tokens": 647,
        "completion_tokens": 565,
        "total_tokens": 1212
    }
}
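
When reading such a response in client code, note that the arguments field is a JSON object here, whereas the OpenAI API returns it as a JSON-encoded string. A small sketch that accepts both shapes follows; `tool_call_arguments` is a hypothetical helper and the response dict is a shortened stand-in for the one above.

```python
import json
from typing import Any


def tool_call_arguments(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Return the arguments of one tool call as a dict.

    Handles both an already-parsed object and a JSON-encoded string.
    """
    arguments = tool_call["function"]["arguments"]
    if isinstance(arguments, str):
        return json.loads(arguments)
    return arguments


# Shortened response in the shape shown above (content trimmed for the example).
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "0",
                        "type": "function",
                        "function": {
                            "name": "write_document",
                            "arguments": {
                                "content": "Controller 'CreateIssuesController'",
                                "file_name": "/raw_data/CreateIssues.txt",
                            },
                        },
                    }
                ],
            }
        }
    ]
}

for call in response["choices"][0]["message"]["tool_calls"]:
    args = tool_call_arguments(call)
    print(call["function"]["name"], "->", args["file_name"])
```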

Thank you!

cgvd09ve  1#

Hello!
The problem is that some unescaped text characters were passed in the input.
It helps to force the text into a JSON string, for example:

output_text = json.dumps(result["output_text"], ensure_ascii=False)
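
A slightly fuller sketch of that workaround, under the assumption that the goal is for the message content to carry escaped sequences (a literal backslash followed by n) instead of raw newlines, so that text the model copies into the tool arguments is already valid inside a JSON string. `escape_for_prompt` and the sample text are hypothetical; the outer quotes that `json.dumps` adds are stripped so they do not end up in the prompt.

```python
import json


def escape_for_prompt(text: str) -> str:
    """JSON-escape newlines, tabs and quotes in text and drop the outer quotes."""
    return json.dumps(text, ensure_ascii=False)[1:-1]


# Shortened stand-in for the controller description used in this thread.
raw = 'Controller \'CreateIssuesController\'\n\nМетоды:\n{\n  "createdTasks": [...]\n}'
escaped = escape_for_prompt(raw)

payload = {
    "model": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "messages": [
        {"role": "user", "content": "Analyze the content and write to file"},
        {"role": "user", "content": escaped},
    ],
}

# Serialize the whole request once more; this is the body sent to the server.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```

The escaped text decodes back to the original when wrapped in quotes and parsed as JSON, so nothing is lost; whether this avoids the error in every case is not established in this thread.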

Example request:

curl --location 'http://10.146.240.74:30000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "content": "You are an assistant who can write the user'\''s last response to a file.\nDetermine the class name from the user description and use it as the name of the txt file, for example CreateIssues.txt.\nSave the file in the raw_data folder.\nRecord the content unchanged as provided by the user and nothing else.\nReturn only the path to the file, for example /raw_data/CreateIssues.txt. Work autonomously according to your specialty, using the tools available to you. Answer briefly and only in your specialty.",
            "role": "system"
        },
        {
            "role": "user",
            "content": "Analyze the content and write to file"
        },
        {
            "role": "user",
            "name": "controller_analizer",
            "content": "Описание контроллера '\''CreateIssuesController'\''\\nМетоды:\\nGET /api/jira/issues/createFromExcel\\nНазначение метода: Метод массового создания задач в Jira из Excel файла.\\nЗаголовки запроса:\\nContent-Type: multipart/form-data\\nВходные параметры:\\nПараметр: file\\n- Описание: xlsx файл с задачами, которые надо создать\\n- Тип: MultipartFile\\n- Обязательность: Да\\n- Пример значение: файл.xlsx\\nПример запроса:\\nPOST /api/jira/issues/createFromExcel HTTP/1.1\\nHost: example.com\\nContent-Type: multipart/form-data; boundary=---------------------------1234567890\\n-----------------------------1234567890\\nContent-Disposition: form-data; name=\"file\"; filename=\"file.xlsx\"\\nContent-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\\n... файл.xlsx...\\n-----------------------------1234567890--\\nВыходные параметры:\\nПараметр: response\\n- Описание: Список успешно созданных задач и список не созданных задач с описанием ошибок\\n- Тип: JiraTaskCreateResponse\\n- Обязательность: Да\\n- Пример значение: {\\n\\t\"createdTasks\": [...],\\n\\t\"errors\": [...]}\\nПример ответа:\\nHTTP/1.1 201 Created\\nContent-Type: application/json\\n{\\n\\t\"createdTasks\": [...],\\n\\t\"errors\": [...]}\\nКоды ответа:\\n201 Created - успешное создание задач\\n400 Bad Request - ошибка при создании задач"
        }
    ],
    "model": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "max_tokens": 1024,
    "temperature": 0.01,
    "n": 50,
    "top_p": 0.9,
    "stream": false,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "write_document",
                "description": "Create and save a text document. Return path of the saved document file.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "content": {
                            "description": "Text content to be written into the document.",
                            "type": "string"
                        },
                        "file_name": {
                            "description": "File path to save the document.",
                            "type": "string"
                        }
                    },
                    "required": [
                        "content",
                        "file_name"
                    ]
                }
            }
        }
    ],
    "tool_choice": "auto"
}'

Response:

{
    "id": "",
    "object": "chat.completion",
    "created": 1719749467,
    "model": "/meta-llama/Meta-Llama-3-8B-Instruct",
    "system_fingerprint": "2.1.0-sha-192d49a",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "0",
                        "type": "function",
                        "function": {
                            "description": null,
                            "name": "write_document",
                            "arguments": {
                                "content": "Описание контроллера 'CreateIssuesController'\nМетоды:\nGET /api/jira/issues/createFromExcel\nНазначение метода: Метод массового создания задач в Jira из Excel файла.\nЗаголовки запроса:\nContent-Type: multipart/form-data\nВходные параметры:\nПараметр: file\n- Описание: xlsx файл с задачами, которые надо создать\n- Тип: MultipartFile\n- Обязательность: Да\n- Пример значение: файл.xlsx\nПример запроса:\nPOST /api/jira/issues/createFromExcel HTTP/1.1\nHost: example.com\nContent-Type: multipart/form-data; boundary=---------------------------1234567890\n-----------------------------1234567890\nContent-Disposition: form-data; name=\"file\"; filename=\"file.xlsx\"\nContent-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\n... файл.xlsx...\n-----------------------------1234567890--\nВыходные параметры:\nПараметр: response\n- Описание: Список успешно созданных задач и список не созданных задач с описанием ошибок\n- Тип: JiraTaskCreateResponse\n- Обязательность: Да\n- Пример значение: {\n\t\"createdTasks\": [...],\n\t\"errors\": [...]}\nПример ответа:\nHTTP/1.1 201 Created\nContent-Type: application/json\n{\n\t\"createdTasks\": [...],\n\t\"errors\": [...]}\nКоды ответа:\n201 Created - успешное создание задач\n400 Bad Request - ошибка при создании задач",
                                "file_name": "/raw_data/CreateIssues.txt"
                            }
                        }
                    }
                ]
            },
            "logprobs": null,
            "finish_reason": "eos_token"
        }
    ],
    "usage": {
        "prompt_tokens": 695,
        "completion_tokens": 380,
        "total_tokens": 1075
    }
}
sxpgvts3  2#

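ishelaputov, are you only running that script on Llama 3 8B Instruct?
I tried to replicate your script. It works fine when I use gpt-4 through openai, but when I point it at the latest TGI version as the endpoint, it fails as soon as I try to use tools (the problem does not occur if I comment out tool_choice="auto").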

2exbekwf  3#

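@RonanKMcGovern, yes, on Llama 3 8B Instruct and Llama 3 70B Instruct.
If tool_choice="auto" is removed, the behavior is the same as you describe: the request is executed.
In LangChain I do this through ChatOpenAI, and that produces an error as well.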

mlmc2os5  4#

Sorry, I'm confused. Let me see if I can repeat what you are saying.

  1. When you leave tool_choice="auto" in, you get this error:
python llama3_raw.py
Traceback (most recent call last):
  File "/Users/ronanmcgovern/TR/function-calling-v4/tests/llama3_raw.py", line 196, in <module>
    run_conversation()
  File "/Users/ronanmcgovern/TR/function-calling-v4/tests/llama3_raw.py", line 148, in run_conversation
    response = client.chat.completions.create(
  File "/Users/ronanmcgovern/TR/function-calling-v4/trelisEnv/lib/python3.10/site-packages/openai/_utils/_utils.py", line 277, in wrapper
    return func(*args, **kwargs)
  File "/Users/ronanmcgovern/TR/function-calling-v4/trelisEnv/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 643, in create
    return self._post(
  File "/Users/ronanmcgovern/TR/function-calling-v4/trelisEnv/lib/python3.10/site-packages/openai/_base_client.py", line 1261, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/Users/ronanmcgovern/TR/function-calling-v4/trelisEnv/lib/python3.10/site-packages/openai/_base_client.py", line 942, in request
    return self._request(
  File "/Users/ronanmcgovern/TR/function-calling-v4/trelisEnv/lib/python3.10/site-packages/openai/_base_client.py", line 1026, in _request
    return self._retry_request(...

Basically, there is some error in _base_client (why? I'm not sure)

  2. If I remove tool_choice="auto", then I simply get no tool call:
llama3_raw.py
Response:  ChatCompletion(id='', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content="San Francisco's weather is notoriously changeable, but I'll give you a general idea!\n\nSan Francisco is known for its mild and cool oceanic climate (Köppen climate type: Cfb). Here's what you can expect:\n\n1. **Cool summers:** Daytime temperatures usually range from 60°F (16°C) to 75°F (24°C), while nighttime temperatures can dip to around 50°F (10°C) to 60°F (16°C).\n2. **", role='assistant', function_call=None, tool_calls=None))], created=1720542803, model='Trelis/Meta-Llama-3-8B-Instruct-function-calling', object='chat.completion', service_tier=None, system_fingerprint='2.1.2-dev0-sha-4c976fb', usage=CompletionUsage(completion_tokens=100, prompt_tokens=19, total_tokens=119))
ChatCompletionMessage(content="San Francisco's weather is notoriously changeable, but I'll give you a general idea!\n\nSan Francisco is known for its mild and cool oceanic climate (Köppen climate type: Cfb). Here's what you can expect:\n\n1. **Cool summers:** Daytime temperatures usually range from 60°F (16°C) to 75°F (24°C), while nighttime temperatures can dip to around 50°F (10°C) to 60°F (16°C).\n2. **", role='assistant', function_call=None, tool_calls=None)

The script is this:

import os
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from the .env file in the parent directory
load_dotenv('../.env')

# Get the API URL and model name from environment variables
openai_api_base = os.getenv('API_URL') + 'v1/'
model_name = os.getenv('MODEL_NAME')
api_key = os.getenv('API_KEY', default="EMPTY")

# Initialize the OpenAI client
client = OpenAI(
    base_url=openai_api_base,
    api_key=api_key,
)

def run_conversation():
    # Step 1: send the conversation and available functions to the model
    messages = [{"role": "user", "content": "What's the weather like in San Francisco?"}]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }
    ]
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        tools=tools,
        # tool_choice="auto",  # auto is default, but we'll be explicit
    )
    print("Response: ", response)
    response_message = response.choices[0].message
    print(response_message)

run_conversation()

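Are we both talking about the same thing? Are we aligned?
On a related topic, do you have any idea where in this repo to find the code that a) injects the tools into the prompt when the LLM generates and b) parses the response so that the server can return an API response? I have opened an issue on that here.
Many thanks.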

wvt8vs2t  5#

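Hello!
To rule out any defect in the wrappers (OpenAI or LangChain), I reproduced the error directly from Postman. My example request is above. By the way, you can see there how the tools are passed in for the LLM prompt; they are in the structure below the messages list.
You and I are talking about the same thing, right? If tool_choice is specified and the message contains certain characters or something else that leads to EOF, the error occurs. If tool_choice is not specified at all, there is no way for a tool to be chosen, so such a request loses its purpose, but the EOF error does not occur.
Thank you very much!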
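
For readers without Postman, the same wrapper-free check can be sketched with the Python standard library alone. This is an illustration under assumptions: the helper names are hypothetical, the base URL is whatever your TGI endpoint is, and `send` returns the server's error body instead of raising, so that a message such as the EOF error above stays visible.

```python
import json
import urllib.error
import urllib.request


def build_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build a POST to the OpenAI-compatible chat completions route."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> tuple[int, str]:
    """Send the request and return (status, body), also for HTTP error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8")


def with_and_without_tool_choice(payload: dict) -> tuple[dict, dict]:
    """Return two copies of payload: one with tool_choice="auto", one without it."""
    with_choice = {**payload, "tool_choice": "auto"}
    without_choice = {k: v for k, v in payload.items() if k != "tool_choice"}
    return with_choice, without_choice
```

Calling `send(build_request(base_url, variant))` for each of the two variants returned by `with_and_without_tool_choice` shows whether the error depends on tool_choice being present, independently of any client library.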

iyr7buue  6#

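@RonanKMcGovern, it seems I misunderstood your question. Unfortunately, I do not know how the tools are inserted into the request that reaches the model.
The related issue you mentioned is helpful, thank you very much!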
