System Info
Operating system:
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
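Model used: mistralai/Mistral-7B-Instruct-v0.3
Hardware: 1x L4 GPU
I tried the latest version of the Docker image.
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Start the server with the following command: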
docker run --gpus all --shm-size 1g -p 8080:80 -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN -v $PWD/data:/data ghcr.io/huggingface/text-generation-inference:latest --model-id mistralai/Mistral-7B-Instruct-v0.3
Then send the following request:
import requests

conversation = [
    {"role": "user", "content": "What's the weather like in Paris?"},
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]
response = requests.post(
    url="http://localhost:8080/v1/chat/completions",
    json={
        "messages": conversation,
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "temperature": 0.1,
        "tool_choice": "required",
        # "tool_prompt": "\"You will be presented with a JSON schema representing a set of tools.\nIf the user request lacks of sufficient information to make a precise tool selection: Do not invent any tool's properties, instead notify with an error message.\n\nJSON Schema:\n\"",
        "tools": tools,
        "max_tokens": 1000,
    },
)
Error:
(task, pid=12212) 2024-05-29T14:56:04.338119Z INFO text_generation_router: router/src/main.rs:369: Connected
(task, pid=12212) 2024-05-29T14:56:04.338153Z WARN text_generation_router: router/src/main.rs:383: Invalid hostname, defaulting to 0.0.0.0
(task, pid=12212) 2024-05-29T14:58:01.008313Z INFO chat_completions{total_time="5.576392398s" validation_time="1.850855ms" queue_time="130.083µs" inference_time="5.574411606s" time_per_token="61.937906ms" seed="Some(14966871623831239824)"}: text_generation_router::server: router/src/server.rs:322: Success
(task, pid=12212) thread 'tokio-runtime-worker' panicked at router/src/infer.rs:407:44:
(task, pid=12212) Tool with name required not found
(task, pid=12212) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(task, pid=12212) 2024-05-29T14:58:07.628698Z ERROR text_generation_launcher: Webserver Crashed
(task, pid=12212) 2024-05-29T14:58:07.629433Z INFO text_generation_launcher: Shutting down shards
(task, pid=12212) 2024-05-29T14:58:07.631861Z INFO shard-manager: text_generation_launcher: Terminating shard rank=0
(task, pid=12212) 2024-05-29T14:58:07.631937Z INFO shard-manager: text_generation_launcher: Waiting for shard to gracefully shutdown rank=0
(task, pid=12212) 2024-05-29T14:58:09.433647Z INFO shard-manager: text_generation_launcher: shard terminated rank=0
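The panic message suggests that the router looked up the string "required" as if it were a tool name. Until the crash is fixed, a client-side guard along these lines might help avoid sending such a request. This is only a sketch: check_tool_choice is a hypothetical helper, not part of TGI or requests, and it assumes the server accepts the name of a declared tool as tool_choice, which the log hints at but does not confirm.

```python
def check_tool_choice(tool_choice, tools):
    """Return None if tool_choice names a declared tool, else an error string.

    Hypothetical helper; assumes tool_choice is expected to be a tool name.
    """
    # Collect the names of all declared function tools.
    declared = [t["function"]["name"] for t in tools if t.get("type") == "function"]
    if tool_choice in declared:
        return None
    return f"tool_choice {tool_choice!r} does not match any declared tool: {declared}"


# Trimmed-down version of the tools list from the reproduction.
tools = [{"type": "function", "function": {"name": "get_current_weather"}}]

# Prints an error string, so the request should not be sent.
print(check_tool_choice("required", tools))
# Prints None, so the request can go ahead.
print(check_tool_choice("get_current_weather", tools))
```

Calling the helper before requests.post and skipping the call when it returns a string keeps a malformed tool_choice from reaching the server at all.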
Expected behavior
I think the server should return an error message such as {'error': 'Input validation error: Tool with name required not found', 'error_type': 'validation'}, but it should not crash.
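To make the expected behavior concrete, here is a minimal Python sketch of a lookup that reports an unknown tool as a validation error instead of aborting. It is an illustration only, not TGI's actual code (the router is written in Rust); resolve_tool is a hypothetical name, and the 422 status code is an assumption about how such an error might be surfaced.

```python
def resolve_tool(tool_choice, tools):
    """Look up a tool by name and report a miss as data, not as a crash.

    Hypothetical sketch; returns a (status_code, body) pair.
    """
    for tool in tools:
        if tool["function"]["name"] == tool_choice:
            return 200, {"tool": tool["function"]["name"]}
    # Unknown name: answer with a validation error and keep serving.
    return 422, {
        "error": f"Input validation error: Tool with name {tool_choice} not found",
        "error_type": "validation",
    }


# Trimmed-down version of the tools list from the reproduction.
tools = [{"type": "function", "function": {"name": "get_current_weather"}}]

status, body = resolve_tool("required", tools)
print(status, body)
```

The point of the design is that a bad tool_choice only fails the one request that carried it, while the server process and its loaded shards stay up for other clients.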
1 answer
Hi, I can confirm this; I have run into the same issue.