text-generation-inference 工具调用在v2.2.0版本中的表现比最新版本差,

sulc1iza 于 5个月前发布在其他

关注(0)|答案(4)|浏览(96)

系统信息

gpu=0
num_gpus=1
model=meta-llama/Meta-Llama-3.1-8B-Instruct
docker run -d \
--gpus "\"device=$gpu\"" \
--shm-size 16g \
-e HUGGING_FACE_HUB_TOKEN=$token \
-p 8082:80 \
-v $volume:/data \
--name Meta-Llama-3.1-8B \
ghcr.io/huggingface/text-generation-inference:sha-1cebccc@sha256:4ccb775aaaefc90df10b2de7ce17a1f00a07682c12ea9630e6e6fdfa10a1c05e \
--model-id $model \
--max-concurrent-requests $max_concurrent_request \
--max-total-tokens $max_total_token \
--max-input-length $max_input_length \
--waiting-served-ratio $wsr \
--num-shard $num_gpus \
--dtype bfloat16

操作系统：Ubuntu Linux
型号：meta-llama/Meta-Llama-3.1-8B-Instruct / meta-llama/Meta-Llama-3-8B-Instruct
硬件：A100 80G
出现问题的版本：v2.2.0
与最新版本相比：未知

信息

Docker
CLI直接使用

任务

一个官方支持的命令
自己的修改

复现步骤

启动Docker示例。
运行以下命令：

client = OpenAI(
        base_url=f"http://127.0.0.1:8082/v1",
        api_key="_",
    )

chat_completion = client.chat.completions.create(
    model="tgi",
    messages=messages,
    tools=tools,
    tool_choice="auto",
    max_tokens=max_tokens,
)

predictions = chat_completion.choices[0].message.tool_calls

text-generation-inference

来源：https://github.com/huggingface/text-generation-inference/issues/2413

4条答案

按热度按时间

woobm2wo1#

你好，@varad0309 ,感谢你打开这个问题。v2.2.0 大约在3周前发布，TGI自那以来已经修复了一些bug并进行了改进，这些改进在最新版本中可用。具体来说，昨天合并了一个与工具相关的bug修复 #2406 ,这可能会提高工具调用的响应速度。

我们将在接下来的一周/几周内发布一个更新版本，它应该包括这些修复以及许多其他改进！目前，我建议你使用最新版本或固定提交的版本，以确保你使用的是带有工具修复的版本。再次感谢！

赞(0）回复(0）举报 5个月前

bweufnob2#

感谢drbh的快速回复。我确实尝试过几个小时前提交的commit(更具体地说，是sha-1cebccc@sha256:4ccb775aaaefc90df10b2de7ce17a1f00a07682c12ea9630e6e6fdfa10a1c05e)。问题仍然存在。
我的观察：可用工具列表仍然没有被适当传递。

赞(0）回复(0）举报 5个月前

nxowjjhe3#

抱歉，我可能误解了问题。看起来工具的响应从2.2.0版本开始有所退化？你能分享一个输入和预期输出的例子吗？此外，你知道工具最后一次按照你的预期工作是什么时候吗(可能是一个版本或者最好的情况是上一次的提交哈希值)?谢谢！

赞(0）回复(0）举报 5个月前

7z5jn7bk4#

当然，这里有几个例子。很遗憾，我不知道它开始出现问题的那个版本之后的版本。我比较的版本是通过Docker镜像获得的：

(这个有效) 版本1 => ghcr.io/huggingface/text-generation-inference:latest
(这个无效) 版本2 => ghcr.io/huggingface/text-generation-inference:sha-1cebccc@sha256:4ccb775aaaefc90df10b2de7ce17a1f00a07682c12ea9630e6e6fdfa10a1c05e
示例：

Ground truth: [{'name': 'search_hotel', 'arguments': {'destination': 'Paris', 'check_in_date': '2022-05-01', 'check_out_date': '2022-05-10'}}]
Version 1: [Function(arguments={'check_in_date': '2022-05-01', 'check_out_date': '2022-05-10', 'location': 'Paris', 'num_guests': 1, 'num_rooms': 1}, name='search_hotel', description=None)]
Version 2: [Function(arguments={'number': 7}, name='find_hotels', description=None)]

Ground truth: [{'name': 'roll_dice', 'arguments': {'sides': 6, 'quantity': 1}}]
Version 1: [Function(arguments={'quantity': 1, 'sides': 6}, name='roll_dice', description=None)]
Version 2: [Function(arguments={'artist': 'tools', 'genre': 'RNG Tools'}, name='random.randint', description=None)]

Ground truth: [{'name': 'calculate_fuel_cost', 'arguments': {'distance': 500, 'fuel_price': 1.2, 'fuel_efficiency': 10}}]
Version 1: [Function(arguments={'distance': 500, 'fuel_efficiency': 10, 'fuel_price': 1.2}, name='calculate_fuel_cost', description=None)]
Version 2: [Function(arguments={'distance': 500, 'fuel_efficiency': 10, 'fuel_price': 1.2}, name='calculate_fuel_consumption', description=None)]

赞(0）回复(0）举报 5个月前

我来回答

text-generation-inference 工具调用在v2.2.0版本中的表现比最新版本差,

系统信息

信息

任务

复现步骤

4条答案

相关问题

热门标签

最新问答