text-generation-inference: No .bin weights found for model

Posted by ppcbkaq5 2 months ago in Other

System Info
After running this command in Docker for https://huggingface.co/huggingface/falcon-40b-gptq?text=My+name+is+Lewis+and+I+like+to, I got this error:
huggingface_hub.utils._errors.EntryNotFoundError: No .bin weights found for model huggingface/falcon-40b-gptq and revision None.
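
One way to confirm what the error is reporting is to list which weight formats the repository actually publishes. The sketch below is only an illustration and is not part of text-generation-inference: `group_weight_files`, `list_hub_weight_files`, and `WEIGHT_EXTENSIONS` are hypothetical names, the example file names are made up, and the optional `list_repo_files` call from `huggingface_hub` needs that package and network access.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical set of extensions worth reporting; adjust as needed.
WEIGHT_EXTENSIONS = (".safetensors", ".bin", ".pt", ".gguf")


def group_weight_files(filenames):
    """Group repository file names by weight-file extension."""
    groups = defaultdict(list)
    for name in filenames:
        suffix = PurePosixPath(name).suffix
        if suffix in WEIGHT_EXTENSIONS:
            groups[suffix].append(name)
    return dict(groups)


def list_hub_weight_files(model_id, revision=None):
    """Fetch the file listing from the Hub (requires network access)."""
    from huggingface_hub import list_repo_files  # imported lazily

    return group_weight_files(list_repo_files(model_id, revision=revision))


if __name__ == "__main__":
    # Illustrative file names only, not the real contents of any repository.
    example = ["config.json", "tokenizer.json", "model.safetensors"]
    found = group_weight_files(example)
    print(found)
    if ".bin" not in found:
        print("No .bin weights in this listing")
```

If the listing shows only `.safetensors` (or only `.gguf`) files, a code path that insists on `.bin` cannot succeed for that repository.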

Information

Docker
The CLI directly

Tasks

An officially supported command
My own modifications

Reproduction

sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:0.8 --model-id $model --num-shard $num_shard --quantize gptq

Expected behavior

It should run.

text-generation-inference

Source: https://github.com/huggingface/text-generation-inference/issues/466

8 Answers

x6h2sr28 #1

Hi @mayurtikundi12. For this model to work, you need to use the latest version. We are about to release 0.9, in which it should work. @OlivierDehaene (for visibility)

2 months ago

qf9go6mv #2

It does not work in version 1.1.0.

2 months ago

zrfyljdw #3

@chintanckg, add --quantize gptq

2 months ago

4dc9hkyq #4

model=sigmareaver/flan-ul2-4bit-128g-gptq

volume=$PWD/flan-ul2-4bit-128g-gptq-data

docker run --gpus all --shm-size 24g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id $model --max-total-tokens 5024 --max-input-length 4096 --num-shard 4 --max-concurrent-requests 128 --quantize gptq

Output:

Traceback (most recent call last):

 File "/opt/conda/bin/text-generation-server", line 8, in <module>
   sys.exit(app())

 File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 115, in download_weights
   utils.weight_files(model_id, revision, extension)

 File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 101, in weight_files
   pt_filenames = weight_hub_files(model_id, revision, extension=".bin")

 File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 37, in weight_hub_files
   raise EntryNotFoundError(

huggingface_hub.utils._errors.EntryNotFoundError: No .bin weights found for model sigmareaver/flan-ul2-4bit-128g-gptq and revision None.

@OlivierDehaene - still the same issue, please advise.

2 months ago

kuarbcqp #5

I also see this error when I load, by path, a model that contains safetensors and no .bin weights.
TheBloke/Llama-2-7B-Chat-GPTQ
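
For a model loaded by path, a quick look at which weight files are present in the directory can narrow this down. This is a minimal sketch, assuming the weights sit directly in the model directory; `local_weight_files` is a hypothetical helper, not a text-generation-inference function, and the files created in the demo are empty placeholders.

```python
import tempfile
from pathlib import Path


def local_weight_files(model_dir, extension):
    """Return sorted weight file names in model_dir ending with extension."""
    directory = Path(model_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"{model_dir} is not a directory")
    return sorted(p.name for p in directory.glob(f"*{extension}"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a directory that only has safetensors weights.
        (Path(tmp) / "model.safetensors").touch()
        (Path(tmp) / "config.json").touch()
        print(local_weight_files(tmp, ".safetensors"))
        print(local_weight_files(tmp, ".bin"))
```

An empty result for `.bin` alongside a non-empty result for `.safetensors` would match the situation described in this answer.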

2 months ago

zour9fqk #6

In text-generation-inference/server/text_generation_server/cli.py at line 156, ".bin" should not be hardcoded. I think this may be what causes the issue, because it triggers the following code block and raises the error:

text-generation-inference/server/text_generation_server/utils/hub.py
Lines 95 to 99 in [96a982a](https://github.com/huggingface/text-generation-inference/commit/96a982ad8fc232479384476b1596a880697cc1d0)

    try:
        filenames = weight_hub_files(model_id, revision, extension)
    except EntryNotFoundError as e:
        if extension != ".safetensors":
            raise e
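
The control flow quoted above can be sketched in isolation. This is a simplified stand-in, not the library's code: `EntryNotFoundError` is redefined locally, `FAKE_HUB` and the repository names in it are invented, and `weight_hub_files` here only filters an in-memory list. Only the `try`/`except` shape and the follow-up `.bin` lookup seen in the traceback come from this thread.

```python
class EntryNotFoundError(Exception):
    """Local stand-in for huggingface_hub's exception of the same name."""


# Invented listings used purely for illustration.
FAKE_HUB = {
    "example/safetensors-only": ["config.json", "model.safetensors"],
    "example/bin-only": ["config.json", "pytorch_model.bin"],
    "example/gguf-only": ["config.json", "model.gguf"],
}


def weight_hub_files(model_id, revision=None, extension=".safetensors"):
    """Stand-in lookup: filter the fake listing by extension."""
    names = [f for f in FAKE_HUB[model_id] if f.endswith(extension)]
    if not names:
        raise EntryNotFoundError(
            f"No {extension} weights found for model {model_id} "
            f"and revision {revision}."
        )
    return names


def weight_files(model_id, revision=None, extension=".safetensors"):
    """Try the requested extension; fall back to .bin only for safetensors."""
    try:
        return weight_hub_files(model_id, revision, extension)
    except EntryNotFoundError as e:
        if extension != ".safetensors":
            raise e
        # Fallback seen in the traceback: look for .bin weights instead.
        return weight_hub_files(model_id, revision, extension=".bin")


if __name__ == "__main__":
    print(weight_files("example/safetensors-only"))
    print(weight_files("example/bin-only"))
    try:
        # A hardcoded ".bin" request fails on a safetensors-only repository.
        weight_files("example/safetensors-only", extension=".bin")
    except EntryNotFoundError as err:
        print(err)
```

In this sketch, a repository that has neither format also ends with a "No .bin weights found" message even though `.safetensors` was requested first, which has the same shape as the error reported in this thread.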

2 months ago

1zmg4dgp #7

Same issue with a fine-tuned .gguf model. Are there any updates?

2 months ago

bsxbgnwa #8

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

2 months ago