vLLM fails to load local model files when run from Docker

lmyy7pcs · asked 2 months ago in Docker

I am trying to run vLLM from the official Docker image, following the official tutorial. My model files are stored under /home/appuser/repo/models/Qwen-14B-Chat-AWQ, and I started the container with:
docker run --gpus all \
  -v /home/appuser/repo/models:/root/.cache/huggingface \
  -p 8800:8000 \
  --ipc=host vllm/vllm-openai:latest \
  --model Qwen-14B-Chat-AWQ \
  --quantization awq --tensor-parallel-size 2
However, vLLM reports that it cannot load the model files:

OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like Qwen-14B-Chat-AWQ is not the path to a directory containing a file named config.json.

I then looked inside the image:

docker run --rm -it -v /home/appuser/repo/models:/root/.cache/huggingface --entrypoint bash  vllm/vllm-openai:latest
ls /root/.cache/huggingface/   
Qwen-14B-Chat-AWQ

The model folder is clearly there.
Can anyone tell me how to mount local model files into the vLLM Docker image?

qlzsbp2j 1#
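A bare name passed to --model is treated as a Hugging Face Hub repo ID, not as a folder inside the mounted cache directory. The hub cache uses its own layout, so vLLM would look for something like (a sketch; <hash> stands for the snapshot commit):

/root/.cache/huggingface/hub/models--Qwen--Qwen-14B-Chat-AWQ/snapshots/<hash>/config.json

A plain Qwen-14B-Chat-AWQ directory at the cache root never matches that layout, and with no network access the fallback download fails as well, which is exactly the OSError you see. The simplest fix is to mount the model directory at an arbitrary path in the container and pass that path to --model; the offline variables keep transformers and datasets from trying to reach huggingface.co at startup: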

docker run --gpus all \
  -v /home/appuser/repo/models/Qwen-14B-Chat-AWQ/:/mnt/model/ \
  -p 8800:8000 \
  --env "TRANSFORMERS_OFFLINE=1" \
  --env "HF_DATASETS_OFFLINE=1" \
  --ipc=host vllm/vllm-openai:latest \
  --model="/mnt/model/" \
  --quantization awq --tensor-parallel-size 2
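Once the container is up, you can sanity-check the server through its OpenAI-compatible API. Note that the served model name defaults to the --model value (here /mnt/model/) unless you override it with --served-model-name:

curl http://localhost:8800/v1/models

curl http://localhost:8800/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "/mnt/model/", "messages": [{"role": "user", "content": "Hello"}]}'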
