System Info
1x L40 node running on Runpod with the latest huggingface/text-generation-inference:latest Docker image.
Command: --model-id HuggingFaceM4/idefics2-8b --port 8080 --max-input-length 3000 --max-total-tokens 4096 --max-batch-prefill-tokens 4096 --speculate 3 --lora-adapters orionsoftware/rater-adapter-v0.0.1
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
I am trying to deploy an idefics2 LoRA using the huggingface/text-generation-inference:latest Docker image.
The command I am running is --model-id HuggingFaceM4/idefics2-8b --port 8080 --max-input-length 3000 --max-total-tokens 4096 --max-batch-prefill-tokens 4096 --speculate 3 --lora-adapters orionsoftware/rater-adapter-v0.0.1.
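For reference, a sketch of how these flags might be assembled into a full docker run. The GPU flag, shared-memory size, volume mount, registry prefix (ghcr.io), and the HF_TOKEN environment variable are assumptions for illustration, not taken from this report:

```shell
# Hypothetical full invocation; adjust the volume path and token variable
# to your environment before running.
docker run --gpus all --shm-size 1g -p 8080:8080 \
  -v "$PWD/data:/data" \
  -e HF_TOKEN="$HF_TOKEN" \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id HuggingFaceM4/idefics2-8b \
  --port 8080 \
  --max-input-length 3000 \
  --max-total-tokens 4096 \
  --max-batch-prefill-tokens 4096 \
  --speculate 3 \
  --lora-adapters orionsoftware/rater-adapter-v0.0.1
```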
I also have a valid HF token with access to orionsoftware/rater-adapter-v0.0.1.
Without the --lora-adapters orionsoftware/rater-adapter-v0.0.1 part it runs fine, but as soon as I add the LoRA I hit this error:
2024-08-07T14:53:12.382183Z  INFO text_generation_launcher: Loading adapter weights into model: orionsoftware/rater-adapter-v0.0.1
2024-08-07T14:53:12.526533Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 109, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 274, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 229, in serve_inner
    model = get_model_with_lora_adapters(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 1216, in get_model_with_lora_adapters
    1 if layer_name == "lm_head" else len(model.model.model.layers)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'Idefics2ForConditionalGeneration' object has no attribute 'model'
This is running on a 1x L40 node on Runpod. orionsoftware/rater-adapter-v0.0.1 was trained with transformers and Trainer.
I am curious what I am doing wrong. Unfortunately, my weak Python skills keep me from debugging this any further.
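A minimal sketch of why this lookup fails, using plain classes as stand-ins (the module layouts below are illustrative assumptions, not the actual transformers/TGI code): the adapter loader in the traceback reads model.model.model.layers, which works when the decoder stack is reachable at .model.layers, but raises the same AttributeError when a multimodal wrapper keeps its text decoder under a different attribute name.

```python
class DecoderStack:
    def __init__(self):
        # stand-ins for the transformer decoder layers
        self.layers = [object() for _ in range(2)]

class CausalLM:
    """Text-only layout: decoder stack is reachable at .model.layers."""
    def __init__(self):
        self.model = DecoderStack()

class MultimodalWrapper:
    """Hypothetical Idefics2-style wrapper: decoder sits under another name."""
    def __init__(self):
        self.text_model = DecoderStack()

lm = CausalLM()
print(len(lm.model.layers))  # the shape of lookup TGI performs; prints 2

mm = MultimodalWrapper()
try:
    mm.model.layers  # same lookup -> AttributeError, matching the traceback
except AttributeError as e:
    print(e)  # 'MultimodalWrapper' object has no attribute 'model'
```

This suggests the adapter-loading path assumes a text-only causal-LM module layout that the Idefics2 wrapper does not share.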
Expected behavior
The model is expected to serve correctly, without errors.
EDIT: fixed the error logs