Feature request
Add support for the PhiMoEForCausalLM architecture, as used by microsoft/Phi-3.5-MoE-instruct.
Motivation
Loading the model currently fails with the following error:
2024-08-25 21:25:51.891 | INFO | text_generation_server.utils.import_utils:<module>:75 - Detected system cuda
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 118, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 231, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 1064, in get_model
raise NotImplementedError("sharded is not supported for AutoModel")
NotImplementedError: sharded is not supported for AutoModel
rank=3
2024-08-25T21:25:56.550031Z ERROR text_generation_launcher: Shard 3 failed to start
2024-08-25T21:25:56.550058Z INFO text_generation_launcher: Shutting down shards
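To illustrate what the traceback suggests: the architecture does not appear to be recognized, so loading ends up on a generic AutoModel path, and that path refuses to run sharded (the log shows shard 3 failing). The snippet below is only a toy sketch of that behavior. `SUPPORTED_ARCHITECTURES` and `get_model_sketch` are hypothetical names made up for illustration and are not the actual text_generation_server code; only the error message is taken from the log above.

```python
# Hypothetical, simplified sketch of the failure mode seen in the traceback.
# This is NOT the real text_generation_server implementation.

# Illustrative subset only; the real list of supported architectures is an assumption here.
SUPPORTED_ARCHITECTURES = {"LlamaForCausalLM", "MixtralForCausalLM"}


def get_model_sketch(architecture: str, sharded: bool) -> str:
    """Return a label for the loading path that would be chosen."""
    if architecture in SUPPORTED_ARCHITECTURES:
        # Architectures with a dedicated implementation can be sharded.
        return "dedicated implementation"
    # Unrecognized architecture: only the generic fallback remains,
    # and it cannot be split across several shards.
    if sharded:
        raise NotImplementedError("sharded is not supported for AutoModel")
    return "generic AutoModel fallback"


try:
    get_model_sketch("PhiMoEForCausalLM", sharded=True)
except NotImplementedError as err:
    print(f"NotImplementedError: {err}")
```

Under this reading, the request amounts to giving PhiMoEForCausalLM its own entry on the dedicated path so that multi-shard launches no longer fall through to the fallback.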
Your contribution
I can test any PR.
1 answer
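Thanks for reporting this issue @maziyarpanahi 👍
We don't have much spare bandwidth at the moment, but we may prioritize adding this model.
Also, note that upvotes or similar reactions on your issue signal demand for a model, and they help us decide what to prioritize :)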