DeepSpeed-MII: server crashed for some reason, unable to proceed

h22fl7wq  posted 2 months ago  in  Other

The error still occurs after running the following command.

python -m mii.entrypoints.openai_api_server \
    --model "/logs/llama-2-70b-chat/" \
    --port 8000 \
    --host 0.0.0.0 \
    --tensor-parallel 2

Error message:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/mii/entrypoints/openai_api_server.py", line 506, in <module>
    mii.serve(app_settings.model_id,
  File "/usr/local/lib/python3.10/dist-packages/mii/api.py", line 155, in serve
    import_score_file(mii_config.deployment_name, DeploymentType.LOCAL).init()
  File "/tmp/mii_cache/deepspeed-mii/score.py", line 33, in init
    mii.backend.MIIServer(mii_config)
  File "/usr/local/lib/python3.10/dist-packages/mii/backend/server.py", line 47, in __init__
    self._wait_until_server_is_live(processes,
  File "/usr/local/lib/python3.10/dist-packages/mii/backend/server.py", line 62, in _wait_until_server_is_live
    raise RuntimeError(
RuntimeError: server crashed for some reason, unable to proceed

omvjsjqw  #1

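Hi @Archmilio, can you try running the model in a pipeline? I suspect the server is crashing while the model loads, but because the server runs as a separate process, the actual error is not shown: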

import mii
pipe = mii.pipeline("/logs/llama-2-70b-chat/", tensor_parallel=2)
print(pipe("DeepSpeed is"))

Run this example with deepspeed --num_gpus 2 example.py.
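
As a rough illustration of why the real error can go missing, here is a generic sketch, not MII's actual implementation. A parent process that only watches whether a child is alive sees a crash but not its cause, unless the child's output is captured. The helper name run_and_capture and the simulated MemoryError are hypothetical stand-ins chosen for this example.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd as a child process; return (exit code, combined stdout/stderr)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the child's traceback is kept
        text=True,
    )
    return result.returncode, result.stdout


# Hypothetical stand-in for a server process that fails while loading a model.
child = [
    sys.executable,
    "-c",
    "raise MemoryError('simulated failure while loading model')",
]

code, output = run_and_capture(child)
if code != 0:
    # Without the captured output, all the parent would know is "child crashed".
    print(f"child exited with code {code}; its output was:")
    print(output)
```

Running the model in the same process, as the pipeline example does, should have a similar effect: the underlying exception is raised where you can see it.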
