Xinference QUESTION: How to deploy as an extensible Docker image

p4tfgftt · posted 5 months ago in Docker

Follow (0) | Answers (5) | Views (126)

I am now preparing to package Xinference as a horizontally scalable Docker image based on bge-reranker-large. How can I define the same model_id at startup on every replica, so that the model can be called remotely through a load balancer, turning this into a pure GPU service like Triton?

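One way to realize this is to run several identical containers, each starting a local Xinference server and launching bge-reranker-large under the same model_uid, with a load balancer in front. The fragment below is a sketch only; the image tag, ports, startup delay, and launch sequence are assumptions to verify against the Xinference documentation:

```yaml
# Hypothetical docker-compose fragment: two identical GPU replicas, each
# serving bge-reranker-large under the shared model_uid "my_model".
services:
  reranker-0:
    image: xprobe/xinference:latest        # assumed image tag
    command: >
      bash -c "xinference-local -H 0.0.0.0 -p 9997 &
      sleep 20;
      xinference launch -u my_model -n bge-reranker-large -t rerank;
      wait"
    ports: ["9901:9997"]
  reranker-1:
    image: xprobe/xinference:latest
    command: >
      bash -c "xinference-local -H 0.0.0.0 -p 9997 &
      sleep 20;
      xinference launch -u my_model -n bge-reranker-large -t rerank;
      wait"
    ports: ["9902:9997"]
```

An nginx or HAProxy upstream over ports 9901/9902 would then expose a single endpoint, and because every replica registers the same model_uid, any backend can serve any request.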


zzwlnbp81#

Hi @Loovelj. You can specify a model_uid when launching the model:

xinference launch -u <your_specific_model_uid> -n bge-reranker-large -t rerank
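Once every replica is launched with the same model_uid (say, my_model), a remote caller only needs that UID plus the load balancer's address. The sketch below builds the request a client would send; the /v1/rerank path and payload shape are assumptions based on Xinference's OpenAI-style HTTP API, so verify them against your installed version:

```python
import json

def build_rerank_request(endpoint, model_uid, query, documents):
    """Build the URL and JSON body for a rerank call to one backend."""
    url = f"{endpoint.rstrip('/')}/v1/rerank"
    body = json.dumps({"model": model_uid, "query": query, "documents": documents})
    return url, body

# "my-load-balancer" is a hypothetical address fronting the replicas.
url, body = build_rerank_request(
    "http://my-load-balancer:9997",
    "my_model",
    "what does bge-reranker-large do?",
    ["It scores query-document relevance.", "Unrelated text."],
)
```

The resulting request could then be sent with any HTTP client, e.g. `requests.post(url, data=body, headers={"Content-Type": "application/json"})`.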

bz4sfanl2#

Hi @Loovelj. If you hit an error when launching the model, try the following:

  1. Make sure your model files (my_model) have been uploaded to the correct path.
  2. Check that the model name (bge-reranker-large) is correct.
  3. Check that the model type (rerank) is correct.
  4. Check that the model UID (my_model) is correct.
  5. Make sure your network connection is working and not blocked by a firewall or proxy server.

If the problem persists, check the logs for more details and try relaunching the model.
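Step 5 above can be narrowed down quickly: a `ConnectionRefusedError: [Errno 111]` during launch means nothing is listening at the target address at all. A minimal probe (assuming the default Xinference supervisor port, 9997) reproduces the same check without the full CLI:

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError, timeouts, etc.
        return False

# If this prints False, start the server first (e.g. xinference-local)
# before running `xinference launch`.
print(can_connect("127.0.0.1", 9997))
```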


1sbrub3j3#

Hi @Loovelj. You can specify the model_uid parameter when launching the model:

xinference launch -u <your_specific_model_uid> -n bge-reranker-large -t rerank

Thanks, I already knew that, but it raises an error. Can you help me resolve it?

```
xinference launch --model-name bge-reranker-large --model-type rerank --model-uid my_model
2023-12-13 20:27:00,171 - modelscope - INFO - PyTorch version 2.1.1 Found.
2023-12-13 20:27:00,173 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2023-12-13 20:27:02,440 - modelscope - INFO - Loading done! Current index file version is 1.10.0, with md5 4b2f06948d8e1e0227d48712befb1bbb and a total number of 946 components indexed
Traceback (most recent call last):
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connectionpool.py", line 416, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1285, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1331, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/root/anaconda3/envs/bge
```


9gm1akwq4#

Judging from the error log, the connection is being refused. Please check the following:

  1. Make sure your model server has actually started and is listening on the correct port. You can launch the model using model_uid as the model ID.
  2. Check that your Xinference endpoint is correct. You can try specifying the endpoint with the -e option.
  3. Make sure your network connection is working and no firewall or other security settings are blocking it.

If the problem persists, please provide more information about your environment and configuration so we can help you resolve it.
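Coming back to the original question: once several replicas all expose the same model_uid, the simplest client-side load-balancing scheme is round-robin over their endpoints. A minimal sketch (the endpoint addresses are illustrative; in production an nginx or HAProxy layer would typically do this instead):

```python
from itertools import cycle

class RoundRobinEndpoints:
    """Rotate requests across replicas launched with the same model_uid."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("need at least one endpoint")
        self._cycle = cycle(endpoints)

    def next_endpoint(self):
        """Return the next backend to send a request to."""
        return next(self._cycle)

lb = RoundRobinEndpoints(["http://gpu-node-0:9997", "http://gpu-node-1:9997"])
# Each call alternates between the two backends.
```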
