Xinference QUESTION: How to deploy as an extensible Docker image

p4tfgftt · posted 5 months ago in Docker

Follow (0) | Answers (5) | Views (126)

I am now preparing to package Xinference as a horizontally scalable Docker image based on bge-reranker-large. How can I define the same model_id at startup on every replica, so that the model can be called remotely through a load balancer, turning this into a pure GPU service like Triton?

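One way to realize this is to run several identical containers, each starting a local Xinference server and launching bge-reranker-large under the same model_uid, with a load balancer in front. The fragment below is a sketch only; the image tag, ports, startup delay, and launch sequence are assumptions to verify against the Xinference documentation:

```yaml
# Hypothetical docker-compose fragment: two identical GPU replicas, each
# serving bge-reranker-large under the shared model_uid "my_model".
services:
  reranker-0:
    image: xprobe/xinference:latest        # assumed image tag
    command: >
      bash -c "xinference-local -H 0.0.0.0 -p 9997 &
      sleep 20;
      xinference launch -u my_model -n bge-reranker-large -t rerank;
      wait"
    ports: ["9901:9997"]
  reranker-1:
    image: xprobe/xinference:latest
    command: >
      bash -c "xinference-local -H 0.0.0.0 -p 9997 &
      sleep 20;
      xinference launch -u my_model -n bge-reranker-large -t rerank;
      wait"
    ports: ["9902:9997"]
```

An nginx or HAProxy upstream over ports 9901/9902 would then expose a single endpoint, and because every replica registers the same model_uid, any backend can serve any request.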


zzwlnbp81#

Hi @Loovelj. You can specify a model_uid when launching the model:

xinference launch -u <your_specific_model_uid> -n bge-reranker-large -t rerank
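Once every replica is launched with the same model_uid (say, my_model), a remote caller only needs that UID plus the load balancer's address. The sketch below builds the request a client would send; the /v1/rerank path and payload shape are assumptions based on Xinference's OpenAI-style HTTP API, so verify them against your installed version:

```python
import json

def build_rerank_request(endpoint, model_uid, query, documents):
    """Build the URL and JSON body for a rerank call to one backend."""
    url = f"{endpoint.rstrip('/')}/v1/rerank"
    body = json.dumps({"model": model_uid, "query": query, "documents": documents})
    return url, body

# "my-load-balancer" is a hypothetical address fronting the replicas.
url, body = build_rerank_request(
    "http://my-load-balancer:9997",
    "my_model",
    "what does bge-reranker-large do?",
    ["It scores query-document relevance.", "Unrelated text."],
)
```

The resulting request could then be sent with any HTTP client, e.g. `requests.post(url, data=body, headers={"Content-Type": "application/json"})`.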

bz4sfanl2#

Hi @Loovelj. If you hit an error when launching the model, try the following:

  1. Make sure your model files (my_model) have been uploaded to the correct path.
  2. Check that the model name (bge-reranker-large) is correct.
  3. Check that the model type (rerank) is correct.
  4. Check that the model UID (my_model) is correct.
  5. Make sure your network connection is working and not blocked by a firewall or proxy server.

If the problem persists, check the logs for more details and try relaunching the model.
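Step 5 above can be narrowed down quickly: a `ConnectionRefusedError: [Errno 111]` during launch means nothing is listening at the target address at all. A minimal probe (assuming the default Xinference supervisor port, 9997) reproduces the same check without the full CLI:

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError, timeouts, etc.
        return False

# If this prints False, start the server first (e.g. xinference-local)
# before running `xinference launch`.
print(can_connect("127.0.0.1", 9997))
```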


1sbrub3j3#

Hi @Loovelj. You can specify the model_uid parameter when launching the model:

xinference launch -u <your_specific_model_uid> -n bge-reranker-large -t rerank

Thanks, I already knew that, but it raises an error. Can you help me resolve it?

```
xinference launch --model-name bge-reranker-large --model-type rerank --model-uid my_model
2023-12-13 20:27:00,171 - modelscope - INFO - PyTorch version 2.1.1 Found.
2023-12-13 20:27:00,173 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2023-12-13 20:27:02,440 - modelscope - INFO - Loading done! Current index file version is 1.10.0, with md5 4b2f06948d8e1e0227d48712befb1bbb and a total number of 946 components indexed
Traceback (most recent call last):
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/root/anaconda3/envs/bge/lib/python3.9/site-packages/urllib3/connectionpool.py", line 416, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1285, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1331, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/root/anaconda3/envs/bge/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/root/anaconda3/envs/bge
```


9gm1akwq4#

Judging from the error log, the connection is being refused. Please check the following:

  1. Make sure your model server has actually started and is listening on the correct port. You can launch the model using model_uid as the model ID.
  2. Check that your Xinference endpoint is correct. You can try specifying the endpoint with the -e option.
  3. Make sure your network connection is working and no firewall or other security settings are blocking it.

If the problem persists, please provide more information about your environment and configuration so we can help you resolve it.
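Coming back to the original question: once several replicas all expose the same model_uid, the simplest client-side load-balancing scheme is round-robin over their endpoints. A minimal sketch (the endpoint addresses are illustrative; in production an nginx or HAProxy layer would typically do this instead):

```python
from itertools import cycle

class RoundRobinEndpoints:
    """Rotate requests across replicas launched with the same model_uid."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("need at least one endpoint")
        self._cycle = cycle(endpoints)

    def next_endpoint(self):
        """Return the next backend to send a request to."""
        return next(self._cycle)

lb = RoundRobinEndpoints(["http://gpu-node-0:9997", "http://gpu-node-1:9997"])
# Each call alternates between the two backends.
```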
