Problem Description
- Cannot upload files: the first upload always fails with "milvus insert failed, please try again later". The second and third files upload fine, but once the error occurs the knowledge base can no longer be deleted (it fails with "request failed"), and the failed file cannot be deleted either. Tried docx, ppt, and txt.
- Q&A does not work: every question fails with "please refresh and retry". Tested with ollama and a local Qwen model, so it is unlikely to be a model issue; judging from sanic.log, the problem appears to be in ES.
- Expected behavior: the bug is fixed.
- Running environment:
- OS: CentOS 8
- NVIDIA Driver: 535
- CUDA: 11.8
- docker: 2.24.0
- docker-compose: 2.27.0
- NVIDIA GPU: 4090
- NVIDIA GPU Memory: 24GB
- QAnything logs
Solution
Based on the description and the logs, the likely cause is an Elasticsearch connection timeout. Try the following:
- Check whether the Elasticsearch service is running normally; if it is not, repair it first.
- Increase the Elasticsearch client's request timeout, for example to 60 seconds. Note that this is a client-side setting (e.g. the request_timeout parameter of the Python Elasticsearch client); it cannot be changed by editing a transport log line such as
INFO:elastic_transport.transport:GET http://es-container-local:9200/ [status:200 duration:0.002s]
which only reports the observed request duration.
- If the problem persists, check the Elasticsearch logs for more detail, or ask the Elasticsearch community for help.
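To illustrate the idea of a client-side timeout, here is a minimal sketch using only the standard library's `http.client` (the value 60 seconds and the helper name are assumptions for illustration; with the real elasticsearch-py client you would instead pass `request_timeout=60` to the `Elasticsearch(...)` constructor):

```python
import http.client

def make_es_connection(host="es-container-local", port=9200, timeout=60):
    # Hypothetical helper for this sketch. The timeout (seconds) bounds
    # connect/read on the client side; it is not something the server's
    # "duration" log field controls or reports back.
    return http.client.HTTPConnection(host, port, timeout=timeout)

conn = make_es_connection()
print(conn.timeout)  # → 60
```

Constructing the connection does not open a socket, so this can be configured before the service is reachable.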
INFO:debug_logger:now inser_file 离线部署centos文档.docx
INFO:debug_logger:Inserting into Milvus...
INFO:debug_logger:离线部署centos文档.docx (insert count: 170, delete count: 0, upsert count: 0, timestamp: 450132533227028483, success count: 170, err count: 0)
INFO:debug_logger:now inser_file for es: 离线部署centos文档.docx
INFO:debug_logger:Inserting into Es ...
INFO:debug_logger:##ES## - Index zzp++kb6fda41a1c95f4605ab812263d921740a already exists. Skipping creation.
INFO:debug_logger:list_docs zzp
INFO:debug_logger:kb_id: KB7e4842c2615c40649e78754c8e39203b
INFO:debug_logger:delete_knowledge_base zzp
INFO:debug_logger:check_kb_exist [('KB7e4842c2615c40649e78754c8e39203b',)]
INFO:debug_logger:collection zzp exists
INFO:debug_logger:partitions: ['KB7e4842c2615c40649e78754c8e39203b']
INFO:debug_logger:GET http://es-container-local:9200/zzp%2B%2Bkb7e4842c2615c40649e78754c8e39203b?ignore_unavailable=true [status:200 duration:0.002s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode( has been marked alive after a successful request
INFO:debug_logger:##ES## - success to connect to {'name': 'd9dec8e8aa4a', 'cluster_name': 'docker-cluster', 'cluster_uuid': 'mocOuywHSNa08Yp98Kcy7A', 'version': {'number': '8.11.4', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'da06c53fd49b7e676ccf8a32d6655c5155c16d81', 'build_date': '2024-01-08T10:05:08.438562403Z', 'build_snapshot': False, 'lucene_version': '9.8.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}
INFO:debug_logger:##ES## - success to delete index: ['zzp++kb7e4842c2615c40649e78754c8e39203b']
INFO:debug_logger:##ES## - success delete kb_ids: ['KB7e4842c2615c40649e78754c8e39203b']
INFO:debug_logger:##ES## - success delete kb_ids: ['KB6fda41a1c95f4605ab812263d921740a']
INFO:debug_logger:delete_knowledge_base: ['KB6fda41a1c95f4605ab812263d921740a']
INFO:debug_logger:list_kbs zzp
INFO:debug_logger:all kb infos: [{'kb_id': 'KB6fda41a1c95f4605ab812263d921740a'}]
INFO:debug_logger:PUT http://es-container-local:9200/zzp%2B%2Bkb6fda41a1c95f4605ab812263d921740a/_bulk?refresh=false [status:N/A duration:10.010s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode( has failed for 1 times in a row, putting on 1 second timeout
INFO:debug_logger:离线部署centos文档.docx Error adding texts: Connection timed out
INFO:debug_logger:insert time: 10.025246858596802
INFO:debug_logger:insert_to_milvus: success num=1, failed num=0
INFO:debug_logger:list_docs zzp
INFO:debug_logger:list_docs zzp
INFO:debug_logger:delete_docs zzp
INFO:debug_logger:check_kb_exist [('KB6fda41a1c95f4605ab812263d921740a',)]
INFO:debug_logger:check_file_exist [('cd4eca5871ac4bb7a43fb7ca0058815d',)]
INFO:debug_logger:match milvus_client = <qanything_kernel.connector.database.milvus.milvus_client.MilvusClient object at 0x7f37c10e3f70>
INFO:debug_logger:milvus delete files_id=['cd4eca5871ac4bb7a43fb7ca0058815d']
WARNING:elastic_transport.node_pool:Resurrected node <Urllib3HttpNode( (force=False)
INFO:debug_logger:##ES## - success to connect to {'name': 'd9dec8e8aa4a', 'cluster_name': 'docker-cluster', 'cluster_uuid': 'mocOuywHSNa08Yp98Kcy7A', 'version': {'number': '8.11.4', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'da06c53fd49b7e676ccf8a32d6655c5155c16d81', 'build_date': '2024-01-08T10:05:08.438562403Z', 'build_snapshot': False, 'lucene_version': '9.8.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode( has been marked alive after a successful request
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode( has failed for 1 times in a row, putting on 1 second timeout
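A reading note on the log lines above: the `zzp%2B%2Bkb…` paths in the GET/PUT requests are just the percent-encoded form of the index name `zzp++kb…` (each `+` becomes `%2B` in the URL). A minimal sketch of that encoding with the standard library:

```python
from urllib.parse import quote, unquote

# Index name as it appears in the debug log:
index_name = "zzp++kb6fda41a1c95f4605ab812263d921740a"

# '+' is not URL-safe, so it is percent-encoded as %2B in request paths.
encoded = quote(index_name, safe="")
print(encoded)  # → zzp%2B%2Bkb6fda41a1c95f4605ab812263d921740a
print(unquote(encoded) == index_name)  # → True
```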
This is an error log from a Sanic application, involving Milvus database operations. While answering a question, the program hit a 503 error, meaning the server temporarily could not handle the request. This is likely caused by a problem with the Elasticsearch service. To resolve it, try the following:
- Check whether the Elasticsearch service is running; start it if it is not.
- Check the Elasticsearch service's logs for possible error messages.
- If Elasticsearch is running but still returns 503, check the cluster's health (e.g. GET /_cluster/health) rather than its timeouts: there is no http.timeout setting in elasticsearch.yml, so a request timeout has to be raised on the client side instead. Restart the Elasticsearch service after any configuration change.
- If the problem persists, consider restarting the whole application, including the Elasticsearch and Milvus services.
According to the error messages, the failure occurs in the search phase executed against Milvus and Elasticsearch. Specifically, Elasticsearch returned a 503 error, meaning the server temporarily could not handle the request.
To resolve this, try the following:
- Check whether the Elasticsearch service is running. You can test the connection through the Elasticsearch API or with curl, for example:
curl http://localhost:9200
If this returns Elasticsearch's basic information, the service is running. If it returns an error, check the Elasticsearch configuration and logs to locate the problem.
- Check whether the Milvus service is running. Note that port 19530 is Milvus's gRPC port, so a plain curl against it will not return collection information; Milvus standalone instead exposes an HTTP health endpoint on port 9091, for example:
curl http://localhost:9091/healthz
If this reports the service as healthy, Milvus is running. If not, check the Milvus configuration and logs to locate the problem.
- If neither check resolves the problem, try restarting Elasticsearch and Milvus. Since this deployment runs them as Docker containers, restart the containers rather than systemd services, for example:
docker restart es-container-local
and likewise the Milvus container (find its name with docker ps).
- If the problem still persists, consider upgrading Elasticsearch and Milvus to the latest versions, or consult the official documentation and community for help.
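The transport log later in this thread shows the client retrying the 503 responses ("attempt 0 of 3" through "attempt 2 of 3"). A minimal sketch of that status-based retry decision (the status set {429, 502, 503, 504} is an assumption based on common HTTP retry conventions, not taken from the elastic_transport source):

```python
# Statuses conventionally treated as transient; an assumption for this sketch.
RETRYABLE_STATUSES = frozenset({429, 502, 503, 504})

def should_retry(status: int, attempt: int, max_retries: int = 3) -> bool:
    """Mirror the behaviour visible in the log: 'Retrying request after
    non-successful status 503 (attempt 0 of 3)' stops after max_retries."""
    return status in RETRYABLE_STATUSES and attempt < max_retries

print(should_retry(503, 0))  # → True
print(should_retry(503, 3))  # → False
print(should_retry(404, 0))  # → False
```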
3 Answers
aor9mmx1 #1
From the log, the problem lies in the connection to Elasticsearch. Specifically, the bulk insert request timed out while inserting the file, and later searches returned 503. This may be caused by Elasticsearch's configuration or by a network problem.
First, check that the Elasticsearch configuration is correct. Make sure the Elasticsearch service is running and that the port matches the one in the log. If unsure, consult the official Elasticsearch documentation for the correct configuration.
Second, check the network connection. Make sure the network between Milvus and Elasticsearch is open and that no firewall or other security policy blocks communication between them.
If none of the above resolves the problem, check the Elasticsearch logs for more detail. You can also try restarting the Elasticsearch service to see whether that helps.
[2024-05-31 09:35:57 +0800] [861] [ERROR] Exception occurred while handling uri: 'http://localhost:8777/api/local_doc_qa/delete_files'
Traceback (most recent call last):
  File "handle_request", line 97, in handle_request
  File "/workspace/qanything_local/qanything_kernel/qanything_server/handler.py", line 247, in delete_docs
    milvus_kb.delete_files(file_ids)
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/milvus_client.py", line 288, in delete_files
    es_records = self.client.search(files_id, field='file_id')
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/es_client.py", line 182, in search
    response = self.client.search(
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
    return api(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/__init__.py", line 3836, in search
    return self.perform_request(  # type: ignore[return-value]
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/_base.py", line 320, in perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.ApiError: ApiError(503, 'search_phase_execution_exception', None)
INFO:debug_logger:milvus group number: 1
INFO:elastic_transport.node_pool:Resurrected node <Urllib3HttpNode(http://es-container-local:9200)> (force=False)
INFO:elastic_transport.transport:HEAD http://es-container-local:9200/zzp%2B%2Bkb6fda41a1c95f4605ab812263d921740a [status:200 duration:0.001s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode(http://es-container-local:9200)> has been marked alive after a successful request
INFO:elastic_transport.transport:POST http://es-container-local:9200 [status:503 duration:0.002s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode(http://es-container-local:9200)> has failed for 1 times in a row, putting on 1 second timeout
WARNING:elastic_transport.transport:Retrying request after non-successful status 503 (attempt 0 of 3)
INFO:elastic_transport.node_pool:Resurrected node <Urllib3HttpNode(http://es-container-local:9200)> (force=False)
INFO:elastic_transport.transport:POST http://es-container-local:9200/zzp%2B%2Bkb6fda41a1c95f4605ab812263d921740a/_search [status:503 duration:0.001s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode(http://es-container-local:9200)> has failed for 2 times in a row, putting on 2 second timeout
WARNING:elastic_transport.transport:Retrying request after non-successful status 503 (attempt 1 of 3)
INFO:elastic_transport.node_pool:Resurrected node <Urllib3HttpNode(http://es-container-local:9200)> (force=False)
INFO:elastic_transport.transport:POST http://es-container-local:9200/zzp%2B%2Bkb6fda41a1c95f4605ab812263d921740a/_search [status:503 duration:0.001s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode(http://es-container-local:9200)> has failed for 3 times in a row, putting on 4 second timeout
WARNING:elastic_transport.transport:Retrying request after non-successful status 503 (attempt 2 of 3)
INFO:elastic_transport.node_pool:Resurrected node <Urllib3HttpNode(http://es-container-local:9200)> (force=False)
INFO:elastic_transport.transport:POST http://es-container-local:9200 [status:503 duration:0.001s]
WARNING:elastic_transport.node_pool:Node <Urllib3HttpNode(http://es-container-local:9200)> has failed for 4 times in a row, putting on 8 second timeout
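The node_pool warnings above show the client's exponential backoff: a node is benched for 1, 2, 4, then 8 seconds after consecutive failures. A minimal sketch of that schedule (the doubling-from-one-second rule is inferred from the log, not taken from the elastic_transport source):

```python
def backoff_schedule(failures: int) -> list[int]:
    """Bench time (seconds) after each of the first `failures` consecutive
    failures: 1 s, 2 s, 4 s, 8 s, ... doubling each time, as seen in the log."""
    return [2 ** n for n in range(failures)]

print(backoff_schedule(4))  # → [1, 2, 4, 8]
```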
[2024-05-31 09:36:07 +0800] [861] [ERROR] Exception occurred while handling uri: 'http://localhost:8777/api/local_doc_qa/local_doc_chat'
Traceback (most recent call last):
  File "handle_request", line 132, in handle_request
  File "/usr/local/lib/python3.10/dist-packages/sanic/response/types.py", line 547, in stream
    await self.streaming_fn(self)
  File "/workspace/qanything_local/qanything_kernel/qanything_server/handler.py", line 355, in generate_answer
    for resp, next_history in local_doc_qa.get_knowledge_based_answer(
  File "/workspace/qanything_local/qanything_kernel/core/local_doc_qa.py", line 225, in get_knowledge_based_answer
    source_documents = self.get_source_documents(retrieval_queries, milvus_kb)
  File "/workspace/qanything_local/qanything_kernel/core/local_doc_qa.py", line 133, in get_source_documents
    batch_result = milvus_kb.search_emb_async(embs=embs, top_k=top_k, queries=queries)
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/milvus_client.py", line 181, in search_emb_async
    return future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", in __init__(self, *args, **kwargs)
  File "/usr/lib/python3.10/concurrent/futures/executors.py", in __call__(self, *args, **kwargs)
  File "/usr/lib/python3.10/concurrent/futures/threads.py", in __init__(self, *args, **kwargs)
  File "/usr/lib/python3.10/concurrent/futures/tasks.py", in run(self, *args, **kwargs)
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/milvus_client.py", line 170, in __search_emb_sync
    es_records = self.client.search(queries)
  File "/usr/lib/python3.10/dist-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
    return api(*args, **kwargs)
  File "/usr/lib/python3.10/dist-packages/elasticsearch/_sync/client/__init__.py", line 3836, in search
    return self.perform_request(  # type: ignore[return-value]
  File "/usr/lib/python3.10/dist-packages/elasticsearch/_sync/client
Steps To Reproduce
Followed the documentation: downloaded the source, ran bash run.sh -c cloud -i 0 to pull the images and start the containers, and the problem above appeared.
Anything else?
Is there an existing issue / discussion for this?
Is there an existing answer for this in FAQ?
Current Behavior
Following the installation doc dated 5.17, deployed with docker. run.sh completed without errors, the container terminal output showed nothing abnormal, and the UI was reachable. Two problems then appeared:
1. Cannot upload files: the first upload always fails with "milvus insert failed, please try again later"; the second and third files upload fine, but once the error occurs the knowledge base cannot be deleted (it fails with "request failed") and the failed file cannot be deleted either. Tried docx, ppt, and txt.
2. Q&A does not work: every question fails with "please refresh and retry".
Tested ollama and a local Qwen, so it is probably not a model issue; judging from sanic.log, the problem appears to be in ES.
离线部署centos文档.docx ERROR: Connection timed out
INFO: insert time: 10.025246858596802
INFO: insert_to_milvus: success num: 1, failed num: 0
At 2024-05-31 09:35:57, an error occurred. The log around it reads:
离线部署centos文档.docx, bbd20c38f53b473babbafe01f5aab6b7, success
离线部署centos文档.docx, bbd20c38f53b473babbafe01f5aab6b7, success
Success init local file 离线部署centos文档.docx
Inserting file into milvus: KB6fda41a1c95f4605ab812263d921740a
match milvus_client = <qanything_kernel.connector.database.milvus.milvus_client.MilvusClient object at 0x7f37c10e3f70>
Document count before second-stage split: 170
Document count after second-stage split: 170
langchain content head: 搭建私有yum仓库,安装gcc gcc-c++,ffmpeg(三步法),nginx(nginx在centos8.
Split time: 0.04621243476867676 170
Embedding count: 11
[2024-05-31 09:35:57 +0800] [861] [ERROR] Exception occurred while handling uri: 'http://localhost:8777/api/local_doc_qa/delete_files'
Traceback (most recent call last):
  File "handle_request", line 97, in handle_request
  File "/workspace/qanything_local/qanything_kernel/qanything_server/handler.py", line 247, in delete_docs
    milvus_kb.delete_files(file_ids)
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/milvus_client.py", line 288, in delete_files
    es_records = self.client.search(files_id, field='file_id')
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/es_client.py", line 182, in search
    response = self.client.search(
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
    return api(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/__init__.py", line 3836, in search
    return self.perform_request(  # type: ignore[return-value]
  File "/usr/local/lib/python3.10/dist-packages/elasticsearch/_sync/client/_base.py", line 320, in perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.ApiError: ApiError(503, 'search_phase_execution_exception', None)
Pulled the images, started the containers, and the problem above appeared.
Anything else?
I ran into this problem too. I traced it to Elasticsearch switching to read-only mode once disk usage exceeds 95%; after cleaning the disk back below 95%, the problem was resolved.
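This matches Elasticsearch's default flood-stage disk watermark (cluster.routing.allocation.disk.watermark.flood_stage, 95%): once crossed, indices are marked read-only via index.blocks.read_only_allow_delete, and after freeing space the block can be cleared with PUT /_all/_settings and {"index.blocks.read_only_allow_delete": null}. A minimal sketch of the disk check (the 95% threshold is the documented default; checking "/" with shutil.disk_usage is only an example path):

```python
import shutil

FLOOD_STAGE = 0.95  # Elasticsearch's default flood-stage watermark

def exceeds_flood_stage(used: int, total: int, threshold: float = FLOOD_STAGE) -> bool:
    """True when disk usage is at or above the watermark that makes
    Elasticsearch mark its indices read-only."""
    return total > 0 and used / total >= threshold

print(exceeds_flood_stage(96, 100))  # → True
print(exceeds_flood_stage(80, 100))  # → False

# Check the volume the ES container's data directory lives on (example path):
usage = shutil.disk_usage("/")
print(exceeds_flood_stage(usage.used, usage.total))
```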
t9eec4r0 #2
I ran into the same problem. From the log, the Elasticsearch node failed 4 times in a row, with the timeout increasing after each failure; the final retry still failed, which produced the error.
As for why the port is not exposed: it may be that network.host in the Elasticsearch configuration file is set to localhost, or that a firewall restricts access. Check the Elasticsearch configuration file and the firewall settings to make sure they allow external access.
njthzxwz #3
I ran into the same problem as well. How can it be solved?