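Bug Description
Running a minimal script that uses HuggingFaceEmbedding together with FAISS causes a segmentation fault. In a fresh environment, installing only llama-index-embeddings-huggingface and faiss-cpu, with nothing else, is enough to trigger the failure.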
$ poetry run python3 example.py
[1] 63240 segmentation fault poetry run python3 example.py
The contents of example.py are as follows:
import faiss
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
print("success")
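As far as I can tell, the specific model passed to HuggingFaceEmbedding makes no difference.
This may be related to #9949, although that issue does not appear to involve FAISS.
Version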
0.10.51
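Steps to Reproduce
Starting from a brand-new directory:
1. Run poetry init, select Python 3.12, and click through the init menu.
2. poetry add llama-index-embeddings-huggingface faiss-cpu
3. poetry run python3 example.py (see the file contents above)
Output: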
$ poetry run python3 example.py
[1] 63995 segmentation fault poetry run python3 example.py
If you remove the import faiss line from example.py, the output is as expected:
$ poetry run python3 example.py
success
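The with/without comparison above can be automated so that a crash in one variant does not take the test harness down with it. The following is a minimal sketch, not part of the original report: run_snippet, VARIANTS, and compare are hypothetical names, and the signal-based classification assumes a POSIX system, where a negative return code means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def run_snippet(source, timeout=600.0):
    """Run Python source in a child interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # POSIX only: the child was terminated by signal number -returncode.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exit code {proc.returncode}"


# The failing script from the report, and the same script without faiss.
WITH_FAISS = (
    "import faiss\n"
    "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n"
    'embed = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")\n'
    'print("success")\n'
)
VARIANTS = {
    "with faiss": WITH_FAISS,
    "without faiss": WITH_FAISS.replace("import faiss\n", ""),
}


def compare(variants=VARIANTS):
    """Return a mapping of variant label to exit description."""
    return {label: run_snippet(src) for label, src in variants.items()}
```

Calling compare() inside the Poetry environment (for example via poetry run python3) would be expected to report a signal for the first variant and "ok" for the second if the behaviour described above reproduces. Further variants, such as a different import order, can be added to VARIANTS to narrow down the trigger.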
Relevant Logs/Tracebacks
With DEBUG logging enabled:
poetry run python3 example.py
DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
INFO:faiss.loader:Loading faiss.
Loading faiss.
INFO:faiss.loader:Successfully loaded faiss.
Successfully loaded faiss.
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: BAAI/bge-small-en-v1.5
Load pretrained SentenceTransformer: BAAI/bge-small-en-v1.5
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): huggingface.co:443
Starting new HTTPS connection (1): huggingface.co:443
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/modules.json HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/modules.json HTTP/11" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/config_sentence_transformers.json HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/config_sentence_transformers.json HTTP/11" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/README.md HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/README.md HTTP/11" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/modules.json HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/modules.json HTTP/11" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/sentence_bert_config.json HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/sentence_bert_config.json HTTP/11" 200 0
DEBUG:urllib3.connectionpool:https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/config.json HTTP/11" 200 0
https://huggingface.co:443 "HEAD /BAAI/bge-small-en-v1.5/resolve/main/config.json HTTP/11" 200 0
[1] 64618 segmentation fault poetry run python3 example.py
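The log stops at the segmentation fault without a Python traceback, because a native crash kills the interpreter before any exception can be raised. A minimal sketch of one way to get more detail is to use the standard-library faulthandler module, which dumps the Python stack on fatal signals. The helper name enable_crash_traceback is hypothetical, and this was not tried in the original thread.

```python
import faulthandler
import sys


def enable_crash_traceback(file=None):
    """Ask CPython to dump Python stack traces if a fatal signal arrives."""
    target = sys.stderr if file is None else file
    # all_threads=True also dumps threads other than the one that crashed.
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


# Intended placement: call this before the imports from example.py, e.g.
#
#   enable_crash_traceback()
#   import faiss
#   from llama_index.embeddings.huggingface import HuggingFaceEmbedding
#   embed = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```

The same handler can be switched on without editing the script by running poetry run python3 -X faulthandler example.py or by setting the PYTHONFAULTHANDLER environment variable. The resulting dump shows which Python frame was active when the crash happened, which may help tell whether the fault occurs inside faiss, torch, or another native extension.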
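3 Answers
gtlvzcf81#
Hey, @gordonhart! 👋 I'm here to help you with any bugs, questions, or ideas. If you need help, feel free to reach out.
To resolve the segmentation fault that occurs when using HuggingFaceEmbedding with FAISS on macOS, you can try explicitly setting device to cpu. This helps avoid GPU-related issues that could cause a segfault. Here is a minimal script that demonstrates this:
import faiss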
from llama_index.legacy.embeddings.huggingface import HuggingFaceEmbedding
# Initialize the HuggingFaceEmbedding with device set to 'cpu'
embedding_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    device="cpu"
)
print("success")
ne5o7dgx2#
@dosu adding device="cpu" to the HuggingFaceEmbedding initialization did not fix the issue.
50pmv0ei3#
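If setting device to "cpu" does not solve the problem, you can try the following steps to resolve the segfault:
1. Check the installed versions of the faiss-cpu, torch, and transformers libraries. The recommended versions are:
   faiss-cpu: 1.7.3
   torch: 1.13.1
   transformers: 4.25.1
   These specific versions can be pinned at install time.
2. Enable the DEBUG log level to get more detailed error messages and tracebacks. This can provide further clues about what is causing the segfault.
If the problem persists, you may need to check whether there are known compatibility issues between FAISS and the transformers library on macOS. One record concerning FAISS on macOS describes a user running into a TypeError while running a FAISS example notebook [1] (https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/discover_llamaindex/document_management/discord_dumps/help_channel_dump_06_02_23.json).
By following these steps, you should be able to identify and resolve the segfault.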
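The two steps in this answer can be sketched with the standard library alone. This is a hedged illustration rather than a verified fix: SUGGESTED simply repeats the versions quoted in the answer, the helper names are hypothetical, and the thread does not confirm that these pins are installable under Python 3.12 or that they resolve the crash.

```python
import logging
import sys
from importlib import metadata

# Versions quoted in the answer above; not verified as a working combination.
SUGGESTED = {"faiss-cpu": "1.7.3", "torch": "1.13.1", "transformers": "4.25.1"}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(suggested=SUGGESTED):
    """Map each distribution name to (installed version, suggested version)."""
    return {name: (installed_version(name), want) for name, want in suggested.items()}


def enable_debug_logging(stream=None):
    """Route all log records at DEBUG level and above to the given stream."""
    logging.basicConfig(
        level=logging.DEBUG,
        stream=sys.stdout if stream is None else stream,
        force=True,  # replace handlers configured earlier (Python 3.8+)
    )


def print_report():
    """Print one line per distribution with installed and suggested versions."""
    for name, (have, want) in compare_versions().items():
        print(f"{name}: installed={have} suggested={want}")
```

Running enable_debug_logging() at the top of example.py, before import faiss, yields log lines in the LEVEL:logger:message form seen in the report. print_report() shows at a glance whether the environment differs from the versions suggested in the answer.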