llama_index [Question]: Can I use multiple collections in Mongo at once?

zujrkrfu  posted 4 months ago in Go

Question Validation

  • I have searched both the documentation and Discord for an answer.

Question

I want to retrieve data from multiple collections in Mongo. This is my current approach:
store = MongoDBAtlasVectorSearch(client, db_name, collection_name, index_name)
index = VectorStoreIndex.from_vector_store(vector_store=store)
query_engine = index.as_query_engine()

km0tfn4u #1

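Yes, you can retrieve data from multiple collections in MongoDB. One way to do this is to extend the MongoDBAtlasVectorSearch class so that it supports multiple collections. Here is an example of how the class could be modified: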

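import os
from typing import Any, Dict, Optional, Sequence, cast

# Note: VectorStore is assumed to be imported from llama_index's vector store types module.

class MongoDBAtlasVectorSearch(VectorStore):
    def __init__(
        self,
        mongodb_client: Optional[Any] = None,
        db_name: str = "default_db",
        # A tuple avoids the shared-mutable-default pitfall of a list literal.
        collection_names: Sequence[str] = ("default_collection",),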
        index_name: str = "default",
        id_key: str = "id",
        embedding_key: str = "embedding",
        text_key: str = "text",
        metadata_key: str = "metadata",
        insert_kwargs: Optional[Dict] = None,
        **kwargs: Any,
    ) -> None:
        import_err_msg = "`pymongo` package not found, please run `pip install pymongo`"
        try:
            from importlib.metadata import version
            from pymongo import MongoClient
            from pymongo.driver_info import DriverInfo
        except ImportError:
            raise ImportError(import_err_msg)

        if mongodb_client is not None:
            self._mongodb_client = cast(MongoClient, mongodb_client)
        else:
            if "MONGO_URI" not in os.environ:
                raise ValueError(
                    "Must specify MONGO_URI via env variable "
                    "if not directly passing in client."
                )
            self._mongodb_client = MongoClient(
                os.environ["MONGO_URI"],
                driver=DriverInfo(name="llama-index", version=version("llama-index")),
            )

        self._collections = {name: self._mongodb_client[db_name][name] for name in collection_names}
        self._current_collection = self._collections[collection_names[0]]
        self._index_name = index_name
        self._embedding_key = embedding_key
        self._id_key = id_key
        self._text_key = text_key
        self._metadata_key = metadata_key
        self._insert_kwargs = insert_kwargs or {}

    def switch_collection(self, collection_name: str) -> None:
        if collection_name in self._collections:
            self._current_collection = self._collections[collection_name]
        else:
            raise ValueError(f"Collection {collection_name} not found.")

    # Modify methods to use self._current_collection instead of self._collection

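This modification lets you switch between collections with the switch_collection method. Note that only one collection is active at a time, so each query still runs against a single collection. You will also need to adjust the class's other methods so that they use self._current_collection instead of self._collection.

[1] https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/vector_stores/llama-index-vector-stores-mongodb/llama_index/vector_stores/mongodb/base.py
[2] https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/vector_stores/mongodb.py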
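To make the switching pattern concrete, here is a minimal, self-contained sketch that runs without MongoDB. Everything in it is hypothetical and for illustration only: MultiCollectionSearcher, make_fake_search, and query_all are not part of llama_index or pymongo, and each "collection" is simulated by a function returning fixed (document, score) pairs. The query_all helper shows one possible way to search every collection and merge the hits by score. It assumes the scores are comparable across collections, which in practice means the same embedding model and similarity metric for every index.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a per-collection vector search:
# takes (query, top_k) and returns (document, score) pairs.
SearchFn = Callable[[str, int], List[Tuple[str, float]]]


class MultiCollectionSearcher:
    """Illustrative only; not part of llama_index or pymongo."""

    def __init__(self, collections: Dict[str, SearchFn]) -> None:
        if not collections:
            raise ValueError("At least one collection is required.")
        self._collections = dict(collections)
        # Default to the first collection, mirroring the modified class.
        self._current_name = next(iter(self._collections))

    def switch_collection(self, collection_name: str) -> None:
        if collection_name not in self._collections:
            raise ValueError(f"Collection {collection_name} not found.")
        self._current_name = collection_name

    def query(self, text: str, top_k: int = 2) -> List[Tuple[str, float]]:
        """Search only the currently selected collection."""
        return self._collections[self._current_name](text, top_k)

    def query_all(self, text: str, top_k: int = 2) -> List[Tuple[str, str, float]]:
        """Search every collection and keep the top_k best hits overall."""
        hits = [
            (name, doc, score)
            for name, search in self._collections.items()
            for doc, score in search(text, top_k)
        ]
        hits.sort(key=lambda hit: hit[2], reverse=True)
        return hits[:top_k]


def make_fake_search(docs: Dict[str, float]) -> SearchFn:
    """Build a fake search function from fixed document scores (demo data)."""

    def search(_query: str, top_k: int) -> List[Tuple[str, float]]:
        ranked = sorted(docs.items(), key=lambda item: item[1], reverse=True)
        return ranked[:top_k]

    return search


searcher = MultiCollectionSearcher(
    {
        "articles": make_fake_search({"a1": 0.91, "a2": 0.40}),
        "manuals": make_fake_search({"m1": 0.75, "m2": 0.88}),
    }
)
first_hits = searcher.query("mongo")  # searches "articles" (the default)
searcher.switch_collection("manuals")
second_hits = searcher.query("mongo")  # searches "manuals"
merged_hits = searcher.query_all("mongo", top_k=3)  # best hits across both
```

With a real store, each fake search function would be replaced by a query against one collection's vector index. Whether merging by raw score is appropriate depends on how the indexes were built.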
