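### Checked other resources
- I added a very descriptive title to this issue.
- I searched the LangChain documentation with the integrated search.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).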
### Example Code
```python
# Imports assumed; the original post omitted them.
import logging
import re

from langchain_community.vectorstores import Chroma

logger = logging.getLogger(__name__)


class VectorStoreCreator:
    """
    A class to create a vector store from documents.

    Methods
    -------
    create_vectorstore(documents, embed_model, collection_name):
        Creates a vector store from a set of documents using the provided embedding model.
    """

    @staticmethod
    def create_vectorstore(documents, embed_model, collection_name):
        """
        Creates a vector store from a set of documents using the provided embedding model.

        This function utilizes the Chroma library to create a vector store, which is a
        data structure that facilitates efficient similarity searches over the document
        embeddings. Optionally, a persistent directory and collection name can be specified
        for storing the vector store on disk.

        Parameters
        ----------
        documents : list
            A list of documents to be embedded and stored.
        embed_model : object
            The embedding model used to convert documents into embeddings.
        collection_name : str
            The name of the Chroma collection to create.

        Returns
        -------
        object
            A Chroma vector store instance containing the document embeddings,
            or None if creation fails.
        """
        try:
            # Create the vector store using Chroma
            vectorstore = Chroma.from_texts(
                texts=documents,
                embedding=embed_model,
                # persist_directory=f"chroma_db_{filepath}",
                collection_name=f"{collection_name}"
            )
            logger.info("Vector store created successfully.")
            return vectorstore
        except Exception as e:
            logger.error(f"An error occurred during vector store creation: {str(e)}")
            return None

    @staticmethod
    def create_collection(file_name):
        """
        Create a sanitized collection name from the given file name.

        This method removes non-alphanumeric characters from the file name and
        truncates it to a maximum of 36 characters to form the collection name.

        Args:
            file_name (str): The name of the file from which to create the collection name.

        Returns:
            str: The sanitized and truncated collection name.

        Raises:
            Exception: If an error occurs during the collection name creation process, it logs the error.
        """
        try:
            collection_name = re.compile(r'[^a-zA-Z0-9]').sub('', file_name)[:36]
            logger.info(f"A collection name created for the filename: {file_name} as {collection_name}")
            return collection_name
        except Exception as e:
            logger.error(f"An error occurred during the collection name creation: {str(e)}")

    @staticmethod
    def delete_vectorstore(collection_name):
        """
        Delete the specified vector store collection.

        This method deletes a collection in the vector store identified by the collection name.

        Args:
            collection_name (str): The name of the collection to delete.

        Returns:
            None: This method does not return a value.

        Raises:
            Exception: If an error occurs during the deletion process, it logs the error.
        """
        try:
            # This is the call reported as not working: it is made on the class,
            # with no instance, and collection_name is never used.
            Chroma.delete_collection()
            return None
        except Exception as e:
            logger.error(f"An error occurred during vector store deletion: {str(e)}")
            return None
```
### Error Message and Stack Trace (if applicable)
_No response_
### Description
I am trying to delete a collection while using Chroma, but it is not working. Could anyone help me fix this issue?
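One possible reason the delete call appears to do nothing: in LangChain's `Chroma` wrapper, `delete_collection` is an instance method, so calling it on the class supplies no `self`, and the resulting `TypeError` is only logged by the `except` block. The sketch below reproduces that pattern with a hypothetical stand-in class, `FakeStore`, so that it runs without Chroma installed. `FakeStore`, `delete_on_class`, and `delete_on_instance` are illustrative names, not part of any library.

```python
import logging

logger = logging.getLogger(__name__)


class FakeStore:
    """Hypothetical stand-in for a vector store wrapper; not the Chroma API."""

    def __init__(self, name):
        self.name = name
        self.deleted = False

    def delete_collection(self):
        # Instance method: it needs `self` to know which collection to drop.
        self.deleted = True


def delete_on_class():
    """Mirror the pattern in the question: call the method on the class."""
    try:
        FakeStore.delete_collection()  # no instance, so `self` is missing
        return "ok"
    except Exception as e:
        # The error is logged and swallowed, so the caller sees no failure.
        logger.error("deletion failed: %s", e)
        return type(e).__name__


def delete_on_instance(store):
    """Call the method on the object that actually holds the collection."""
    store.delete_collection()
    return store.deleted


store = FakeStore("demo")
class_result = delete_on_class()
instance_result = delete_on_instance(store)
print(class_result, instance_result)
```

With the real wrapper, this would presumably mean keeping the object returned by `create_vectorstore` and calling `delete_collection()` on that object rather than on `Chroma` itself.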
### System Info
langchain==0.1.10
1 answer
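Hello @rabin3030,
I ran into the same problem in production, and it is very serious for our company!
We add many vector/document collections and update them frequently.
The problem is that, if you look closely at the SQLite3 database, all the deleted information (including deleted links, i.e. foreign keys) keeps accumulating, so the database keeps growing.
In a very short time, our ChromaDB database folder grew to more than 13 GB, and the server's memory usage was exploding!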
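As background, here is a minimal sketch of one general SQLite behavior that can contribute to this kind of growth: with the default settings, deleting rows leaves the file the same size, and only `VACUUM` returns the free pages to the file system. It uses a throwaway database with a made-up `docs` table, not a ChromaDB file. Whether this fully explains the growth described above is an assumption, and running `VACUUM` directly on a live Chroma database may not be safe or supported.

```python
import os
import sqlite3
import tempfile

# Throwaway database in a temporary directory (hypothetical table and data).
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    (("x" * 1000,) for _ in range(5000)),
)
conn.commit()
size_full = os.path.getsize(path)

# Deleting every row frees pages inside the file but does not shrink it.
conn.execute("DELETE FROM docs")
conn.commit()
size_after_delete = os.path.getsize(path)

# VACUUM rebuilds the database file and drops the unused pages.
conn.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
conn.close()

print(size_full, size_after_delete, size_after_vacuum)
```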
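After testing many solutions, I found a strange and temporary workaround...
Here it is:
Why do these two delete operations have to be called in sequence for the data to be emptied correctly?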