python-3.x ValueError: Expected IDs to be a non-empty list, got [], in Chroma

qjp7pelc posted 11 months ago in Python
**ValueError:** Expected IDs to be a non-empty list, got []

**Traceback:**
File "C:\Users\scite\Desktop\HAMBOTAI\HAMBotAI\HAMBotAI\homehambotai.py", line 96, in app
    db = Chroma.from_documents(texts, embeddings)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\langchain_community\vectorstores\chroma.py", line 771, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\langchain_community\vectorstores\chroma.py", line 729, in from_texts
    chroma_collection.add_texts(
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\langchain_community\vectorstores\chroma.py", line 324, in add_texts
    self._collection.upsert(
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\chromadb\api\models\Collection.py", line 449, in upsert
    ) = self._validate_embedding_set(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\chromadb\api\models\Collection.py", line 512, in _validate_embedding_set
    valid_ids = validate_ids(maybe_cast_one_to_many_ids(ids))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\scite\AppData\Roaming\Python\Python311\site-packages\chromadb\api\types.py", line 228, in validate_ids
    raise ValueError(f"Expected IDs to be a non-empty list, got {ids}")


Code block:

if 'processed' in query_params:
    # Create a temporary text file
    with tempfile.NamedTemporaryFile(mode="w", delete=False, suffix=".txt") as temp_file:
        temp_file.write(text)
        temp_file_path = temp_file.name

    # load document
    loader = TextLoader(temp_file_path)
    documents = loader.load()
    # split the documents into chunks
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)
    # select which embeddings we want to use
    embeddings = OpenAIEmbeddings()
    # ids =[str(i) for i in range(1, len(texts) + 1)]
    # create the vectorstore to use as the index
    db = Chroma.from_documents(texts, embeddings)
    # expose this index in a retriever interface
    retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2})
    # create a chain to answer questions
    qa = ConversationalRetrievalChain.from_llm(OpenAI(), retriever)
    chat_history = []
    # query = "What's the Name of patient and doctor as mentioned in the data?"
    # result = qa({"question": query, "chat_history": chat_history})
    # st.write("Patient and Doctor name:", result['answer'])
    #
    # chat_history = [(query, result["answer"])]
    query = "Provide summary of medical and health related info from this data in points, every point should be in new line (Formatted in HTML)?"
    result = qa({"question": query, "chat_history": chat_history})
    toshow = result['answer']
    # chat_history = [(query, result["answer"])]
    # chat_history.append((query, result["answer"]))
    # print(chat_history)

    st.title("Data Fetched From Your Health & Medical Reports")
    components.html(
        f"""
        {toshow}
        """,
        height=250,
        scrolling=True,
    )

    if st.button('Continue to Questionarrie'):
        st.write('Loading')
    st.text("(OR)")
    if st.button('Chat with BotAI'):
        st.title("Chat with BotAI")


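I can successfully get an answer to my question from the LLM, but as soon as I click either of the buttons below, "Continue to Questionarrie" or "Chat with BotAI", it raises the error shown above, which should not happen. I want to find out what the main cause is and how I can get rid of this error.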


6ljaweal #1

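The error message ValueError: Expected IDs to be a non-empty list, got [] is a bit confusing, because the actual problem is that documents is an empty list. The ids are derived from the documents, as follows: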

# texts created based on documents in Chroma.from_documents
texts = [doc.page_content for doc in documents]

# ids created based on texts in Chroma.add_texts
if ids is None:
   ids = [str(uuid.uuid1()) for _ in texts]
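
To make the chain from empty documents to empty ids concrete, here is a minimal sketch in plain Python that mimics the two steps quoted above without importing LangChain. The helper name make_ids is hypothetical and only stands in for the default-ID logic; it is not part of any library.

```python
import uuid


def make_ids(texts, ids=None):
    """Mimic the quoted default-ID step: one UUID string per text."""
    if ids is None:
        ids = [str(uuid.uuid1()) for _ in texts]
    return ids


# Assumption: nothing was loaded, or the splitter produced no chunks.
documents = []

# Same shape as the quoted from_documents step.
texts = [doc.page_content for doc in documents]

# With no texts there is nothing to generate an ID for.
ids = make_ids(texts)
print(ids)  # []
```

So the empty ids list in the error message is only a symptom: it has exactly as many entries as there are texts.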

You can reproduce the error with the following code:

from langchain.vectorstores import Chroma

vectordb = Chroma.from_documents(documents=[])


In your case, I assume text is an empty string "", which results in an empty list texts when documents is split with CharacterTextSplitter.
To avoid this, add a check to make sure text is not empty:

if text and 'processed' in query_params:
   # your code
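
If you would rather guard the indexing step itself than the whole branch, a sketch along these lines might help. The names safe_build_index and build_index are hypothetical; build_index stands in for whatever creates your store, for example a lambda wrapping Chroma.from_documents(chunks, embeddings). Streamlit reruns the script from top to bottom on every widget interaction, so it is plausible (an assumption, I cannot see where text comes from) that text is empty on the rerun triggered by the button click.

```python
def safe_build_index(chunks, build_index):
    """Call build_index only when at least one chunk has real content.

    chunks may be plain strings or objects with a page_content attribute.
    build_index is a hypothetical callable supplied by the caller.
    Returns None when there is nothing to index.
    """
    non_empty = [
        c for c in chunks
        if getattr(c, "page_content", c).strip()
    ]
    if not non_empty:
        return None
    return build_index(non_empty)


# Stand-in builder for illustration: just counts the chunks it receives.
print(safe_build_index([], len))
print(safe_build_index(["some report text", "   "], len))
```

In the app, a None result could be handled with st.warning(...) followed by st.stop(), so the rest of the page is skipped instead of raising the ValueError.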


5w9g7ksd #2

I tested your code and it works:


When I click the button, no error is thrown and it prints the "Chat with BotAI" title. Here are the imports:

import streamlit as st
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI
import streamlit.components.v1 as components
import os
os.environ["OPENAI_API_KEY"] = "sk-yourApiKey"

Versions:

Name: streamlit
Version: 1.25.0

Name: langchain
Version: 0.0.345
