LangChain RetrievalQAWithSourcesChain throws ValueError: Missing some input keys: {'context'}

ecfsfe2w · posted 3 months ago

Checked other resources

  • I added a very descriptive title to this question.
  • I searched the LangChain documentation with the integrated search.
  • I used GitHub search to find a similar question and didn't find one.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example code

python
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    PromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import Redis

from chatbot_api import config

_INDEX_NAME = "Postmarket"

rds = Redis.from_existing_index(
    embedding=config.OPEN_AI_EMBEDDINGS,
    index_name=_INDEX_NAME,
    schema=config.INDEX_SCHEMA,
    redis_url=config.REDIS_URL,
)

_template = """Your job is to use information on the documents
to answer questions about postmarket operations. Use the following
context to answer questions. Be as detailed as possible, but don't
make up any information that's not from the context. If you don't
know an answer, say you don't know. If you refer to a document, cite
your reference.
{context}
"""

system_prompt = SystemMessagePromptTemplate(
    prompt=PromptTemplate(input_variables=['context'], template=_template)
)

human_prompt = HumanMessagePromptTemplate(
    prompt=PromptTemplate(input_variables=['question'], template="{question}")
)
messages = [system_prompt, human_prompt]

postmarket_prompt = ChatPromptTemplate(input_variables=['context', 'question'], messages=messages)

postmarket_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=ChatOpenAI(model=config.QA_MODEL, temperature=config.TEMPERATURE),
    chain_type="stuff",
    retriever=rds.as_retriever(search_type="similarity", search_kwargs={"k": 8}),
    return_source_documents=True,
    # chain_type_kwargs={"prompt": postmarket_prompt}, # this also doesn't work throwing ValueError -> document_variable_name summaries was not found in llm_chain input_variables: ['context', 'question']
    verbose=True,
)
postmarket_chain.combine_documents_chain.llm_chain.prompt = postmarket_prompt

The postmarket_chain is then used in a tool of the LangChain agent I defined, as func=postmarket_chain.invoke.

Error message and stack trace (if applicable)

[chain/start] [chain:AgentExecutor > tool:Premarket > chain:RetrievalQAWithSourcesChain] Entering Chain run with input:
{
  "question": "What are the procedures for submitting an application for a new medical device?",
  "history": []
}
[chain/start] [chain:AgentExecutor > tool:Premarket > chain:RetrievalQAWithSourcesChain > chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [chain:AgentExecutor > tool:Premarket > chain:RetrievalQAWithSourcesChain > chain:StuffDocumentsChain > chain:LLMChain] Entering Chain run with input:
{
  "question": "What are the procedures for submitting an application for a new medical device?",
  "summaries": "Content: Page 12D. Promotional Literature\nAny (I'm cutting the rest but this text is fetched from my vectorstore, I can confirm)" 
}
[llm/start] [chain:AgentExecutor > tool:Premarket > chain:RetrievalQAWithSourcesChain > chain:StuffDocumentsChain > chain:LLMChain > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "System: Your job is to use information on documents\nto answer questions about premarket operations. Use the following\ncontext to answer questions. Be as detailed as possible, but don't\nmake up any information that's not from the context. If you don't\nknow an answer, say you don't know. If you refer to a document, cite\nyour reference.\n{context}\n\nHuman: What are the procedures for submitting an application for a new medical device?"
  ]
}
[llm/end] [chain:AgentExecutor > tool:Premarket > chain:RetrievalQAWithSourcesChain > chain:StuffDocumentsChain > chain:LLMChain > llm:ChatOpenAI] [5.16s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "I don't have the specific documents or guidelines available in the provided context to detail the procedures for submitting a 510(k) notification for a new medical device. Typically, this process involves preparing and submitting a premarket notification to the FDA to demonstrate that the new device is substantially equivalent to a legally marketed device (predicate device) not subject to premarket approval (PMA). The submission includes information about the device, its intended use, and comparative analyses, among other data. For detailed steps and requirements, it is best to refer directly to the relevant FDA guidelines or documents.",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],

Description

I have a multi-model RAG system that generates answers from text parsed out of hundreds of PDFs, retrieved from my Redis vector store. I also have several chains (RetrievalQAWithSourcesChain) that look up relevant context text in the vector store and attach it to the chatbot LLM call. I'm having trouble getting that context added to the system prompt correctly: the code above throws ValueError: Missing some input keys: {'context'}.
RetrievalQAWithSourcesChain is supposed to use the Redis retriever and attach the extracted text to {context}, but it doesn't seem to be doing that, or I'm missing something.
When I wrap 'context' in double braces in the prompt, it unexpectedly works. However, when I look at the logs of the intermediate steps where LangChain tries to generate an answer via the agent tool, my understanding is that the context is never passed into the knowledge window; the LLM simply answers from its own knowledge, without any of the context that should have come from the vector store. Some logs are above. Note that text retrieved from the vector store does appear under "summaries", but when StuffDocumentsChain passes this on to llm:ChatOpenAI, you can see it is not injected into the system prompt (scroll right to see): the context field is still the literal {context} (with the outer braces stripped).
Is my assumption correct that the context isn't actually being passed into the knowledge window? How can I fix this? Other project examples I've seen use single braces to include the context in the system prompt, yet I can only get my code to run with double braces, which doesn't appear to inject the context at all...
Could this be caused by the index schema I used when creating the vector store? For reference, the schema:
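The single- vs. double-brace behavior described above can be sketched with plain str.format, which is the same convention LangChain's default f-string PromptTemplate follows (a minimal stdlib sketch, not LangChain itself):

```python
# Minimal sketch of f-string-style templating (the default for
# LangChain's PromptTemplate) with single vs. double braces.

template_single = "Use the following context:\n{context}"
template_double = "Use the following context:\n{{context}}"

# Single braces mark a real input variable: formatting fills it in
# with the retrieved text.
filled = template_single.format(context="Doc text from the retriever")
print(filled)

# Double braces are an escape: formatting emits a literal "{context}",
# so the template exposes no 'context' variable at all. This is why the
# double-brace version avoids the missing-key error, but the LLM ends up
# seeing the placeholder text instead of the retrieved documents.
escaped = template_double.format()
print(escaped)  # ends with the literal "{context}"
```

This matches the logs above: with double braces there is no 'context' input variable to satisfy, so no error is raised, but nothing is injected either.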

text:
- name: content
- name: source
numeric:
- name: start_index
- name: page
vector:
- name: content_vector
  algorithm: HNSW
  datatype: FLOAT32
  dims: 384
  distance_metric: COSINE

System info

langchain==0.2.7
langchain-community==0.2.7
langchain-core==0.2.16
langchain-openai==0.1.15
langchain-text-splitters==0.2.2
langchainhub==0.1.20
Python 3.12.4
OS: MacOS Sonoma 14.4.1

rur96b6h #1

Can I work on this?

lkaoscv7 #2

What do you mean? 😅

8i9zcol2 #3

@havkerboi123 you are not passing the correct input variable for the document variable in your prompt. Just do this and it should work:

postmarket_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=ChatOpenAI(model=config.QA_MODEL, temperature=config.TEMPERATURE),
    chain_type="stuff",
    retriever=rds.as_retriever(search_type="similarity", search_kwargs={"k": 8}),
    return_source_documents=True,
    verbose=True,
    chain_type_kwargs={"document_variable_name": "context", "prompt": postmarket_prompt},
)
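The key point is that document_variable_name tells the "stuff" combine-documents step which prompt variable receives the joined retrieved documents; for RetrievalQAWithSourcesChain the default is "summaries", which is why a prompt using {context} fails until the name is overridden. A hedged stdlib sketch of what that step effectively does (illustrative only, not LangChain internals):

```python
# Sketch: the "stuff" step joins the retrieved documents and binds the
# result to document_variable_name before formatting the prompt. If that
# name doesn't match the prompt's variable ('context' here vs. the
# default 'summaries'), you get the missing-key / "document_variable_name
# was not found" errors shown above.

docs = ["Page 12D. Promotional Literature ...", "Another retrieved chunk ..."]
document_variable_name = "context"  # must match the prompt's variable

prompt = "Use the following context to answer.\n{context}\n\nQuestion: {question}"
inputs = {
    document_variable_name: "\n\n".join(docs),
    "question": "What are the procedures for submitting an application?",
}
formatted = prompt.format(**inputs)
print(formatted)  # retrieved text is now injected in place of {context}
```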
