langchain: MultiQuery Retriever using the AzureSearch vector store always returns an invalid JSON format error

cx6n0qe3 posted 6 months ago in Other

Error message and stack trace (if applicable)

Exception has occurred: OutputParserException
langchain_core.exceptions.OutputParserException: Invalid json output: Can you provide information about Llama-2?
Can you explain the concept of Llama-2?
What does Llama-2 mean?
  File "Python\Python312\Lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^
StopIteration: 0

During handling of the above exception, another exception occurred:

  File "Python\Python312\Lib\site-packages\langchain_core\output_parsers\json.py", line 66, in parse_result
    return parse_json_markdown(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "\Python\Python312\Lib\site-packages\langchain_core\utils\json.py", line 147, in parse_json_markdown
    return parse_json(json_str, parser=parser)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "\Python\Python312\Lib\site-packages\langchain_core\utils\json.py", line 160, in _parse_json
    return parser(json_str)
           ^^^^^^^^^
  File "Python\Python312\Lib\site-packages\langchain_core\utils\json.py", line 120, in parse_partial_json
    return json.loads(s, strict=strict)
           ^^^^^^^^^
  File "\Python\Python312\Lib\json\__init__.py", line 359, in loads
    return cls(**kw).decode(s)
           ^
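According to the error message, the failure is langchain_core.exceptions.OutputParserException: Invalid json output, which means the text returned by the LLM is not valid JSON. The output quoted in the message is three plain-text questions, one per line, while the traceback passes through langchain_core\output_parsers\json.py, so a JSON-based parser was applied to output that was never JSON.

First, check which output parser the query-generation step (generate_queries) uses, since that is where the LLM output is parsed. The documents returned by get_relevant_documents are not parsed as JSON, so they are unlikely to be the cause. If the problem persists, trying a different LLM can help rule out model-specific output formatting, although a different model will not help as long as the parser and the prompt disagree about the expected format.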
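The mismatch can be reproduced without LangChain. The sketch below is only an illustration under stated assumptions: llm_output is the (translated) text quoted in the error message, and parse_as_json and parse_as_lines are hypothetical helpers, not LangChain functions. It shows that decoding the text as JSON fails, while splitting it on newlines yields the three queries.

```python
import json

# Raw model output as quoted in the error message (translated): one question per line.
llm_output = (
    "Can you provide information about Llama-2?\n"
    "Can you explain the concept of Llama-2?\n"
    "What does Llama-2 mean?"
)


def parse_as_json(text: str):
    """Hypothetical JSON-based parser: raises unless the text is valid JSON."""
    return json.loads(text)


def parse_as_lines(text: str) -> list[str]:
    """Hypothetical line-based parser: one query per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Decoding plain text as JSON fails before any retrieval happens.
try:
    parse_as_json(llm_output)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# Splitting on newlines recovers the generated queries.
queries = parse_as_lines(llm_output)

print(json_ok)
print(queries)
```

If the model is meant to return JSON instead, the prompt has to ask for it explicitly; otherwise a line-based parser is the closer match for this kind of output.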

langchain

Source: https://github.com/langchain-ai/langchain/issues/23171

2 answers

uwopmtnx #1

Is there any fix or update for this issue?

6 months ago

vc9ivgsu #2

@mdubeyAnsys @sameemqureshi The documentation code above is outdated. Please use the following code instead:

# Build a sample vectorDB
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import FakeEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load blog post
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)

# VectorDB
embedding = FakeEmbeddings(size=1352)
vectordb = Chroma.from_documents(documents=splits, embedding=embedding)

# Replace with your own LLM; FakeListLLM just replays the canned responses
from langchain_community.llms.fake import FakeListLLM
# Expected LLM output: one generated query per line
responses = ["What methods are used for decomposing tasks?\nCan you describe different strategies for task decomposition?"]
llm = FakeListLLM(responses=responses)

# Build the retriever with the default query-generation chain
from langchain.retrievers.multi_query import MultiQueryRetriever
retriever = MultiQueryRetriever.from_llm(llm=llm, retriever=vectordb.as_retriever())

# Results
unique_docs = retriever.invoke("What does the course say about regression?")
print(len(unique_docs))
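
The name unique_docs reflects that MultiQueryRetriever returns the unique union of the documents retrieved for each generated query. The sketch below illustrates that idea in plain Python; it is not LangChain's implementation. Doc and unique_union are hypothetical names, and the deduplication key assumes metadata values are hashable.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_union(doc_lists: list[list[Doc]]) -> list[Doc]:
    """Flatten per-query results and drop duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for docs in doc_lists:
        for doc in docs:
            # Assumption: metadata values are hashable, so the pair can act as a set key.
            key = (doc.page_content, tuple(sorted(doc.metadata.items())))
            if key not in seen:
                seen.add(key)
                unique.append(doc)
    return unique


# Two generated queries whose results overlap on "chunk B".
results_query_1 = [Doc("chunk A"), Doc("chunk B")]
results_query_2 = [Doc("chunk B"), Doc("chunk C")]

unique_docs = unique_union([results_query_1, results_query_2])
print(len(unique_docs))
```

With overlapping results like these, the count printed is smaller than the total number of hits across queries, which is why the final length can be lower than the number of queries times the retriever's result size.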

6 months ago
