llama_index [Bug]: KeyError: 'context_str' in query_engine.image_query(image_path, query_str)

wixjitnu posted 2 months ago in Other

Calling query_engine.image_query(image_path, query_str) raises KeyError: 'context_str'. I think the problem is in the _get_response_with_images() method of the SimpleMultiModalQueryEngine class in llama_index/core/query_engine/multi_modal.py, where context_str is not provided.
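The failure mode described here can be reproduced without llama_index. The sketch below uses plain `str.format` as a stand-in for the prompt template; `TEMPLATE` and `missing_key` are hypothetical names introduced only for illustration, and the real `PromptTemplate.format` may differ in detail.

```python
from typing import Optional

# Hypothetical stand-in for a QA template that expects two variables.
TEMPLATE = (
    "Context information is below.\n"
    "{context_str}\n"
    "Query: {query_str}\n"
    "Answer: "
)


def missing_key(template: str, **values: str) -> Optional[str]:
    """Return the placeholder that format() could not fill, or None."""
    try:
        template.format(**values)
    except KeyError as exc:
        # str.format raises KeyError with the missing field name as its argument.
        return exc.args[0]
    return None


# Only query_str is supplied, mirroring the call described in the report.
print(missing_key(TEMPLATE, query_str="What is the main object in the picture?"))

# Supplying both variables lets the template format cleanly.
print(missing_key(TEMPLATE, context_str="", query_str="What is the main object in the picture?"))
```

If the template declares a placeholder that the caller never passes, formatting fails with the name of that placeholder, which matches the shape of the reported error.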

Version: 0.10.50

Steps to reproduce:

import numpy as np
import qdrant_client
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext
from llama_index.core.indices.multi_modal.base import MultiModalVectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.prompts import PromptTemplate
from llama_index.core.query_engine import SimpleMultiModalQueryEngine
from llama_index.embeddings.clip import ClipEmbedding
from llama_index.multi_modal_llms.ollama import OllamaMultiModal
from llama_index.vector_stores.qdrant import QdrantVectorStore

# local Qdrant storage with separate text and image collections
client = qdrant_client.QdrantClient(path="qdrant_mm_wiki")
text_store = QdrantVectorStore(client=client, collection_name="text_collection")
image_store = QdrantVectorStore(client=client, collection_name="image_collection")
storage_context = StorageContext.from_defaults(vector_store=text_store, image_store=image_store)

Settings.embed_model = ClipEmbedding()
image_embed_model = ClipEmbedding()
Settings.text_splitter = SentenceSplitter(chunk_size=60, chunk_overlap=5)

documents = SimpleDirectoryReader("./mixed_wiki", recursive=True).load_data()
index = MultiModalVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    transformations=[SentenceSplitter(chunk_size=60, chunk_overlap=5)],
    image_embed_model=image_embed_model,
)

mm_model = OllamaMultiModal(model="llava:13b")

qa_tmpl_str = (
    "Given the images provided, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)
query_str = 'What is the main object in the picture?'

# query and response
query_engine = index.as_query_engine(llm=mm_model, text_qa_template=qa_tmpl)
response = query_engine.image_query('./mixed_wiki/1.jpg', query_str)

Relevant logs/tracebacks: No response
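When debugging a report like this one, it can help to list which placeholders a template string actually declares. The sketch below does that for the plain string `qa_tmpl_str` from the reproduction, using the standard library's `string.Formatter`; `template_fields` is a hypothetical helper, and it inspects raw strings only, not llama_index `PromptTemplate` objects.

```python
from string import Formatter
from typing import Set


def template_fields(template: str) -> Set[str]:
    """Collect the named placeholders that appear in a format string."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text and '' for a bare "{}".
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Same template string as in the reproduction steps.
qa_tmpl_str = (
    "Given the images provided, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

print(template_fields(qa_tmpl_str))
```

The custom template declares only `query_str`, so formatting this particular string cannot by itself fail on `context_str`. That suggests, though it does not prove, that a different template is the one being formatted when the error is raised.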

py49o6xq  #1

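Hey @StarVolcano! I'm here to help you with any bugs, questions, or contributions in the repository. While you wait for a human maintainer, let me know how I can help.
To fix the KeyError: 'context_str' in _get_response_with_images(), you need to make sure context_str is supplied when _image_qa_template is formatted. The current implementation of _get_response_with_images() passes only query_str to format(), which is most likely what causes the KeyError.
Here is a corrected version of the _get_response_with_images() function: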

def _get_response_with_images(
    self,
    prompt_str: str,
    image_nodes: List[ImageNode],
) -> RESPONSE_TYPE:
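    # Build the context from the retrieved image nodes so the template's
    # {context_str} placeholder has a value.
    context_str = "\n\n".join([r.get_content() for r in image_nodes])
    # Pass both variables; the previous call supplied only query_str.
    fmt_prompt = self._image_qa_template.format(
        context_str=context_str, query_str=prompt_str
    )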
    llm_response = self._multi_modal_llm.complete(
        prompt=fmt_prompt,
        image_documents=[image_node.node for image_node in image_nodes],
    )
    return Response(
        response=str(llm_response),
        source_nodes=image_nodes,
        metadata={"image_nodes": image_nodes},
    )

This ensures context_str is supplied when the prompt is formatted, which prevents the KeyError. Also make sure image_qa_template is set up to accept context_str and query_str as parameters. Sources: 1 (https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/query_engine/multi_modal.py), 2 (https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/query_engine/multi_modal.py).
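The formatting step of the corrected function can be tried in isolation. The following is a minimal sketch, not the library's code: `FakeNode` and `build_prompt` are hypothetical stand-ins, and plain `str.format` replaces the real prompt template object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node that exposes get_content()."""

    text: str

    def get_content(self) -> str:
        return self.text


def build_prompt(template: str, query: str, nodes: List[FakeNode]) -> str:
    """Join node contents into context_str and fill both template variables."""
    context_str = "\n\n".join(node.get_content() for node in nodes)
    return template.format(context_str=context_str, query_str=query)


nodes = [FakeNode("caption A"), FakeNode("caption B")]

# A template that uses both variables.
print(build_prompt("{context_str}\nQuery: {query_str}", "What is shown?", nodes))

# A template that uses only query_str still formats without error.
print(build_prompt("Query: {query_str}\nAnswer: ", "What is shown?", nodes))
```

With plain `str.format`, unused keyword arguments are ignored, so passing both variables works for templates with or without `{context_str}`. Whether the library's template class behaves the same way is an assumption worth checking against its documentation.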

  • To continue talking to @dosu, mention @dosu.
