llama_index [Bug]: DashScope default settings break prompt repacking

zdwk9cvp posted 2 months ago in Other

Bug description

The dashscope llm module has the following default settings:

DASHSCOPE_MODEL_META = {
    DashScopeGenerationModels.QWEN_TURBO: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 8,
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_PLUS: {
        "context_window": 1024 * 32,
        "num_output": 1024 * 32,
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 8,
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX_1201: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 8,
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX_LONGCONTEXT: {
        "context_window": 1024 * 30,
        "num_output": 1024 * 30,
        "is_chat_model": True,
    },
}

Note that context_window equals num_output for every model.
These defaults always trigger a ValueError in prompt_helper, because with self.context_window == self.num_output there is no room left for num_prompt_tokens.
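
For reference, the failure is straightforward arithmetic. Below is a minimal sketch of the budget check that prompt_helper performs (simplified for illustration, not the library's exact code): with QWEN_MAX both context_window and num_output are 8192, so any non-empty prompt pushes the available context below zero; the -75 in the traceback further down corresponds to a prompt of 75 tokens.

# Simplified sketch of the budget check in llama_index.core.indices.prompt_helper.
# Illustrative only; the real implementation also accounts for padding.
context_window = 1024 * 8      # QWEN_MAX default
num_output = 1024 * 8          # QWEN_MAX default: identical to context_window
num_prompt_tokens = 75         # tokens consumed by the query and prompt template

available_context_size = context_window - num_prompt_tokens - num_output
print(available_context_size)  # -75

if available_context_size < 0:
    raise ValueError(
        f"Calculated available context size {available_context_size} "
        f"was not non-negative."
    )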

Version

0.10.55

Steps to reproduce

pip install llama-index-core
pip install llama-index-indices-managed-dashscope

import os
from llama_index.indices.managed.dashscope import DashScopeCloudIndex
from llama_index.llms.dashscope import DashScope, DashScopeGenerationModels

os.environ["DASHSCOPE_API_KEY"] = "sk-xxx"

llm = DashScope(
    model_name=DashScopeGenerationModels.QWEN_MAX,
    max_tokens=None,
    incremental_output=False,
)
index = DashScopeCloudIndex(name='my_index')
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("test query")
print(response)

Relevant logs/traceback

Traceback (most recent call last):
  File "/mnt/workspace/nas-alinlp/ada.drx/search_llm/scripts/llama-index-dashscope-cloud/test/test_cust1.py", line 13, in <module>
    response = query_engine.query("test query")
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py", line 53, in query
    query_result = self._query(str_or_query_bundle)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 190, in _query
    response = self._response_synthesizer.synthesize(
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/response_synthesizers/base.py", line 242, in synthesize
    response_str = self.get_response(
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 42, in get_response
    new_texts = self._make_compact_text_chunks(query_str, text_chunks)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 57, in _make_compact_text_chunks
    return self._prompt_helper.repack(max_prompt, text_chunks)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/indices/prompt_helper.py", line 276, in repack
    text_splitter = self.get_text_splitter_given_prompt(
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/indices/prompt_helper.py", line 234, in get_text_splitter_given_prompt
    chunk_size = self._get_available_chunk_size(
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/indices/prompt_helper.py", line 218, in _get_available_chunk_size
    available_context_size = self._get_available_context_size(num_prompt_tokens)
  File "/mnt/workspace/nas-alinlp/ada.drx/py_llama1034/lib/python3.10/site-packages/llama_index/core/indices/prompt_helper.py", line 150, in _get_available_context_size
    raise ValueError(
ValueError: Calculated available context size -75 was not non-negative.

insrf1ej 1#

To resolve the ValueError raised in the prompt_helper module by the default settings, you need to make sure that context_window is larger than the sum of num_prompt_tokens and num_output. The current defaults make context_window equal to num_output, leaving no room for num_prompt_tokens.
Here is how to adjust the DASHSCOPE_MODEL_META settings to avoid this problem:

  1. Update the DASHSCOPE_MODEL_META dictionary: make sure context_window is large enough to leave room for prompt tokens (a user-level patch sketch follows the snippet below).
DASHSCOPE_MODEL_META = {
    DashScopeGenerationModels.QWEN_TURBO: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 6,  # Adjusted to leave space for prompt tokens
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_PLUS: {
        "context_window": 1024 * 32,
        "num_output": 1024 * 30,  # Adjusted to leave space for prompt tokens
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 6,  # Adjusted to leave space for prompt tokens
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX_1201: {
        "context_window": 1024 * 8,
        "num_output": 1024 * 6,  # Adjusted to leave space for prompt tokens
        "is_chat_model": True,
    },
    DashScopeGenerationModels.QWEN_MAX_LONGCONTEXT: {
        "context_window": 1024 * 30,
        "num_output": 1024 * 28,  # Adjusted to leave space for prompt tokens
        "is_chat_model": True,
    },
}
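
Until the package defaults change, a possible user-level workaround is to patch the metadata entry for the model you use at runtime instead of editing the installed package. The sketch below assumes DASHSCOPE_MODEL_META can be imported from llama_index.llms.dashscope.base, as quoted in the bug description; adjust the import path if it lives elsewhere.

# User-level workaround sketch: shrink num_output for the model in use
# before the LLM metadata is read.
# Assumption: DASHSCOPE_MODEL_META is importable from llama_index.llms.dashscope.base.
from llama_index.llms.dashscope import DashScopeGenerationModels
from llama_index.llms.dashscope.base import DASHSCOPE_MODEL_META

# Leave 2k of the 8k context window for prompt tokens.
DASHSCOPE_MODEL_META[DashScopeGenerationModels.QWEN_MAX]["num_output"] = 1024 * 6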
  2. Modify the DashScope class to validate model_name: make sure the model_name in use is valid and exists in the DASHSCOPE_MODEL_META dictionary (a call-time workaround sketch follows the snippet below).
class DashScope(CustomLLM):
    # ... other code ...

    @property
    def metadata(self) -> LLMMetadata:
        if self.model_name not in DASHSCOPE_MODEL_META:
            raise ValueError(f"Model name '{self.model_name}' is not valid. Please choose from {list(DASHSCOPE_MODEL_META.keys())}.")
        
        DASHSCOPE_MODEL_META[self.model_name]["num_output"] = (
            self.max_tokens or DASHSCOPE_MODEL_META[self.model_name]["num_output"]
        )
        return LLMMetadata(
            model_name=self.model_name, **DASHSCOPE_MODEL_META[self.model_name]
        )

    # ... other code ...
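
If the released metadata property already derives num_output from max_tokens as shown above, then passing an explicit max_tokens (instead of max_tokens=None as in the reproduction script) may also avoid the error at call time. A sketch:

from llama_index.llms.dashscope import DashScope, DashScopeGenerationModels

# Workaround sketch: cap the output so num_output stays well below the
# 8k context_window of QWEN_MAX, leaving room for prompt tokens.
llm = DashScope(
    model_name=DashScopeGenerationModels.QWEN_MAX,
    max_tokens=1024 * 2,
    incremental_output=False,
)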
  3. Make sure the environment variables are set correctly: DASHSCOPE_WORKSPACE_ID, DASHSCOPE_API_KEY, and DASHSCOPE_BASE_URL must all be set, since the DashScopeCloudIndex class needs them to work properly (see the example after the snippet below).
class DashScopeCloudIndex(BaseManagedIndex):
    """DashScope Cloud Platform Index."""

    def __init__(
        self,
        name: str,
        nodes: Optional[List[BaseNode]] = None,
        transformations: Optional[List[TransformComponent]] = None,
        timeout: int = 60,
        workspace_id: Optional[str] = None,
        api_key: Optional[str] = None,
        base_url: Optional[str] = DASHSCOPE_DEFAULT_BASE_URL,
        show_progress: bool = False,
        callback_manager: Optional[CallbackManager] = None,
        **kwargs: Any,
    ) -> None:
        """Initialize the Platform Index."""
        self.name = name
        self.transformations = transformations or []

        if nodes is not None:
            raise ValueError(
                "DashScopeCloudIndex does not support nodes on initialization"
            )

        self.workspace_id = workspace_id or os.environ.get("DASHSCOPE_WORKSPACE_ID")
        self._api_key = api_key or os.environ.get("DASHSCOPE_API_KEY")
        self._base_url = os.environ.get("DASHSCOPE_BASE_URL", None) or base_url
        self._headers = {
            "Content-Type": "application/json",
            "Accept-Encoding": "utf-8",
            "X-DashScope-WorkSpace": self.workspace_id,
            "Authorization": "Bearer " + self._api_key,
            "X-DashScope-OpenAPISource": "CloudSDK",
        }
        self._timeout = timeout
        self._show_progress = show_progress
        self._service_context = None
        self._callback_manager = callback_manager or Settings.callback_manager
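
For completeness, the environment variables read above can be set before constructing the index (the values here are placeholders):

import os

# Placeholder values; substitute your real workspace id and API key.
os.environ["DASHSCOPE_WORKSPACE_ID"] = "ws-xxx"
os.environ["DASHSCOPE_API_KEY"] = "sk-xxx"
# DASHSCOPE_BASE_URL is optional: if it is unset, the base_url argument
# (defaulting to DASHSCOPE_DEFAULT_BASE_URL) is used instead.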

With these adjustments, you should be able to avoid the ValueError in the prompt_helper module.

  • To continue the conversation with dosu, mention @dosu.
