LangChain LiteLLM - cache_hit flag is reported as None when using the async stream API

Posted by gz5pxeao 6 months ago in Other


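Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to look for a similar issue and did not find one.
I am sure that this is a bug in LangChain rather than in my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).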

Example Code

import asyncio
import litellm

from langchain_community.chat_models.litellm_router import ChatLiteLLMRouter
from langchain_core.messages import HumanMessage
from langchain_core.prompt_values import ChatPromptValue
from litellm import Router
from litellm.integrations.custom_logger import CustomLogger

def get_llm_router() -> Router:
    """
    Return a new instance of Router, ensure to pass the following parameters so responses are cached:
    * redis_host
    * redis_port
    * redis_password
    * cache_kwargs
    * cache_responses
    * caching_groups
    """
    raise NotImplementedError('Create your own router')

class MyLogger(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj: "ModelResponse", start_time, end_time):
        print(f"[MyLogger::async_log_success_event] response id: '{response_obj.id}'; cache_hit: '{kwargs.get('cache_hit', '')}'.\n\n")

my_logger = MyLogger()
litellm.callbacks = [my_logger]

async def chat():
    llm = ChatLiteLLMRouter(router=get_llm_router())

    msg1 = ""
    msg1_count = 0
    async for msg in llm.astream(
            input=ChatPromptValue(messages=[HumanMessage("What's the first planet in solar system?")])):
        msg1 += msg.content
        if msg.content:
            msg1_count += 1

    print(f"msg1 (count={msg1_count}): {msg1}\n\n")

    msg2 = ""
    msg2_count = 0
    async for msg in llm.astream(input=ChatPromptValue(messages=[HumanMessage("What's the first planet in solar system?")])):
        msg2 += msg.content
        if msg.content:
            msg2_count += 1

    print(f"msg2 (count={msg2_count}): {msg2}\n\n")

    await asyncio.sleep(5)

if __name__ == "__main__":
    asyncio.run(chat())

Error Message and Stack Trace (if applicable)

This is the output generated when running the shared code:

Intialized router with Routing strategy: latency-based-routing

Routing fallbacks: None

Routing context window fallbacks: None

Router Redis Caching=<litellm.caching.RedisCache object at 0x12370da10>
msg1 (count=20): The first planet in the solar system, starting from the one closest to the Sun, is Mercury.

[MyLogger::async_log_success_event] response id: 'chatcmpl-9jnacYSdnczh2zWMKi3l813lNXVtE'; cache_hit: 'None'.

msg2 (count=1): The first planet in the solar system, starting from the one closest to the Sun, is Mercury.

[MyLogger::async_log_success_event] response id: 'chatcmpl-9jnacYSdnczh2zWMKi3l813lNXVtE'; cache_hit: 'None'.

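Note the two lines starting with [MyLogger::async_log_success_event]: both report cache_hit: 'None'. For the second one, True is the expected value, because that call to astream produced a single chunk containing the entire message (count=1), which indicates the response came from the cache.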
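One detail can be read off this output: the logger calls kwargs.get('cache_hit', ''), so a printed 'None' means the key was present with the value None, not that it was missing (a missing key would print an empty string). A minimal sketch of that distinction in plain Python follows; the helper name describe_cache_hit is hypothetical and not part of LiteLLM or LangChain.

```python
def describe_cache_hit(kwargs: dict) -> str:
    """Classify the cache_hit entry of a callback kwargs dict.

    Hypothetical helper for illustration only.
    """
    if "cache_hit" not in kwargs:
        return "missing"  # key absent: .get('cache_hit', '') would yield ''
    value = kwargs["cache_hit"]
    if value is None:
        return "unset"    # key present but None: the case seen in this issue
    return "hit" if value else "miss"
```

Logging the result of such a helper instead of the raw value would make it easier to tell "never set" apart from an explicit False.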

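Description

I am trying to cache LLM responses using the LiteLLM router cache settings, and to be notified when a response is served from the cache rather than from the LLM. For that purpose I implemented a custom logger, as shown in the LiteLLM documentation.
The problem is that when I call the astream API, as in the code snippet above, the cache_hit flag is None even when the response is returned from the cache. When I call the ainvoke API ( await llm.ainvoke(...) ), the cache_hit flag is passed to my custom logger as True, which is the expected result from the second call to ainvoke onward.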
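The ainvoke comparison was not included in the example code. A rough sketch of what it might look like is below; the helper names invoke_twice and invoke_twice_sync are assumptions, and llm stands for any object with an async ainvoke method, such as the ChatLiteLLMRouter instance created in chat().

```python
import asyncio


async def invoke_twice(llm, prompt):
    """Send the same prompt twice through ainvoke and return both replies.

    With caching enabled on the router, the second call is the one expected
    to trigger the logger with cache_hit set to True.
    """
    first = await llm.ainvoke(prompt)
    second = await llm.ainvoke(prompt)
    return first, second


def invoke_twice_sync(llm, prompt):
    """Convenience wrapper for running the comparison from synchronous code."""
    return asyncio.run(invoke_twice(llm, prompt))
```

Running this next to the astream version with the same MyLogger registered would show the two logger outputs side by side.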

System Info

$ poetry run python -m langchain_core.sys_info

System Information
------------------
> OS:  Darwin
> OS Version:  Darwin Kernel Version 23.2.0: Wed Nov 15 21:54:10 PST 2023; root:xnu-10002.61.3~2/RELEASE_X86_64
> Python Version:  3.11.7 (main, Dec 15 2023, 12:09:04) [Clang 14.0.6 ]

Package Information
-------------------
> langchain_core: 0.2.13
> langchain: 0.2.7
> langchain_community: 0.2.7
> langsmith: 0.1.85
> langchain_openai: 0.1.15
> langchain_text_splitters: 0.2.0

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve


Source: https://github.com/langchain-ai/langchain/issues/24120