Checked other resources
- I added a very descriptive title to this issue.
- I searched the LangChain documentation with the integration search.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
import os
import json
from pathlib import Path
from langchain_community.cache import SQLiteCache
from typing import Callable, List
model_list = [
'ChatAnthropic',  # <- has several instances of this bug, not only SQLiteCache
'ChatBaichuan',
'ChatCohere',
'ChatCoze',
'ChatDeepInfra',
'ChatEverlyAI',
'ChatFireworks',
'ChatFriendli',
'ChatGooglePalm',
'ChatHunyuan',
'ChatLiteLLM',
'ChatOctoAI',
'ChatOllama',
'ChatOpenAI',
'ChatPerplexity',
'ChatYuan2',
'ChatZhipuAI'
# Below are the models I didn't test, as well as the reason why I haven't:
# 'ChatAnyscale',  # needs a model name
# 'ChatDatabricks',  # needs some params
# 'ChatHuggingFace',  # needs a model name
# 'ChatJavelinAIGateway',  # needs some params
# 'ChatKinetica',  # not installed
# 'ChatKonko',  # not installed
# 'ChatLiteLLMRouter',  # needs router arg
# 'ChatLlamaCpp',  # needs some params
# 'ChatMLflowAIGateway',  # not installed
# 'ChatMaritalk',  # needs some params
# 'ChatMlflow',  # not installed
# 'ChatMLX',  # needs some params
# 'ChatPremAI',  # not installed
# 'ChatSparkLLM',  # issue with api key
# 'ChatTongyi',  # not installed
# 'ChatVertexAI',  # not installed
# 'ChatYandexGPT',  # needs some params
]
# import the models
for m in model_list:
    exec(f"from langchain_community.chat_models import {m}")
# set fake api keys
for m in model_list:
    backend = m[4:].upper()
    os.environ[f"{backend}_API_KEY"] = "aaaaaa"
    os.environ[f"{backend}_API_TOKEN"] = "aaaaaa"
    os.environ[f"{backend}_TOKEN"] = "aaaaaa"
os.environ["GOOGLE_API_KEY"] = "aaaaaa"
os.environ["HUNYUAN_APP_ID"] = "aaaaaa"
os.environ["HUNYUAN_SECRET_ID"] = "aaaaaa"
os.environ["HUNYUAN_SECRET_KEY"] = "aaaaaa"
os.environ["PPLX_API_KEY"] = "aaaaaa"
os.environ["IFLYTEK_SPARK_APP_ID"] = "aaaaaa"
os.environ["SPARK_API_KEY"] = "aaaaaa"
os.environ["DASHSCOPE_API_KEY"] = "aaaaaa"
os.environ["YC_API_KEY"] = "aaaaaa"
# create two brand-new caches backed by the same database file
Path("test_cache.db").unlink(missing_ok=True)
c1 = SQLiteCache(database_path="test_cache.db")
c2 = SQLiteCache(database_path="test_cache.db")
def recur_dict_check(val: dict) -> List[str]:
    "find which object is causing the issue"
    found = []
    for k, v in val.items():
        if " object at " in str(v):
            if isinstance(v, dict):
                found.append(recur_dict_check(v))
            else:
                found.append(v)
    # flatten the list
    out = []
    for f in found:
        if isinstance(f, list):
            out.extend(f)
        else:
            out.append(f)
    assert out
    out = [str(o) for o in out]
    return out
def check(chat_model: Callable, verbose: bool = False) -> bool:
    "check a given chat model"
    llm1 = chat_model(
        cache=c1,
    )
    llm2 = chat_model(
        cache=c2,
    )
    backend = llm1.get_lc_namespace()[-1]
    str1 = llm1._get_llm_string().split("---")[0]
    str2 = llm2._get_llm_string().split("---")[0]
    if verbose:
        print(f"LLM1:\n{str1}")
        print(f"LLM2:\n{str2}")
    if str1 == str2:
        print(f"{backend.title()} does not have the bug")
        return True
    else:
        print(f"{backend.title()} HAS the bug")
        j1, j2 = json.loads(str1), json.loads(str2)
        assert j1.keys() == j2.keys()
        diff1 = recur_dict_check(j1)
        diff2 = recur_dict_check(j2)
        assert len(diff1) == len(diff2)
        diffs = [str(v).split("object at ")[0] for v in diff1 + diff2]
        assert all(diffs.count(elem) == 2 for elem in diffs)
        print(f"List of buggy objects for model {backend.title()}:")
        for d in diff1:
            print(f"  - {d}")
        return False
failed = []
for model in model_list:
    if not check(locals()[model]):
        failed.append(model)
print(f"The culprit is at least SQLiteCache repr string:\n{c1}\n{c2}")
c1.__class__.__repr__ = lambda x=None: "<langchain_community.cache.SQLiteCache>"
c2.__class__.__repr__ = lambda x=None: "<langchain_community.cache.SQLiteCache>"
print(f"Now fixed:\n{c1}\n{c2}\n")
# Anthropic still has issues
assert not check(locals()["ChatAnthropic"])
for model in failed:
    if model == "ChatAnthropic":  # anthropic actually has more issues!
        continue
    assert check(locals()[model]), model
print("Fixed it for most models!")
print(f"Models with the issue: {len(failed)} / {len(model_list)}")
for f in failed:
    print(f"  - {f}")
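The script above can be distilled into this minimal sketch of why a default repr breaks cache keys. It uses a hypothetical stub class instead of langchain's real SQLiteCache, so no install is needed; the `llm_string` helper only mimics what an inconsistent `_get_llm_string()` implementation effectively does.

```python
class SQLiteCacheStub:
    """Stand-in for langchain_community.cache.SQLiteCache (hypothetical stub)."""
    def __init__(self, database_path: str):
        self.database_path = database_path


def llm_string(cache) -> str:
    # Mimics an inconsistent _get_llm_string(): it folds the raw repr
    # of helper objects (here, the cache) into the cache-lookup key.
    return f"model='gpt-x' cache={cache!r}"


c1 = SQLiteCacheStub("test_cache.db")
c2 = SQLiteCacheStub("test_cache.db")

# The default repr embeds the memory address, so two equivalent caches
# yield two different "LLM strings" -> every lookup is a cache miss.
assert llm_string(c1) != llm_string(c2)

# The workaround from this issue: override __repr__ on the class.
SQLiteCacheStub.__repr__ = lambda self: "<SQLiteCacheStub>"
assert llm_string(c1) == llm_string(c2)
```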
Error Message and Stack Trace (if applicable)

_No response_
Description

Affected by this bug in my DocToolsLLM project, I ended up using ChatOpenAI instead of just ChatLiteLLM whenever the model is served by OpenAI anyway.
One day I noticed that my SQLiteCache was being systematically ignored by ChatOpenAI, and I finally tracked down the culprit:
- To decide whether a value is present in the cache, the prompt and a string describing the LLM are used.
- The method that describes the LLM is `_get_llm_string()`.
- This method is implemented inconsistently across chat models, so its output contains unfiltered object reprs, e.g. for caches, callbacks, etc.
- The problem is that for many instances the repr returns something like `<langchain_community.cache.SQLiteCache object at SOME_ADDRESS>`.
- I found that manually setting the repr on the superclass of those objects is a viable workaround.
To help you fix this as quickly as possible, I wrote a loop that checks all chat models and tells you which instances are causing the problem.
System Info

python -m langchain_core.sys_info

System Information
- OS: Linux
- OS Version: #35 ~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May 7 09:00:52 UTC 2
- Python Version: 3.11.7 (main, Jun 12 2024, 12:57:34) [GCC 11.4.0]
Package Information
langchain_core: 0.2.7
langchain: 0.2.5
langchain_community: 0.2.5
langsmith: 0.1.77
langchain_mistralai: 0.1.8
langchain_openai: 0.1.8
langchain_text_splitters: 0.2.1
Packages not installed (Not Necessarily a Problem)
The following packages were not found:
langgraph
langserve
6 comments

unhi4e5o1#
Reopening after reading this a bit more carefully. If you share a minimal example in the future, it's best to share the example itself, and then provide any utility code to identify more cases separately. The utility code uses a number of tricks (e.g., `exec`) -- which makes it look like spam on first read.

tyg4sfes2#
I am writing a minimal example.
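As an aside, the `exec`-based import loop criticized in the first comment can be avoided with `importlib`. A sketch, demonstrated on a stdlib module; the same pattern would apply to `langchain_community.chat_models` and the model names in the script:

```python
import importlib


def import_by_name(module_name: str, attr: str):
    """Replacement for exec(f"from {module_name} import {attr}"):
    import the module, then fetch the attribute with getattr."""
    return getattr(importlib.import_module(module_name), attr)


# Demonstrated on the standard library:
OrderedDict = import_by_name("collections", "OrderedDict")
print(OrderedDict.__name__)  # -> OrderedDict
```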
brgchamk3#
Hello @thiswillbeyourgithub, thank you!
I've confirmed it locally, so we're all set :)
5vf7fwbs4#
Okay, sorry for such an extensive reproduction: at first I demonstrated this on ChatOpenAI, then noticed that the issue is very widespread (affecting at least 7 chat models), and that it is not only related to the cache but sometimes to other attributes too, e.g. for Anthropic as you saw.
rdrgkggo5#
So my original code is a bit clunky, but it makes it quick to see which attributes of which model are causing the problem.
w1jd8yoj6#
The problem is here:
langchain/libs/core/langchain_core/language_models/chat_models.py, line 393 (commit 61daa16):

    llm_string=dumps(self)

It can be affected by any other helper object (e.g., a client).
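The `dumps(self)` call above can fold any non-serializable helper object into the key via its default repr. A minimal sketch of that failure mode, using a hypothetical serializer (not langchain's actual `dumps`) that falls back to `repr()` for unknown objects:

```python
import json


class Client:
    """Hypothetical non-serializable helper object (e.g., an API client)."""
    pass


def dumps_with_fallback(obj) -> str:
    # json.dumps calls `default` for values it cannot serialize;
    # falling back to repr() leaks the instance's memory address.
    return json.dumps(obj, default=repr)


payload1 = {"model": "gpt-x", "client": Client()}
payload2 = {"model": "gpt-x", "client": Client()}
s1 = dumps_with_fallback(payload1)
s2 = dumps_with_fallback(payload2)

# The memory address in the fallback repr differs per instance, so two
# otherwise-identical payloads serialize differently -> cache misses.
assert s1 != s2
assert " object at " in s1
```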