langchain ChatHuggingFace always returns 100 tokens as the response, regardless of the max_new_tokens parameter

Posted by mpgws1up 5 months ago in Other


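Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).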

Example code

from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
from langchain_community.callbacks import get_openai_callback

# repo_id, HUGGINGFACE_API_KEY, and prompt are defined elsewhere (not shown in the report).
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    temperature=0.01,
    max_new_tokens=2048,  # requested limit; the response still stops at 100 tokens
    huggingfacehub_api_token=HUGGINGFACE_API_KEY,
)
# Wrap the endpoint in the chat interface.
llm = ChatHuggingFace(llm=llm)
messages = [
    ("system", "You are a smart AI that understand the tabular data structure."),
    ("user", f"{prompt}"),
]
# The callback is used here only to print token usage.
with get_openai_callback() as cb:
    response = llm.invoke(messages)
    print(cb)
    if not isinstance(response, str):
        response = response.content
print(response)

Tokens Used: 1668
Prompt Tokens: 1568
Completion Tokens: 100
Successful Requests: 1


Source: https://github.com/langchain-ai/langchain/issues/25219

5 Answers


rhfm7lfc #1

I'm running into a similar issue with the Llama-3.1-8B-Instruct model. Is there a way to raise the response token limit above 100? @npn-zakipoint

5 months ago

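vh0rcniy #2

@AnandUgale, I couldn't find any way to get more than 100 tokens. You can work around it by calling the LLM through HuggingFaceEndpoint directly instead of going through ChatHuggingFace. However, I found that it often hallucinates, possibly because ChatHuggingFace uses the instruction-tuned model rather than the base model.

5 months ago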
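The workaround vh0rcniy describes could look roughly like the sketch below. The helper names flatten_messages and generate_direct are hypothetical, and joining the messages as plain "role: content" lines is an assumption made here for illustration. It does not apply the model's chat template, which may be part of why answers degrade when ChatHuggingFace is bypassed.

```python
def flatten_messages(messages):
    """Join (role, content) pairs into one plain-text prompt (hypothetical helper)."""
    return "\n".join(f"{role}: {content}" for role, content in messages)


def generate_direct(repo_id, api_token, messages, max_new_tokens=2048):
    """Call HuggingFaceEndpoint as a plain LLM, without the ChatHuggingFace wrapper."""
    # Imported lazily so flatten_messages can be tried without the package installed.
    from langchain_huggingface import HuggingFaceEndpoint

    llm = HuggingFaceEndpoint(
        repo_id=repo_id,
        temperature=0.01,
        max_new_tokens=max_new_tokens,
        huggingfacehub_api_token=api_token,
    )
    # A plain LLM takes a single string prompt rather than a list of messages.
    return llm.invoke(flatten_messages(messages))


example = [("system", "You are a helpful assistant."), ("user", "Hello")]
print(flatten_messages(example))
```

With a valid repo_id and API token, generate_direct(repo_id, api_token, example) would send the flattened prompt to the endpoint; that call needs network access and is not exercised here.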

apeeds0o #3

I ran into a similar problem once. Try passing max_tokens again to ChatHuggingFace. In my experience, it may override the max_tokens you passed to HuggingFaceEndpoint. I only have experience with HuggingFacePipeline, though. Hope it works!

5 months ago
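One way to read apeeds0o's suggestion is to pass max_tokens as a per-call keyword argument to invoke. The sketch below shows only the forwarding; RecordingChatModel is a hypothetical offline stand-in, and whether a real ChatHuggingFace instance honors the argument depends on the installed langchain-huggingface version, which is an assumption to verify rather than a confirmed fix.

```python
class RecordingChatModel:
    """Hypothetical offline stand-in that records the keyword arguments it receives."""

    def __init__(self):
        self.last_kwargs = None

    def invoke(self, messages, **kwargs):
        self.last_kwargs = kwargs
        return "stub response"


def invoke_with_limit(chat_model, messages, max_tokens=2048):
    """Forward max_tokens as a per-call keyword argument to chat_model.invoke."""
    return chat_model.invoke(messages, max_tokens=max_tokens)


stub = RecordingChatModel()
invoke_with_limit(stub, [("user", "hello")], max_tokens=512)
print(stub.last_kwargs)  # {'max_tokens': 512}
```

With the real objects from the question, the call would be invoke_with_limit(ChatHuggingFace(llm=llm), messages). Comparing the reported completion token count before and after is a quick check of whether the argument reaches the endpoint.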

inb24sb2 #4

Hi @Soumil32,
Thanks for the suggestion. For a locally downloaded model, it asks for a token or Hugging Face API key. I tried modifying huggingface.py, but now I get the following error:
ValueError: Cannot use apply_chat_template() because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating.

5 months ago
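The ValueError quoted by inb24sb2 says the tokenizer has no chat_template attribute set. A minimal sketch of supplying one is shown below. FALLBACK_CHAT_TEMPLATE and ensure_chat_template are hypothetical names, and the template is a generic placeholder, not the official template of any model, so output quality may suffer if it differs from the format the model was trained on.

```python
from types import SimpleNamespace

# Hypothetical generic template in Jinja syntax; NOT the official template of any model.
FALLBACK_CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}assistant: {% endif %}"
)


def ensure_chat_template(tokenizer, template=FALLBACK_CHAT_TEMPLATE):
    """Set tokenizer.chat_template only when the tokenizer does not already define one."""
    if getattr(tokenizer, "chat_template", None) is None:
        tokenizer.chat_template = template
    return tokenizer


# Offline demonstration with a stand-in object instead of a real tokenizer.
fake_tokenizer = SimpleNamespace(chat_template=None)
ensure_chat_template(fake_tokenizer)
print(fake_tokenizer.chat_template == FALLBACK_CHAT_TEMPLATE)  # True
```

With a real tokenizer loaded through transformers, the same helper could be applied before the tokenizer is handed to ChatHuggingFace. Where the model publishes its own template, using that one is likely the safer choice.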

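fslejnso #5

Hi @Soumil32,
Thanks for the suggestion. I have already run these experiments with max_tokens set on ChatHuggingFace, but that does not resolve the issue above. I still get 100 tokens in the response. The problem persists.

5 months ago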
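Since the thread ends without a confirmed fix, one diagnostic is to call the endpoint through huggingface_hub directly and see whether an explicit max_tokens changes the completion length. This is a sketch under assumptions: to_chat_dicts and chat_directly are hypothetical helper names, and the idea that the 100-token cap comes from a default applied when max_tokens is omitted is a guess, not something the thread establishes.

```python
def to_chat_dicts(messages):
    """Convert (role, content) tuples into role/content dicts for a chat-completion call."""
    return [{"role": role, "content": content} for role, content in messages]


def chat_directly(repo_id, api_token, messages, max_tokens=2048):
    """Call the chat-completion endpoint without LangChain and report how generation ended."""
    # Imported lazily so to_chat_dicts can be tried without the package installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model=repo_id, token=api_token)
    result = client.chat_completion(
        messages=to_chat_dicts(messages),
        max_tokens=max_tokens,
    )
    choice = result.choices[0]
    # A finish reason of "length" usually indicates the token limit cut the output short.
    return choice.message.content, choice.finish_reason, result.usage.completion_tokens


print(to_chat_dicts([("system", "Be brief."), ("user", "Hello")]))
```

If chat_directly returns more than 100 completion tokens for the same prompt, the limit is probably being lost between ChatHuggingFace and the client rather than being imposed by the model.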