langchain: strange output when summarizing long text with a local Llama-3 model via LlamaCpp

tvokkenx · posted 2 months ago in Mac

Checked other resources

  • I added a very descriptive title to this question.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

I am trying to use a local Llama-3 8B Instruct model (GGUF version) to perform a simple text-summarization task and return the result in JSON format, running on CPU only. The code is as follows:

from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate

# Create the prompt
template = """Read the article and return the "release date of Llama-3" in JSON format.
If the information is not mentioned, please do not return any answer.
Article: {text}
Answer:"""

# Text for summarization (from https://en.wikipedia.org/wiki/Llama_(language_model))
text = """
Llama (acronym for Large Language Model Meta AI, and formerly stylized as LLaMA) is a family of autoregressive large language models (LLMs) released by Meta AI starting in February 2023. The latest version is Llama 3, released in April 2024.

Model weights for the first version of Llama were made available to the research community under a non-commercial license, and access was granted on a case-by-case basis. Unauthorized copies of the model were shared via BitTorrent. In response, Meta AI issued DMCA takedown requests against repositories sharing the link on GitHub. Subsequent versions of Llama were made accessible outside academia and released under licenses that permitted some commercial use. Llama models are trained at different parameter sizes, typically ranging between 7B and 70B. Originally, Llama was only available as a foundation model. Starting with Llama 2, Meta AI started releasing instruction fine-tuned versions alongside foundation models.

Alongside the release of Llama 3, Meta added virtual assistant features to Facebook and WhatsApp in select regions, and a standalone website. Both services use a Llama 3 model.
"""

# Set up and run Local Llama-3 model
prompt = PromptTemplate(template=template, input_variables=["text"])
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llm = LlamaCpp(
    model_path="model/llama/Meta-Llama-3-8B-Instruct.Q6_K.gguf",
    n_ctx=2048,
    callback_manager=callback_manager,
    verbose=True,
)
chain = prompt | llm
# With a single input variable, the raw string is coerced to {"text": text}
chain.invoke(text)

Error Message and Stack Trace (if applicable)

No response

Description

With this code the model runs successfully and the output is good:

{
  "release_date": "April 2024"
}

However, if I feed in more text (adding more paragraphs from the same page, https://en.wikipedia.org/wiki/Llama_(language_model) ), the output gets much worse and the model keeps generating text like this:

The release notes for LLaMA model can be found on the official website, Meta AI.  Release notes are typically available after you read the answer.
LLaMA.   If you cannot
    it as is in.  Read More
    LLaMA is a "Release.  Release note the "Read the article.

# Release note the "read in.  Read more and more, Read the  Release on "read a "a
      Release in "Read the "Release
.
.
.

Is there any solution for summarizing long texts with a local Llama-3 model?
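One plausible cause to rule out first (an assumption, not something confirmed in the thread) is that the longer article pushes the rendered prompt close to or past the n_ctx=2048 window. A quick check, reusing the prompt, llm, and text objects from the example code above:

# Sketch: count the tokens of the exact string sent to the model
rendered = prompt.format(text=text)
print(llm.get_num_tokens(rendered), "prompt tokens vs. n_ctx = 2048")

If the count is near or above 2048, a larger n_ctx or splitting the text before summarizing is the usual fix.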

System Info

langchain==0.2.10
langchain_community==0.2.9
langchain_core==0.2.22
Python version 3.10.12

jslywgbw #1

llm = LlamaCpp(model_path=..., n_ctx=4048, callback_manager=callback_manager,
               verbose=True, temperature=0.01, rope_freq_base=0.0, rope_freq_scale=0.0)

Does adding rope_freq_base and rope_freq_scale help?
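If the article still does not fit even with a larger n_ctx, another option is to split the text and summarize it chunk by chunk with a map-reduce chain. A rough sketch using LangChain's stock load_summarize_chain and RecursiveCharacterTextSplitter utilities (the chunk sizes are untuned guesses):

from langchain.chains.summarize import load_summarize_chain
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the article into overlapping chunks that each fit well inside n_ctx
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
docs = splitter.create_documents([text])

# Summarize each chunk, then combine the partial summaries in a final pass
summarize = load_summarize_chain(llm, chain_type="map_reduce")
print(summarize.invoke({"input_documents": docs})["output_text"])

Because map_reduce summarizes each chunk independently before combining the results, no single prompt has to hold the whole article.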
