langchain: HuggingFacePipeline with a local model incorrectly checks available inference endpoints

j8yoct9x · posted 3 months ago in Other

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_huggingface import (
    ChatHuggingFace,
    HuggingFacePipeline,
)

chat_llm = ChatHuggingFace(
    llm=HuggingFacePipeline.from_model_id(
        model_id="path/to/your/local/model",  # I downloaded Meta-Llama-3-8B
        task="text-generation",
        device_map="auto",
        model_kwargs={"temperature": 0.0, "local_files_only": True},
    )
)

Error Message and Stack Trace (if applicable)

src/resources/predictor.py:55: in load
    self.llm = ChatHuggingFace(
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/langchain_huggingface/chat_models/huggingface.py:169: in __init__
    self._resolve_model_id()
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/langchain_huggingface/chat_models/huggingface.py:295: in _resolve_model_id
    available_endpoints = list_inference_endpoints("*")
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/hf_api.py:7081: in list_inference_endpoints
    user = self.whoami(token=token)
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:114: in _inner_fn
    return fn(*args, **kwargs)
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/hf_api.py:1390: in whoami
    headers=self._build_hf_headers(
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/hf_api.py:8448: in _build_hf_headers
    return build_hf_headers(
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:114: in _inner_fn
    return fn(*args, **kwargs)
/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py:124: in build_hf_headers
    token_to_send = get_token_to_send(token)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

token = True

    def get_token_to_send(token: Optional[Union[bool, str]]) -> Optional[str]:
        """Select the token to send from either `token` or the cache."""
        # Case token is explicitly provided
        if isinstance(token, str):
            return token

        # Case token is explicitly forbidden
        if token is False:
            return None

        # Token is not provided: we get it from local cache
        cached_token = get_token()

        # Case token is explicitly required
        if token is True:
            if cached_token is None:
>               raise LocalTokenNotFoundError(
                    "Token is required (`token=True`), but no token found. You"
                    " need to provide a token or be logged in to Hugging Face with"
                    " `huggingface-cli login` or `huggingface_hub.login`. See"
                    " https://huggingface.co/settings/tokens."
                )
E               huggingface_hub.errors.LocalTokenNotFoundError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.

/opt/poetry-cache/virtualenvs/sagacify-example-llm-8EXZSVYp-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py:158: LocalTokenNotFoundError

Description

I am trying to instantiate a ChatHuggingFace object with the langchain-huggingface library, passing as the llm argument an HuggingFacePipeline that points to a locally downloaded model (here, Meta-Llama-3-8B).
Since I am using a local model, I have not set any Hugging Face token in my environment, and I expect the instantiation to work anyway.
However, it fails because it tries to read a token in order to list the inference endpoints available under my Hugging Face account.
After investigating, I believe the offending line, currently at line 443 of langchain/libs/partners/huggingface/langchain_huggingface/chat_models/huggingface.py, should instead be at line 456, as annotated below.

def _resolve_model_id(self) -> None:
    """Resolve the model_id from the LLM's inference_server_url"""

    from huggingface_hub import list_inference_endpoints  # type: ignore[import]

    available_endpoints = list_inference_endpoints("*")  # Line 443: this call is not in the right place
    if _is_huggingface_hub(self.llm) or (
        hasattr(self.llm, "repo_id") and self.llm.repo_id
    ):
        self.model_id = self.llm.repo_id
        return
    elif _is_huggingface_textgen_inference(self.llm):
        endpoint_url: Optional[str] = self.llm.inference_server_url
    elif _is_huggingface_pipeline(self.llm):
        self.model_id = self.llm.model_id
        return                      # My code takes this branch, which never uses the available endpoints
    else:
        endpoint_url = self.llm.endpoint_url
                                    # Line 456: the call should be here instead
    for endpoint in available_endpoints:
        if endpoint.url == endpoint_url:
            self.model_id = endpoint.repository

    if not self.model_id:
        raise ValueError(
            "Failed to resolve model_id:"
            f"Could not find model id for inference server: {endpoint_url}"
            "Make sure that your Hugging Face token has access to the endpoint."
        )
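
Until a release containing the fix is available, one possible workaround (my sketch, not part of the original report) is to expose any valid Hugging Face token to huggingface_hub, so the premature list_inference_endpoints call can authenticate even though its result is never used for a local pipeline:

import os

# Workaround sketch: huggingface_hub picks up the HF_TOKEN environment variable,
# which lets the premature list_inference_endpoints("*") call authenticate.
# "hf_..." is a placeholder for a real token from https://huggingface.co/settings/tokens.
os.environ["HF_TOKEN"] = "hf_..."

from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

chat_llm = ChatHuggingFace(
    llm=HuggingFacePipeline.from_model_id(
        model_id="path/to/your/local/model",
        task="text-generation",
        device_map="auto",
    )
)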

System Info

huggingface-hub           0.23.2          Client library to download and publish models, datasets and other repos on the huggingface.co hub
langchain                 0.2.1           Building applications with LLMs through composability
langchain-core            0.2.2           Building applications with LLMs through composability
langchain-huggingface     0.0.3           An integration package connecting Hugging Face and LangChain
langchain-text-splitters  0.2.0           LangChain text splitting utilities
sentence-transformers     3.0.0           Multilingual text embeddings
tokenizers                0.19.1
transformers              4.41.2          State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

Platform: linux
Python: Python 3.10.12

hjqgdpho1#

@c-pletinckx Fixed!

qyzbxkaa2#

@keenborder786 Cool! Would it be possible to merge that pull request and release a new version of langchain-huggingface with the fix?

vulvrdjw3#

Same issue here. I am still getting this error.

eufgjt7s4#

Same issue here. I am still getting this error with the following code:

from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TextStreamer,
    pipeline,
)

# `name` and `auth_token` are defined earlier in the poster's script.
tokenizer = AutoTokenizer.from_pretrained(name, use_auth_token=auth_token)

# 8-bit quantization config.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model_config = AutoConfig.from_pretrained(
    name,
    use_auth_token=auth_token,
    temperature=0.2,
    top_p=0.95,
    repetition_penalty=1.15,
    max_new_tokens=1200,
)

model = AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,
    config=model_config,
    quantization_config=bnb_config,
    device_map="auto",
    use_auth_token=auth_token,
)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=4096,
    device_map="auto",
    streamer=streamer,
)
llm = HuggingFacePipeline(pipeline=pipe)

chat_model = ChatHuggingFace(llm=llm)  # raises LocalTokenNotFoundError here

46scxncf5#

I don't think line 320 should run when using a local llm.

def __init__(self, **kwargs: Any):
    super().__init__(**kwargs)

    from transformers import AutoTokenizer  # type: ignore[import]

    self._resolve_model_id()  # Line 320

    self.tokenizer = (
        AutoTokenizer.from_pretrained(self.model_id)
        if self.tokenizer is None
        else self.tokenizer
    )
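
For illustration, a guard along those lines might look like the sketch below. This is not the actual library code, just one way to express the suggestion, reusing the module's _is_huggingface_pipeline helper:

def __init__(self, **kwargs: Any):
    super().__init__(**kwargs)

    from transformers import AutoTokenizer  # type: ignore[import]

    if _is_huggingface_pipeline(self.llm):
        # A local pipeline already carries its model_id, so no endpoint
        # lookup (and therefore no token) is needed.
        self.model_id = self.llm.model_id
    else:
        self._resolve_model_id()

    self.tokenizer = (
        AutoTokenizer.from_pretrained(self.model_id)
        if self.tokenizer is None
        else self.tokenizer
    )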

lf5gs5x2

lf5gs5x26#

I am running into the same problem. I downloaded the Llama3-8b-Instruct model, but ChatHuggingFace still checks for and demands a token. Here is how I am using it:

import torch
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# `model_path` points at the locally downloaded Llama3-8b-Instruct model.
tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
    temperature=0.6,
    device_map="auto",
)
model_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
llm = HuggingFacePipeline(pipeline=model_pipeline)
model = ChatHuggingFace(llm=llm)  # asks for a token here

I am using langchain-huggingface v0.0.2. From the posts above it sounds like this has been fixed, but I don't seem to have that fix. Please advise for this case.
Thanks!
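
One thing worth checking first (my suggestion, not from the thread): confirm which version is actually installed in the active environment, since the fix only takes effect once you are on a release that contains it:

# Print the installed langchain-huggingface version from the active environment.
from importlib.metadata import version

print(version("langchain-huggingface"))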

cig3rfwq7#

Please modify _resolve_model_id() in langchain_huggingface/chat_models/huggingface.py as shown below, so that local models from Hugging Face can be loaded and used without a token.
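
The code block was missing from the original comment. Based on the annotations in the issue description above, the proposed reordering (deferring the endpoint listing until an endpoint_url actually needs to be resolved) would look roughly like this:

def _resolve_model_id(self) -> None:
    """Resolve the model_id from the LLM's inference_server_url"""
    if _is_huggingface_hub(self.llm) or (
        hasattr(self.llm, "repo_id") and self.llm.repo_id
    ):
        self.model_id = self.llm.repo_id
        return
    elif _is_huggingface_textgen_inference(self.llm):
        endpoint_url: Optional[str] = self.llm.inference_server_url
    elif _is_huggingface_pipeline(self.llm):
        # Local pipeline: resolve directly, without any token or endpoint listing.
        self.model_id = self.llm.model_id
        return
    else:
        endpoint_url = self.llm.endpoint_url

    # Only now are Hugging Face credentials actually needed.
    from huggingface_hub import list_inference_endpoints  # type: ignore[import]

    available_endpoints = list_inference_endpoints("*")
    for endpoint in available_endpoints:
        if endpoint.url == endpoint_url:
            self.model_id = endpoint.repository

    if not self.model_id:
        raise ValueError(
            "Failed to resolve model_id: "
            f"Could not find model id for inference server: {endpoint_url} "
            "Make sure that your Hugging Face token has access to the endpoint."
        )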
