Problem description
After following the documented setup steps, the web UI comes up, but asking a question produces "An error occurred during streaming".
Steps to reproduce
- Type "你好" (hello) into the input box.
- The error "An error occurred during streaming" appears.
Expected result
A reply to "你好" should be returned.
Actual result
The error "An error occurred during streaming" is returned instead.
Environment
- langchain-ChatGLM version / commit: 3.1.2
- Deployed with Docker (yes/no): no
- Model: local ChatGLM3-6B
- Embedding model: bge-large-zh
- Vector store: faiss
- OS and version: Linux
- Python version: 3.11
Additional information
The log file contains: | ERROR | chatchat.server.api_server.openai_routes:generator:105 - openai request error: An error occurred during streaming
5 answers
whlutmcx1#
When asking a question against the local knowledge base, I get this error: | ERROR | chatchat.server.utils:wrap_done:46 - APIError: Caught exception during streaming: an error occurred
o2gm4chl2#
Set log_verbose to true in basic_settings.yaml and check the more detailed error output.
rpppsulh3#
2024-07-22 01:00:58.319 | INFO | chatchat.startup:run_api_server:55 - Api MODEL_PLATFORMS: [PlatformConfig(platform_name='xinference', platform_type='xinference', api_base_url='http://127.0.0.1:9997/v1', api_key='EMPTY', api_proxy='', api_concurrencies=5, auto_detect_model=True, llm_models=['chatglm3_6b'], embed_models=['bge_large_zh'], text2image_models=[], image2text_models=[], rerank_models=[], speech2text_models=[], text2speech_models=[]), PlatformConfig(platform_name='ollama', platform_type='ollama', api_base_url='http://127.0.0.1:11434/v1', api_key='EMPTY', api_proxy='', api_concurrencies=5, auto_detect_model=False, llm_models=['qwen:7b', 'qwen2:7b'], embed_models=['quentinz/bge-large-zh-v1.5'], text2image_models=[], image2text_models=[], rerank_models=[], speech2text_models=[], text2speech_models=[]), PlatformConfig(platform_name='oneapi', platform_type='oneapi', api_base_url='http://127.0.0.1:3000/v1', api_key='sk-', api_proxy='', api_concurrencies=5, auto_detect_model=False, llm_models=['chatglm_pro', 'chatglm_turbo', 'chatglm_std', 'chatglm_lite', 'qwen-turbo', 'qwen-plus', 'qwen-max', 'qwen-max-longcontext', 'ERNIE-Bot', 'ERNIE-Bot-turbo', 'ERNIE-Bot-4', 'SparkDesk'], embed_models=['text-embedding-v1', 'Embedding-V1'], text2image_models=[], image2text_models=[], rerank_models=[], speech2text_models=[], text2speech_models=[]), PlatformConfig(platform_name='openai', platform_type='openai', api_base_url='https://api.openai.com/v1', api_key='sk-proj-', api_proxy='', api_concurrencies=5, auto_detect_model=False, llm_models=['gpt-4o', 'gpt-3.5-turbo'], embed_models=['text-embedding-3-small', 'text-embedding-3-large'], text2image_models=[], image2text_models=[], rerank_models=[], speech2text_models=[], text2speech_models=[])]
2024-07-22 01:01:54.689 | ERROR | chatchat.server.api_server.openai_routes:generator:105 - openai request error: An error occurred during streaming
2024-07-22 00:57:52,416 transformers.configuration_utils 242110 INFO loading configuration file /home/lrn/ALL/Langchain-Chatchat/models/chatglm3-6b/config.json
2024-07-22 00:57:52,417 transformers.configuration_utils 242110 INFO Model config ChatGLMConfig {
"_name_or_path": "/home/lrn/ALL/Langchain-Chatchat/models/chatglm3-6b",
"add_bias_linear": false,
"add_qkv_bias": true,
"apply_query_key_layer_scaling": true,
"apply_residual_connection_post_layernorm": false,
"architectures": [
"ChatGLMModel"
],
"attention_dropout": 0.0,
"attention_softmax_in_fp32": true,
"auto_map": {
"AutoConfig": "configuration_chatglm.ChatGLMConfig",
"AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
},
"bias_dropout_fusion": true,
"classifier_dropout": null,
"eos_token_id": 2,
"ffn_hidden_size": 13696,
"fp32_residual_connection": false,
"hidden_dropout": 0.0,
"hidden_size": 4096,
"kv_channels": 128,
"layernorm_epsilon": 1e-05,
"model_type": "chatglm",
"multi_query_attention": true,
"multi_query_group_num": 2,
"num_attention_heads": 32,
"num_layers": 28,
"original_rope": true,
"pad_token_id": 0,
"padded_vocab_size": 65024,
"post_layer_norm": true,
"pre_seq_len": null,
"prefix_projection": false,
"quantization_bit": 0,
"rmsnorm": true,
"seq_length": 8192,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.42.4",
"use_cache": true,
"vocab_size": 65024
}
2024-07-22 00:58:16,916 transformers.modeling_utils 242537 INFO All the weights of BertModel were initialized from the model checkpoint at /home/lrn/ALL/Langchain-Chatchat/models/bge-large-zh. If your task is similar to the task the model of the checkpoint was trained on, you can already use BertModel for predictions without further training.
2024-07-22 00:58:16,916 transformers.dynamic_module_utils 242537 INFO Patched resolved_trust_remote_code: (False, '/home/lrn/ALL/Langchain-Chatchat/models/bge-large-zh', True, False) {}
2024-07-22 00:58:16,916 transformers.tokenization_utils_base 242537 INFO loading file vocab.txt
2024-07-22 00:58:16,916 transformers.tokenization_utils_base 242537 INFO loading file tokenizer.json
2024-07-22 00:58:16,916 transformers.tokenization_utils_base 242537 INFO loading file added_tokens.json
2024-07-22 00:58:16,916 transformers.tokenization_utils_base 242537 INFO loading file special_tokens_map.json
2024-07-22 00:58:16,916 transformers.tokenization_utils_base 242537 INFO loading file tokenizer_config.json
--- Logging error ---
Traceback (most recent call last):
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/logging/handlers.py", line 73, in emit
if self.shouldRollover(record):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/logging/handlers.py", line 196, in shouldRollover
msg = "%s\n" % self.format(record)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/logging/init.py", line 953, in format
return fmt.format(record)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/logging/init.py", line 687, in format
record.message = record.getMessage()
TypeError: not all arguments converted during string formatting
Call stack:
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/threading.py", line 995, in _bootstrap
self._bootstrap_inner()
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/concurrent/futures/thread.py", line 83, in _work_guard
work_item.run()
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/api.py", line 402, in _wrapper
return next(_gen)
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 318, in _to_json_generator
for v in gen:
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/utils.py, in _to_chat_completion_chunks
for i, chunk in enumerate(chunks):
File "/home/lrn/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model//llm//pytorch//chatglm.py", line 259, in _stream_generator
for chunk_text, _ in self.model.stream_chat(self.input_ids, past=past_key_values)
File "/home/lrn
This error occurs because GenerationMixin._get_logits_warper() is called without the required positional argument device. To fix it, pass a device argument in that call.
Solution: when calling GenerationMixin._get_logits_warper(), pass the device argument, for example as in the sketch below.
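A hedged illustration (not taken verbatim from the thread): the failing call is assumed to live in the modeling_chatglm.py shipped with the local ChatGLM3-6B checkpoint, and the exact method and line number may differ in your copy. With the transformers 4.42.4 shown in the config above, _get_logits_warper() requires a device argument, so the old one-argument call raises the error.

```python
# Hedged sketch of the suggested patch to the local modeling_chatglm.py
# (the file path and surrounding code are assumptions based on the log above).

# Old call -- valid on older transformers, fails on 4.42.x because
# GenerationMixin._get_logits_warper() now requires `device`:
#   logits_warper = self._get_logits_warper(generation_config)

# Patched call -- pass the device the input tensors live on:
logits_warper = self._get_logits_warper(generation_config, device=input_ids.device)
```

Alternatively, pinning transformers to an older release that still accepts the single-argument call avoids editing the model code; check the transformers changelog for the exact version where the signature changed.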
cetgtptt4#
I switched to another local model, glm-4v-9b, and asking questions still fails.
qnyhuwrf5#
This is a model-loading problem; you could try switching to qwen or ollama instead.