What happened?
I was trying to load the phi-3 mini July update model as usual, but it gives me the following error:
llama_model_load: error loading model: error loading model hyperparameters: key not found in model: phi3.attention.sliding_window
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '.\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf'
main: error: unable to load model
Also, phi-2 and the original phi-3 models still work fine! Notably, I also downloaded the latest version of LM Studio, and it fails to run this same model as well, throwing the same error.
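For reference, whether the key is actually absent from the file can be checked against the GGUF metadata with the gguf Python package that ships with llama.cpp (gguf-py). A minimal sketch, assuming pip install gguf and substituting your own model path:

# Check whether the GGUF actually contains the key the loader is asking for.
from gguf import GGUFReader

MODEL = r".\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf"
KEY = "phi3.attention.sliding_window"

reader = GGUFReader(MODEL)  # parses the GGUF header and the key-value metadata section
if KEY in reader.fields:
    print(KEY, "is present")
else:
    print(KEY, "is missing; phi3.* keys found:")
    for name in reader.fields:
        if name.startswith("phi3."):
            print("  " + name)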
Name and Version
PS F:\ai3> .\llama.cpp\build\bin\Release\llama-cli.exe --version
version: 3505 (b72c20b8)
built with MSVC 19.40.33811.0 for x64
What operating system are you seeing the problem on?
Windows
Relevant log output
PS F:\ai3> .\llama.cpp\build\bin\Release\llama-cli.exe -m .\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf -if -p "hello"
Log start
main: build = 3505 (b72c20b8)
main: built with MSVC 19.40.33811.0 for x64
main: seed = 1722688170
llama_model_loader: loaded meta data with 30 key-value pairs and 195 tensors from .\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = phi3
llama_model_loader: - kv 1: general.name str = Phi3
llama_model_loader: - kv 2: phi3.context_length u32 = 4096
llama_model_loader: - kv 3: phi3.rope.scaling.original_context_length u32 = 4096
llama_model_loader: - kv 4: phi3.embedding_length u32 = 3072
llama_model_loader: - kv 5: phi3.feed_forward_length u32 = 8192
llama_model_loader: - kv 6: phi3.block_count u32 = 32
llama_model_loader: - kv 7: phi3.attention.head_count u32 = 32
llama_model_loader: - kv 8: phi3.attention.head_count_kv u32 = 32
llama_model_loader: - kv 9: phi3.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: phi3.rope.dimension_count u32 = 96
llama_model_loader: - kv 11: phi3.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 12: general.file_type u32 = 7
llama_model_loader: - kv 13: tokenizer.ggml.model str = llama
llama_model_loader: - kv 14: tokenizer.ggml.pre str = default
llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,32064] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv 16: tokenizer.ggml.scores arr[f32,32064] = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,32064] = [3, 3, 4, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 = 32000
llama_model_loader: - kv 20: tokenizer.ggml.unknown_token_id u32 = 0
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 32000
llama_model_loader: - kv 22: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 23: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 24: tokenizer.chat_template str = {% for message in messages %}{% if me...
llama_model_loader: - kv 25: general.quantization_version u32 = 2
llama_model_loader: - kv 26: quantize.imatrix.file str = /models/Phi-3.1-mini-4k-instruct-GGUF...
llama_model_loader: - kv 27: quantize.imatrix.dataset str = /training_data/calibration_datav3.txt
llama_model_loader: - kv 28: quantize.imatrix.entries_count i32 = 128
llama_model_loader: - kv 29: quantize.imatrix.chunks_count i32 = 151
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type f16: 2 tensors
llama_model_loader: - type q8_0: 128 tensors
llama_model_load: error loading model: error loading model hyperparameters: key not found in model: phi3.attention.sliding_window
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '.\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf'
main: error: unable to load model
PS F:\ai3> .\llama.cpp\build\bin\Release\llama-cli.exe --version
version: 3505 (b72c20b8)
built with MSVC 19.40.33811.0 for x64
PS F:\ai3>
1 Answer
tmb3ates #1
Possibly related to #8627.
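If that is the cause, one quick way to confirm it is to diff the metadata keys of this quant against an older Phi-3 GGUF that still loads, again with gguf-py. A rough sketch; the WORKING path below is a placeholder for whichever of your older files still works:

# Compare the metadata key sets of a GGUF that loads and one that fails,
# to see which hyperparameter keys differ between the two conversions.
from gguf import GGUFReader

WORKING = r".\models\me\old-phi-3\model.gguf"  # placeholder: an older Phi-3 GGUF that loads
BROKEN = r".\models\me\phi-3-mini-4k-it-July-5\Phi-3.1-mini-4k-instruct-Q8_0_L.gguf"

working_keys = set(GGUFReader(WORKING).fields)
broken_keys = set(GGUFReader(BROKEN).fields)

print("only in the working file:", sorted(working_keys - broken_keys))
print("only in the failing file:", sorted(broken_keys - working_keys))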