ollama run renxin_query_type_classify "hello"
![](//img.saoniuhuo.com/images/202407/72501721624584467.jpg)
I successfully created a model from a GGUF file. However, it just keeps loading and never completes. Any ideas what is happening?

```
FROM ./model_query_type_classify.gguf
PARAMETER temperature 0
SYSTEM """
对用户类型进行分类
"""
```
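For context, a model like this is normally built and launched with the standard Ollama CLI commands. This is only a sketch of the reporter's workflow: the model name comes from the thread, and it assumes the Modelfile above is saved as `Modelfile` in the current directory next to the GGUF file.

```shell
# Build the model from the Modelfile (requires the Ollama server to be running)
ollama create renxin_query_type_classify -f Modelfile

# Run it with a prompt; this is the step that hangs for the reporter
ollama run renxin_query_type_classify "hello"
```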
3 answers

vpfxa7rd1#
Looking at the logs, no error records were found.

```
2024-03-01 01:30:28.171 [INFO] [gpu.go:146] CUDA compute capability detected: 8.6
2024-03-01 01:30:28.171 [INFO] [cpu_common.go:11] CPU has AVX2
Loading library /tmp/ollama3442133364/cuda_v11/libext_server.so
2024-03-01 01:30:28.171 [INFO] [dyn_ext_server.go:90] Loading dynamic llm server: /tmp/ollama3442133364/cuda_v11/libext_server.so
2024-03-01 01:30:28.171 [INFO] [dyn_ext_server.go:150] Initializing llama server
llama_model_loader: loaded 24 key-value pairs and 254 tensors from /root/.ollama/models/blobs/sha256:456402914e838a953e0cf80caa6adbe75383d9e63584a964f504a7bbb8f7aad9 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = gemma
llama_model_loader: - kv 1: general.name str = gemma-7b-it
llama_model_loader: - kv 2: gemma.context_length u32 = 8192
llama_model_loader: - kv 3: gemma.embedding_length u32 = 3072
llama_model_loader: - kv 4: gemma.block_count u32 = 28
llama_model_loader: - kv 5: gemma.feed_forward_length u32 = 24576
llama_model_loader: - kv 6: gemma.attention.head_count u32 = 16
llama_model_loader: - kv 7: gemma.attention.head_count_kv u32 = 16
llama_model_loader: - kv 8: gemma.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 9: gemma.attention.key_length u32 = 256
llama_model_loader: - kv 10: gemma.attention.value_length u32 = 256
llama_model_loader: - kv 11: tokenizer.ggml.model str = llama
llama_model_loader: - kv 12: tokenizer.ggml.tokens arr[str,256000] = ["", "", "", "", ...
llama_model_loader: - kv 13: tokenizer.ggml.scores arr[f32,256000] = [0.000000, 0.000000, 0.000000, ...
```
watbbzwu2#
By the way, here is my Modelfile:
```
FROM ./model_query_type_classify.gguf
PARAMETER temperature 0
SYSTEM """
对用户输入进行分类
"""
```
nimxete23#
Hi, sorry this is happening. Could you check the logs for any errors?
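For anyone else hitting this: where the server log lives depends on how Ollama was installed. The commands below show the default locations from the Ollama troubleshooting docs; adjust if your install differs.

```shell
# Linux (systemd service install): dump the server log
journalctl -u ollama --no-pager

# macOS: the server writes its log under ~/.ollama
cat ~/.ollama/logs/server.log
```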