``` Error: ollama run glm4 - CUBLAS_STATUS_NOT_INITIALIZED ```

mccptt67 posted 2 months ago in Other

What is the issue?
[root@hanadev system]# ollama run glm4
Error: llama runner process has terminated: signal: aborted (core dumped) CUDA error: CUBLAS_STATUS_NOT_INITIALIZED
current device: 0, in function cublas_handle at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda/common.cuh:826
cublasCreate_v2(&cublas_handles[device])
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:100: !"CUDA error"
NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.21

5tmbdcev1#

The Ollama Docker image v1.4.7 works fine.
GPU: Tesla V100-PCIE-32GB
Nvidia Toolkit: V12.5
I am running Ollama from the latest Docker image and hit a similar or identical problem:

/go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda/template-instances/../mmq.cuh:2422: ERROR: CUDA kernel mul_mat_q has no device code compatible with CUDA arch 700. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__
CUDA error: unspecified launch failure
  current device: 0, in function ggml_cuda_op_mul_mat at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:1606
  cudaGetLastError()
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:100: !"CUDA error"
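
For triage, it can help to pull the architecture number out of this message and read it as a compute capability. The sketch below assumes the number follows the `__CUDA_ARCH__` convention (major * 100 + minor * 10), so 700 corresponds to compute capability 7.0, which is what a Tesla V100 reports. The function names are hypothetical helpers, not part of Ollama or llama.cpp.

```python
import re

# Matches the "no device code" message printed by the CUDA backend in the log above.
ARCH_RE = re.compile(r"no device code compatible with CUDA arch (\d+)")


def cuda_arch_to_capability(arch: int) -> tuple[int, int]:
    """Convert a __CUDA_ARCH__-style value (e.g. 700) to (major, minor)."""
    return arch // 100, (arch % 100) // 10


def missing_arch(log_text: str):
    """Return the (major, minor) capability the log says has no device code, or None."""
    match = ARCH_RE.search(log_text)
    if match is None:
        return None
    return cuda_arch_to_capability(int(match.group(1)))


log = "ERROR: CUDA kernel mul_mat_q has no device code compatible with CUDA arch 700."
print(missing_arch(log))
```

If the reported capability matches the GPU in the machine, the message suggests the binary was built without kernels for that GPU generation, rather than pointing at a driver or memory problem.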

hi3rlvi22#

I recently ran into the same error.
My Ollama version is 0.2.7.
When I try to run deepseek-v2:236b, it simply fails with a core dump.
Error: llama runner process has terminated: signal: aborted (core dumped) CUDA error: CUBLAS_STATUS_NOT_INITIALIZED
current device: 0, in function cublas_handle at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda/common.cuh:826
cublasCreate_v2(&cublas_handles[device])
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:100: !"CUDA error"
If I run a smaller model from the same vendor, such as deepseek-v2:16b, there is no problem.
OS: Ubuntu 22.04 LTS
GPU: Nvidia

# nvidia-smi
Sun Jul 21 07:39:28 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:04:00.0 Off |                  N/A |
|  0%   42C    P8             19W /  350W |      13MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:84:00.0 Off |                  N/A |
|  0%   38C    P8             22W /  350W |      13MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2276      G   /usr/lib/xorg/Xorg                              4MiB |
|    1   N/A  N/A      2276      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+
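
As a rough sanity check on the numbers in this output, the sketch below sums the total memory column of the two GPU rows and compares it with a back-of-the-envelope size for a 236B-parameter model. The 4 bits per weight figure is an assumption for illustration (the thread does not state the quantization), and the estimate covers weights only, not the KV cache or runtime buffers. The helper names are hypothetical.

```python
import re

# Matches the "used / total" memory cell of a GPU row, e.g. "13MiB /  24576MiB".
MEMORY_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def total_vram_mib(nvidia_smi_text: str) -> int:
    """Sum the total memory (MiB) of every GPU row in nvidia-smi's table output."""
    return sum(int(total) for _used, total in MEMORY_RE.findall(nvidia_smi_text))


def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone; ignores KV cache and runtime buffers."""
    return n_params * bits_per_weight / 8 / 2**30


# The two memory rows copied from the nvidia-smi output above.
sample = """
|  0%   42C    P8             19W /  350W |      13MiB /  24576MiB |      0%      Default |
|  0%   38C    P8             22W /  350W |      13MiB /  24576MiB |      0%      Default |
"""

vram_gib = total_vram_mib(sample) / 1024
needed_gib = weight_size_gib(236e9, 4)  # 4 bits per weight is an assumption
print(f"VRAM: {vram_gib:.0f} GiB, 4-bit weights: ~{needed_gib:.0f} GiB")
```

Under that assumption the weights come to roughly 110 GiB against 48 GiB of VRAM, so only part of the model could sit on the GPUs. This does not by itself explain the crash, but it is consistent with the smaller 16b model loading without trouble.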

d5vmydt93#

This was working fine a few weeks ago:
ollama run deepseek-v2:236b

okxuctiv4#

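@SunMacArenas, can you share more information about your setup? I cannot reproduce the failure; for me, glm4 loads correctly on both 0.2.1 and the latest 0.2.8. How much VRAM do you have? Can you share the server logs?

@harrytong, can you share the ollama ps output from a system where the older version works, along with the nvidia-smi output while the model is loaded? How much system memory do you have? If you can share the server logs from the older version when it worked and from the newer version when loading fails, that may also help us understand what is going wrong.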
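
The information requested above can be gathered in one pass. This is a minimal sketch, assuming a Linux install where Ollama runs as a systemd service named `ollama` (so the server log is available through `journalctl -u ollama`); on Docker or other setups the log command would differ. The function names and the 200-line log limit are illustrative choices.

```python
import subprocess

# Commands for the three items requested above. The journalctl line assumes
# a systemd service named "ollama"; adjust it for Docker or manual installs.
COMMANDS = {
    "ollama_ps": ["ollama", "ps"],
    "nvidia_smi": ["nvidia-smi"],
    "server_log": ["journalctl", "-u", "ollama", "--no-pager", "-n", "200"],
}


def run_command(argv):
    """Run one command and return its combined output, or a note if it fails."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"<failed to run {argv[0]}: {exc}>"
    return result.stdout + result.stderr


def collect_diagnostics(runner=run_command):
    """Return a dict mapping each diagnostic name to the command's output."""
    return {name: runner(argv) for name, argv in COMMANDS.items()}


def format_report(outputs):
    """Render the outputs as a plain-text report suitable for pasting into a reply."""
    sections = []
    for name, argv in COMMANDS.items():
        sections.append(f"$ {' '.join(argv)}\n{outputs[name].rstrip()}")
    return "\n\n".join(sections)
```

Calling `print(format_report(collect_diagnostics()))` while the model is loading captures `ollama ps` and `nvidia-smi` at the moment that matters, since both only show memory use while the runner is alive.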

sc4hvdpw5#

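Hi Daniel. Unfortunately, I cannot restore my old configuration. I do not know whether the CUDA 12.5.1 update and/or the Nvidia 555 driver caused this. Right now the only way I can run ollama run deepseek-v2:236b is to unplug my two RTX 3090 cards and let my dual Xeon (72 cores) handle inference, which is much slower than when the two RTX 3090 cards can take part. The machine has dual Xeon CPUs, 256GB of RAM, and two RTX 3090 cards (48GB of GPU RAM in total). This is my current nvidia-smi output: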

Tue Jul 23 20:38:10 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06              Driver Version: 555.42.06      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:04:00.0 Off |                  N/A |
|  0%   31C    P8              8W /  350W |      18MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2585      G   /usr/lib/xorg/Xorg                              4MiB |
|    1   N/A  N/A      2585      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+
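
The CPU-only fallback described above may be reachable without physically removing the cards. The sketch below relies on the behavior described in Ollama's GPU documentation, where setting `CUDA_VISIBLE_DEVICES` to an invalid ID such as `-1` makes the server ignore Nvidia GPUs. Whether this avoids the crash in this thread is untested here, and the helper names are hypothetical.

```python
import os
import subprocess


def cpu_only_env(base_env=None):
    """Return a copy of the environment that hides all CUDA devices."""
    env = dict(os.environ if base_env is None else base_env)
    # An invalid device ID hides every Nvidia GPU from the process.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def start_cpu_only_server():
    """Start `ollama serve` with GPUs hidden; the caller owns the process."""
    # The variable must be set for the server, not for the `ollama run` client.
    return subprocess.Popen(["ollama", "serve"], env=cpu_only_env())
```

With the server started this way, `ollama run deepseek-v2:236b` from another terminal should use the CPU path. If Ollama runs as a systemd service, the variable would need to be set on the service itself rather than through this script.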
