An error occurs: the RMS_NORM computation in ggml_cuda_compute_forward fails with a CUDA error. The current device is 0, and the failure is reported at line 2288 of ggml/src/ggml-cuda.cu. The error output is:
GGML_ASSERT: ggml/src/ggml-cuda.cu:101: !"CUDA error"
[New LWP 252]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007dc7bf87142f in __GI___wait4 (pid=255, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007dc7bf87142f in __GI___wait4 (pid=255, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x0000647041457f0b in ggml_print_backtrace ()
#2  0x000064704132bb47 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) ()
#3  0x00006470413300ea in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#4  0x00006470414a41d6 in ggml_backend_sched_graph_compute_async ()
#5  0x00006470414fdd7a in llama_decode ()
#6  0x00006470415ca265 in llama_init_from_gpt_params(gpt_params&) ()
#7  0x000064704131315e in main ()
[Inferior 1 (process 251) detached]
4 answers

nfeuvbwi1#
Same here, but with an RX 7600 XT (gfx1102).
unguejic2#
I'm also hitting the same problem with the latest Ollama build compiled from source (97c20ed), running on an RX 6700 XT under Ubuntu Server 22.04. I thought setting the override and target, as pointed out elsewhere, would fix it, but it didn't for me.
Running as root:
Output:
rocminfo
Output:
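For reference, the "override and target" mentioned above usually means the HSA_OVERRIDE_GFX_VERSION environment variable plus the GPU build target. A minimal sketch of the override part, assuming the server is launched from an interactive shell (a systemd unit would need the variable in its Environment= setting instead), and noting that the exact value depends on your card:

```sh
# Workaround commonly used for RDNA2 cards that ROCm does not officially
# support: report the ISA as gfx1030 (version 10.3.0). The RX 6700 XT is
# usually run this way; other cards need a different value.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# Start the server (or the llama.cpp binary) from the same shell so the
# variable is inherited by the process.
ollama serve
```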
4c8rllxm3#

@m828 I know you're not using Ollama but I hope this helps somehow. This was the missing piece for me: ollama/ollama#3107 (comment)
Except instead of gen_windows.ps1, I changed gen_linux.sh, because I'm on Linux. Once I built Ollama, I ran:
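For anyone building llama.cpp directly instead of through Ollama's gen_linux.sh, the equivalent change is to pass the correct gfx target to the HIP build. The sketch below is an illustration, not the command from the linked comment; the option spelling has changed across llama.cpp versions (LLAMA_HIPBLAS, later GGML_HIPBLAS), so check the README of the revision you are building:

```sh
# Build llama.cpp's ROCm/HIP backend for a specific GPU architecture.
# gfx1102 is the RX 7600 XT discussed in this thread; substitute the gfx
# string that rocminfo reports for your own card.
cmake -S . -B build \
  -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
  -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ \
  -DGGML_HIPBLAS=ON \
  -DAMDGPU_TARGETS=gfx1102
cmake --build build --config Release -j
```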
mw3dktmi4#

> Same, but with an RX 7600 XT (gfx1102).

It turns out I had accidentally copied gfx1030 from the provided build command when it should have been gfx1102 for my card. Changing it fixed the problem.
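To avoid copying the wrong target in the first place, the gfx string can be read straight from rocminfo (mentioned earlier in this thread); a small sketch:

```sh
# List the ISA names ROCm reports for the installed GPUs; the gfx string
# shown is what the build target (and any override) should match, e.g.
# gfx1102 for an RX 7600 XT.
rocminfo | grep -i gfx
```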