Describe the Bug
An earlier issue (PaddlePaddle/PaddleDetection#5073) mentions a similar problem, but after checking I found it is not the same kind of failure. The full error output is below:
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0724 16:12:45.847908 2871941 interpretercore.cc:237] New Executor is Running.
W0724 16:12:45.848285 2871941 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 12.0
W0724 16:12:45.849236 2871941 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/utils/install_check.py", line 249, in run_check
_run_static_single(use_cuda, use_xpu)
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/utils/install_check.py", line 147, in _run_static_single
exe.run(
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/executor.py", line 1392, in run
res = self._run_impl(
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/executor.py", line 1618, in _run_impl
ret = new_exe.run(
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/executor.py", line 654, in run
tensors = self._new_exe.run(
OSError: In user code:
File "<stdin>", line 1, in <module>
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/utils/install_check.py", line 249, in run_check
_run_static_single(use_cuda, use_xpu)
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/utils/install_check.py", line 133, in _run_static_single
input, out, weight = _simple_network()
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/utils/install_check.py", line 37, in _simple_network
linear_out = paddle.nn.functional.linear(x=input, weight=weight, bias=bias)
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/nn/functional/common.py", line 1860, in linear
helper.append_op(
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/layer_helper.py", line 45, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/framework.py", line 4013, in append_op
op = Operator(
File "/mnt/datadisk0/data/miniconda/py3.7/lib/python3.8/site-packages/paddle/fluid/framework.py", line 2781, in __init__
for frame in traceback.extract_stack():
ExternalError: CUBLAS error(7).
[Hint: 'CUBLAS_STATUS_INVALID_VALUE'. An unsupported value or parameter was passed to the function (a negative vector size, for example). To correct: ensure that all the parameters being passed have valid values. ] (at ../paddle/phi/backends/gpu/gpu_context.cc:598)
[operator < matmul_v2 > error]
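As the line "Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 12.0" in the output above shows, the driver and runtime versions match. I hope someone who has run into a similar problem can help.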
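To make this version comparison explicit, here is a minimal sketch that parses the "Please NOTE" line from the log above and checks whether the runtime API version is newer than the driver API version. The helper names `parse_gpu_note` and `runtime_exceeds_driver` are hypothetical and not part of Paddle, and the regular expression assumes the exact log format shown above.

```python
import re

# Assumption: the log line follows the format printed in the output above.
NOTE_PATTERN = re.compile(
    r"GPU Compute Capability: (?P<cc>\d+\.\d+), "
    r"Driver API Version: (?P<driver>\d+\.\d+), "
    r"Runtime API Version: (?P<runtime>\d+\.\d+)"
)


def _to_tuple(version):
    """Turn a string such as '12.0' into a comparable tuple such as (12, 0)."""
    return tuple(int(part) for part in version.split("."))


def parse_gpu_note(line):
    """Hypothetical helper: return (compute_capability, driver, runtime) tuples,
    or None when the line does not contain the expected fields."""
    match = NOTE_PATTERN.search(line)
    if match is None:
        return None
    return (
        _to_tuple(match["cc"]),
        _to_tuple(match["driver"]),
        _to_tuple(match["runtime"]),
    )


def runtime_exceeds_driver(line):
    """Hypothetical helper: True when the runtime API version is newer than
    the version the driver reports, which is one known source of CUDA errors."""
    parsed = parse_gpu_note(line)
    if parsed is None:
        raise ValueError("line does not contain the expected version fields")
    _, driver, runtime = parsed
    return runtime > driver


if __name__ == "__main__":
    note = (
        "Please NOTE: device: 0, GPU Compute Capability: 7.0, "
        "Driver API Version: 12.0, Runtime API Version: 12.0"
    )
    print(parse_gpu_note(note))
    print(runtime_exceeds_driver(note))
```

A matching driver/runtime pair only rules out one class of problem. It says nothing about whether the cuBLAS and cuDNN libraries that actually get loaded are the ones the Paddle build expects.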
Additional Supplementary Information
cuda:12.0
cudnn:8.9.2.26
python:3.8.17
paddlepaddle-gpu:2.5.0.post120
Ubuntu:20.04.6 LTS
gcc:(conda-forge gcc 12.1.0-17) 12.1.0
7 Answers
wrrgggsh1#
I got the error message below when running paddle-ocr with a GPU. (Without a GPU, it works.)
Runtime Env
d8tt03nd2#
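Here is one possible approach.
The problem may be the locally installed CUDA and cuDNN. My fix was to comment out the locally installed CUDA and cuDNN entries in my environment variables, including PATH, CUDA_PATH, and LD_LIBRARY_PATH, and then install the cuDNN version that PaddlePaddle requires, which the official documentation gives as follows: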
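The environment-variable cleanup described above can be sketched in Python as shown here. This is only an illustration of the idea, not the content from the official documentation. The helper names `strip_cuda_entries` and `cleaned_environment` are hypothetical, and matching on the substrings "cuda" and "cudnn" is an assumption about how the local install paths are named.

```python
import os


def strip_cuda_entries(value, keywords=("cuda", "cudnn")):
    """Hypothetical helper: drop entries of a PATH-style string whose text
    mentions any of the keywords (case-insensitive)."""
    kept = [
        entry
        for entry in value.split(os.pathsep)
        if entry and not any(word in entry.lower() for word in keywords)
    ]
    return os.pathsep.join(kept)


def cleaned_environment(env=None):
    """Hypothetical helper: return a copy of the environment with local CUDA
    and cuDNN entries removed from PATH and LD_LIBRARY_PATH, and with
    CUDA_PATH unset. The input mapping is left unmodified."""
    env = dict(os.environ if env is None else env)
    for name in ("PATH", "LD_LIBRARY_PATH"):
        if name in env:
            env[name] = strip_cuda_entries(env[name])
    env.pop("CUDA_PATH", None)
    return env


if __name__ == "__main__":
    # Assumption: these sample paths are made up for illustration only.
    sample = {
        "PATH": os.pathsep.join(["/usr/bin", "/usr/local/cuda-12.0/bin"]),
        "LD_LIBRARY_PATH": os.pathsep.join(["/usr/local/cuda-12.0/lib64", "/opt/lib"]),
        "CUDA_PATH": "/usr/local/cuda-12.0",
    }
    print(cleaned_environment(sample))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` to try the check in a child process without editing shell startup files.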
k0pti3hp3#
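@MaddingRookie Hi, your local environment uses CUDA 12.0. The build for that version was compiled only recently and may not be well adapted yet. I suggest reinstalling with 2.5.1.post120 and trying again.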
dzjeubhm4#
@MaddingRookie Also, which GPU model do you have? If its compute capability is too low, CUDA 12.0 may no longer support it, and you would need to use a lower CUDA version.
6za6bjd05#
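@MaddingRookie Hi, your local environment uses CUDA 12.0. The build for that version was compiled only recently and may not be well adapted yet. I suggest reinstalling with 2.5.1.post120 and trying again.
Hi, I have now downgraded CUDA to 11.7 and downgraded the corresponding packages as well. After going through the steps again it works, so it was most likely a compatibility issue.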
q43xntqr6#
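@MaddingRookie Also, which GPU model do you have? If its compute capability is too low, CUDA 12.0 may no longer support it, and you would need to use a lower CUDA version.
My GPU and driver at the time: V100 64G, Driver Version: 525.125.06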
t2a7ltrp7#
@MaddingRookie Also, which GPU model do you have? If its compute capability is too low, CUDA 12.0 may no longer support it, and you would need to use a lower CUDA version.
Hello, I ran into this problem as well.
paddle.utils.run_check()
It fails with the same error.
W1011 22:26:24.336457 638519 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.7
W1011 22:26:24.338999 638519 gpu_resources.cc:149] device: 0, cuDNN Version: 8.5.
The GPU is an A6000, Driver Version: 525.89.02
Do you know how to fix this?