Paddle: GPU inference speed of the mask detection model: the 2080 Ti takes nearly twice as long as the 1660 Ti

Posted by 1qczuiv0 on 2021-11-30 in Java

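Hello. I am using paddlepaddle 1.8.2, paddlehub 1.7.1, and Python 3.6. When I run the server version of the mask detection model with use_gpu=True, detection takes 30 ms on a 1660 Ti but 70 to 80 ms on a 2080 Ti. The input image size is 1280*720 and the detection parameters are identical on both machines. Both use CUDA 10.0 and cuDNN 7.6.1. I have not been able to find the cause, so I have to ask for your help.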

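mbzjlibv #1

One more detail: the machine has four 2080 Ti cards. The usage documentation says only a single card is supported, so I disabled the other three. Without the GPU, face detection takes about 300 ms, so I am certain the GPU is being used for detection. I just do not understand why the 2080 Ti takes nearly twice as long as the 1660 Ti.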
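
Comparing GPU and CPU timings is one sign that the GPU is in use. Another is to sample the card's performance state, utilization, and SM clock while the detector runs, which can hint at whether the GPU is busy or mostly waiting on the CPU. A minimal sketch, assuming nvidia-smi is on PATH; the helper names `parse_gpu_line` and `sample_gpus` are hypothetical:

```python
import shutil
import subprocess

# Fields passed to nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["name", "pstate", "utilization.gpu", "clocks.sm"]


def parse_gpu_line(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [value.strip() for value in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def sample_gpus():
    """Return one dict per visible GPU, or [] if nvidia-smi cannot be run."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe,
             "--query-gpu=" + ",".join(QUERY_FIELDS),
             "--format=csv,noheader,nounits"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line)
            for line in result.stdout.splitlines() if line.strip()]


for gpu in sample_gpus():
    print(gpu)
```

Calling `sample_gpus()` from a second terminal or a background thread during inference gives a rough picture; low utilization throughout a run would suggest the time is being spent outside the GPU kernels.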

0md85ypi #2

Could you share your inference code?

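8dtrkrch #3

@NHZlX Sure.
import os
import time

# Restrict CUDA to device 0 (set before PaddlePaddle initializes CUDA)
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import paddlehub as hub

# Paddle face detector
mask_detector = hub.Module(name="pyramidbox_lite_server_mask")
# stack, flg, and img_list come from the rest of the program (not shown here)
img_rd = stack.pop()
flg.value = False
img_list.append(img_rd)
x = time.time()  # start of the timed region
result = mask_detector.face_detection(img_list, use_gpu=True, confs_threshold=0.9)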

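This is my inference code. For my latest test I reinstalled the graphics driver for the 2080 Ti. The machine has four 2080 Ti cards, so I disabled three of them. nvidia-smi now shows only one GPU: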
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 446.14       Driver Version: 446.14       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:82:00.0 Off |                  N/A |
| 27%   32C    P8    21W / 250W |    271MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU       PID   Type   Process name                             GPU Memory |
|                                                                  Usage      |
|=============================================================================|
|    0      2008    C+G   Insufficient Permissions                   N/A      |
|    0      9456    C+G   ...w5n1h2txyewy\SearchUI.exe               N/A      |
|    0     10400    C+G   ...3d8bbwe\MicrosoftEdge.exe               N/A      |
|    0     10664    C+G   ...es.TextInput.InputApp.exe               N/A      |
|    0     13696    C+G   ...y\ShellExperienceHost.exe               N/A      |
|    0     17128    C+G   ...lPanel\SystemSettings.exe               N/A      |
+-----------------------------------------------------------------------------+

Running nvcc -v to check the CUDA version now prints the following, since I did not reinstall CUDA 10.0 after reinstalling the GPU driver:
nvcc fatal : No input files specified; use option --help for more information
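
A note on the message above: `nvcc -V` (capital V) or `nvcc --version` prints the compiler version, while lowercase `-v` is the verbose flag and expects an input file, which is consistent with the "No input files specified" error. nvcc is also only the CUDA compiler and is not needed to run inference, so a missing or misconfigured nvcc would not by itself stop a program from using the GPU. A minimal sketch for checking the version from Python, assuming nvcc may or may not be on PATH; the helper name `nvcc_version` is hypothetical:

```python
import shutil
import subprocess


def nvcc_version():
    """Return the text printed by `nvcc --version`, or None if nvcc is unavailable."""
    exe = shutil.which("nvcc")
    if exe is None:
        # nvcc is not on PATH; CUDA runtime libraries may still be installed.
        return None
    try:
        result = subprocess.run(
            [exe, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip()


print(nvcc_version())
```

A return value of None only says that the compiler was not found on PATH; it says nothing about whether the CUDA runtime libraries used at inference time are present.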

This is what nvidia-smi shows while my program is running:
| NVIDIA-SMI 446.14       Driver Version: 446.14       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:82:00.0 Off |                  N/A |
| 27%   44C    P2    66W / 250W |   1931MiB / 11264MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU       PID   Type   Process name                             GPU Memory |
|                                                                  Usage      |
|=============================================================================|
|    0      2008    C+G   Insufficient Permissions                   N/A      |
|    0      9092      C   ...ython\Python36\python.exe               N/A      |
|    0      9456    C+G   ...w5n1h2txyewy\SearchUI.exe               N/A      |
|    0     10400    C+G   ...3d8bbwe\MicrosoftEdge.exe               N/A      |
|    0     10664    C+G   ...es.TextInput.InputApp.exe               N/A      |
|    0     10884      C   ...ython\Python36\python.exe               N/A      |
|    0     13328      C   ...ython\Python36\python.exe               N/A      |
|    0     13696    C+G   ...y\ShellExperienceHost.exe               N/A      |
|    0     17128    C+G   ...lPanel\SystemSettings.exe               N/A      |
|    0     17532      C   ...ython\Python36\python.exe               N/A      |
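The Python36 entries show that the GPU really is being used, yet face detection takes 0.08497023582458496 s; on the 1660 Ti it needs only 0.036 s.
Question 1: I suspect that the CUDA 10.0 environment variables are not set on the 2080 Ti machine, because the output of nvcc -v is wrong, yet the program clearly does use the 2080 Ti. I do not understand why.
Question 2: Yesterday's and today's test environments are identical; the only thing I did was reinstall the graphics driver. Yesterday nvcc -v reported CUDA version 10.0, and today it reports no version, but in both cases face detection takes 60 to 80 ms, while my 1660 Ti laptop manages about 30 ms. I really do not understand why, so please help me work it out. I do not know which step I got wrong. Or is there some difference in how the 1660 Ti and the 2080 Ti are invoked?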
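
A single `time.time()` measurement can be skewed by one-off costs on the first call, such as model loading and CUDA initialization. A minimal sketch of a warm-up-then-average timing helper that could make the comparison between the two machines fairer; `benchmark` and its parameters are hypothetical names, and the `sum(range(...))` call is only a stand-in workload:

```python
import statistics
import time


def benchmark(fn, warmup=5, runs=50):
    """Call fn() warmup times untimed, then time runs further calls in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb one-off initialization costs
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


# Stand-in workload; replace with the real detection call.
stats = benchmark(lambda: sum(range(10000)), warmup=2, runs=10)
print(stats)
```

With the detector from the post, the callable would be something like `lambda: mask_detector.face_detection(img_list, use_gpu=True, confs_threshold=0.9)`, with `img_list` holding the same fixed images on both machines so that the batch size does not change between calls.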

roqulrg3 #4

hub show pyramidbox_lite_server_mask
hub show pyramidbox_lite_server

Use these commands to check which version of the hub module you have. If it is not 1.3.0, please update it.

up9lanfz #5

@NHZlX When I run hub show pyramidbox_lite_server_mask, and also when I run my code, I get the message below. I am not sure whether it affects the speed.
The message is: 2020-06-22 17:20:36,918-INFO: Instantiated empty configuration.
HDFS initialization failed, please check if .hdfscli,cfg exists.
hub show pyramidbox_lite_server_mask reports version 1.3.0 for me.
hub show pyramidbox_lite_server reports version 1.2.0.
