Paddle: Where is the HW benchmark?

ppcbkaq5 posted on 2022-10-25 in Other

Issue description (Please describe your issue)

I ran inference tests on a desktop and on an NVIDIA Jetson Xavier NX.
However, I can't judge the inference results, because I don't have any reference numbers to compare them with.

The HW specification, Paddle install options, and inference results are listed below.
Are these inference results reasonable?

Please tell me what typical (average) results look like.
Thank you.

[ Desktop ]
Ubuntu 18.04
CPU : AMD Ryzen 5 5600G with Radeon Graphics 3.90 GHz
RAM : 32GB
GPU : RTX3060

-Install PP-
conda create -n PPDet python=3.9
conda activate PPDet
conda install paddlepaddle-gpu==2.2.2 cudatoolkit=11.2 -c -c conda-forge

-Install PP-Detection-
conda activate PPDet
cd ~ && mkdir -p PPDet_Git
cd PPDet_Git && git clone
cd PaddleDetection
python3 -m pip install cython
python3 -m pip install cpython
python3 -m pip install numpy
python3 -m pip install -r requirements.txt
python3 install

-Model Export-
python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml -o weights=

python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=

python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/picodet/picodet_xs_320_coco_lcnet.yml -o weights=

[ Jetson xavier NX ]
Ubuntu 18.04

-Install PP-
cd ~ && git clone
cd Paddle
git checkout release/2.2
sudo mkdir -p build_cuda && cd build_cuda

sudo cmake .. -DCMAKE_CXX_FLAGS='-Wno-error -w'

-Install PP-Detection-
cd ~ && mkdir -p PPDet_Git
cd PPDet_Git && git clone
cd PaddleDetection
python3 -m pip install -r requirements.txt
sudo python3 install

-NVIDIA xavier NX mode-
sudo nvpmodel -m 0

-Model Export-
python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml -o weights=

python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=

python3 tools/ -c /home/k/PPDet_Git/PaddleDetection/configs/picodet/picodet_xs_320_coco_lcnet.yml -o weights=

[ Inference Result ]

[ Model 1, ppyolo_r50vd_dcn_2x_coco]
python3 deploy/python/ --model_dir=./output_inference/ppyolo_r50vd_dcn_2x_coco --image_file=./demo/000000014439_640x640.jpg --device=

  • Desktop : --device=GPU

total_time(ms): 1367.5, img_num: 1
average latency time(ms): 1367.50, QPS: 0.731261
preprocess_time(ms): 898.40, inference_time(ms): 469.10, postprocess_time(ms): 0.00

  • Desktop : --device=CPU

total_time(ms): 3085.9, img_num: 1
average latency time(ms): 3085.90, QPS: 0.324055
preprocess_time(ms): 20.30, inference_time(ms): 3065.60, postprocess_time(ms): 0.00

  • Jetson xavier NX : --device=GPU

total_time(ms): 5659.900000000001, img_num: 1
average latency time(ms): 5659.90, QPS: 0.176682
preprocess_time(ms): 2882.60, inference_time(ms): 2776.70, postprocess_time(ms): 0.60

  • Jetson xavier NX : --device=CPU

total_time(ms): 8196.5, img_num: 1
average latency time(ms): 8196.50, QPS: 0.122003
preprocess_time(ms): 89.00, inference_time(ms): 8107.40, postprocess_time(ms): 0.10

[ Model 2, ppyoloe_crn_s_300e_coco]

python3 deploy/python/ --model_dir=./output_inference/ppyoloe_crn_s_300e_coco --image_file=./demo/000000014439_640x640.jpg --device=

  • Desktop : --device=GPU

total_time(ms): 1489.3000000000002, img_num: 1
average latency time(ms): 1489.30, QPS: 0.671456
preprocess_time(ms): 969.20, inference_time(ms): 520.10, postprocess_time(ms): 0.00

  • Desktop : --device=CPU

total_time(ms): 675.2, img_num: 1
average latency time(ms): 675.20, QPS: 1.481043
preprocess_time(ms): 25.80, inference_time(ms): 649.40, postprocess_time(ms): 0.00

  • Jetson xavier NX : --device=GPU

total_time(ms): 4869.799999999999, img_num: 1
average latency time(ms): 4869.80, QPS: 0.205347
preprocess_time(ms): 3322.50, inference_time(ms): 1547.20, postprocess_time(ms): 0.10

  • Jetson xavier NX : --device=CPU

total_time(ms): 65947.8, img_num: 1
average latency time(ms): 65947.80, QPS: 0.015164
preprocess_time(ms): 68.20, inference_time(ms): 65879.40, postprocess_time(ms): 0.20

[ Model 3, picodet_xs_320_coco_lcnet]

python3 deploy/python/ --model_dir=./output_inference/picodet_xs_320_coco_lcnet --image_file=./demo/000000014439_640x640.jpg --device=

  • Desktop : --device=GPU

total_time(ms): 1494.8999999999999, img_num: 1
average latency time(ms): 1494.90, QPS: 0.668941
preprocess_time(ms): 959.80, inference_time(ms): 535.10, postprocess_time(ms): 0.00

  • Desktop : --device=CPU

total_time(ms): 84.4, img_num: 1
average latency time(ms): 84.40, QPS: 11.848341
preprocess_time(ms): 11.00, inference_time(ms): 73.40, postprocess_time(ms): 0.00

  • Jetson xavier NX : --device=GPU

total_time(ms): 5147.599999999999, img_num: 1
average latency time(ms): 5147.60, QPS: 0.194265
preprocess_time(ms): 3291.60, inference_time(ms): 1855.90, postprocess_time(ms): 0.10

  • Jetson xavier NX : --device=CPU

total_time(ms): 307.59999999999997, img_num: 1
average latency time(ms): 307.60, QPS: 3.250975
preprocess_time(ms): 29.20, inference_time(ms): 278.30, postprocess_time(ms): 0.10
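
To make the numbers above easier to compare, here is a minimal Python sketch that tabulates the reported single-image timings and derives, for each run, the total latency, the QPS (as 1000 / total latency in ms), and the share of time spent in preprocessing. The figures are copied from the logs above; the names RESULTS and summarize are illustrative only and are not part of PaddleDetection.

```python
# Timings copied from the logs above, in milliseconds.
# Key: (model, platform, device) -> (preprocess, inference, postprocess)
RESULTS = {
    ("ppyolo_r50vd_dcn_2x_coco", "Desktop", "GPU"): (898.40, 469.10, 0.00),
    ("ppyolo_r50vd_dcn_2x_coco", "Desktop", "CPU"): (20.30, 3065.60, 0.00),
    ("ppyolo_r50vd_dcn_2x_coco", "Xavier NX", "GPU"): (2882.60, 2776.70, 0.60),
    ("ppyolo_r50vd_dcn_2x_coco", "Xavier NX", "CPU"): (89.00, 8107.40, 0.10),
    ("ppyoloe_crn_s_300e_coco", "Desktop", "GPU"): (969.20, 520.10, 0.00),
    ("ppyoloe_crn_s_300e_coco", "Desktop", "CPU"): (25.80, 649.40, 0.00),
    ("ppyoloe_crn_s_300e_coco", "Xavier NX", "GPU"): (3322.50, 1547.20, 0.10),
    ("ppyoloe_crn_s_300e_coco", "Xavier NX", "CPU"): (68.20, 65879.40, 0.20),
    ("picodet_xs_320_coco_lcnet", "Desktop", "GPU"): (959.80, 535.10, 0.00),
    ("picodet_xs_320_coco_lcnet", "Desktop", "CPU"): (11.00, 73.40, 0.00),
    ("picodet_xs_320_coco_lcnet", "Xavier NX", "GPU"): (3291.60, 1855.90, 0.10),
    ("picodet_xs_320_coco_lcnet", "Xavier NX", "CPU"): (29.20, 278.30, 0.10),
}


def summarize(timings):
    """Return total latency, QPS and preprocess share for one run."""
    pre, inf, post = timings
    total = pre + inf + post
    return {
        "total_ms": total,
        "qps": 1000.0 / total,           # one image per `total` milliseconds
        "preprocess_share": pre / total,  # fraction of time spent preprocessing
    }


if __name__ == "__main__":
    for (model, platform, device), timings in RESULTS.items():
        s = summarize(timings)
        print(
            f"{model:26s} {platform:9s} {device}: "
            f"total={s['total_ms']:9.1f} ms  "
            f"qps={s['qps']:8.3f}  "
            f"preprocess={s['preprocess_share']:6.1%}"
        )
```

In this data, preprocessing makes up more than half of the total on every GPU run and less than 15 percent on every CPU run, even though the same image is used each time.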



Thanks. Do you include the warmup time in your benchmarks? Usually, GPU warmup takes much longer than CPU warmup. Furthermore, preprocessing runs on the CPU for both the GPU and the CPU runs, so it is unreasonable for the GPU preprocess time to be so much higher than the CPU preprocess time.
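
As a rough illustration of measuring latency without warmup, the following Python sketch runs a callable a few times untimed and then averages over timed repeats. The function name benchmark, its parameters, and the stand-in workload are hypothetical; in practice run_once would wrap a single predictor call, and the default counts here are arbitrary.

```python
import time


def benchmark(run_once, warmup=10, repeats=50):
    """Return the average latency of run_once() in ms, excluding warmup calls."""
    # Untimed warmup calls: absorb one-time costs such as library
    # initialization, memory allocation and kernel selection.
    for _ in range(warmup):
        run_once()

    # Timed calls: only these contribute to the reported average.
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / repeats


if __name__ == "__main__":
    def fake_inference():
        # Stand-in workload (assumption); replace with a real inference call.
        sum(i * i for i in range(10000))

    print(f"average latency: {benchmark(fake_inference):.3f} ms")
```

For GPU runs, run_once should return only after the outputs have been copied back to the host. GPU work is typically launched asynchronously, so otherwise the timer may stop before the computation has finished. Timing a single image, as in the logs above, folds all one-time startup cost into the reported latency.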



@liyancas Thank you for your answer.
From your answer, I understand that the high preprocess time on GPU comes from warmup.

So now, my questions are as follows.
[1] Is 'inference_time(ms)' reasonable for my HW specification with the '--device=CPU' option?
[2] Should I ask for benchmark data measured on HW similar to mine?

I can't find a Paddle HW benchmark page.
Thank you.
