Hi, I tried to quantize my float32 (no-BN) ncnn model to an int8 model with this command:
./ncnn2table -b=fp32.bin -p=fp32.param -o=int8.table -m=128,128,128 -n=0.01,0.01,0.01 -s=768,768 -t=6 -i=imgs/
and then ./ncnn2int8 fp32.param fp32.bin int8.param int8.bin int8.table
With the ncnn Python binding on x64 it can forward an input and produce a result, but the output has the wrong shape.
The computation of my int8 model is also wrong, and it fails to forward on Android: there is no way a (384 x 384) x (8 x 1) operation can produce an (8 x 1) result.
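For reference, a minimal sketch of how I run the int8 model with the ncnn Python binding to check the output shape (the blob names "input"/"output" and the image path are placeholders and must match the actual .param file; preprocessing mirrors the mean/norm/size values passed to ncnn2table above):

```python
import cv2
import ncnn
import numpy as np

net = ncnn.Net()
net.load_param("int8.param")
net.load_model("int8.bin")

img = cv2.imread("imgs/sample.jpg")  # placeholder image path
h, w = img.shape[:2]

# Same preprocessing as used for ncnn2table: resize to 768x768,
# mean = 128 and norm = 0.01 per channel.
mat_in = ncnn.Mat.from_pixels_resize(
    img, ncnn.Mat.PixelType.PIXEL_BGR, w, h, 768, 768
)
mat_in.substract_mean_normalize([128.0, 128.0, 128.0], [0.01, 0.01, 0.01])

ex = net.create_extractor()
ex.input("input", mat_in)            # assumed input blob name
ret, mat_out = ex.extract("output")  # assumed output blob name

out = np.array(mat_out)
print("output shape:", out.shape)    # comes back in the wrong shape for int8
```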
After checking the feature map size list, I found that almost every layer of the int8 model reports a different outch from the float32 and float16 models, which leads to the wrong shape of the final output.
For example,
Float32:
Convolution Conv_57 14.39ms | feature_map: 768 x 768 inch: 3 *1 outch: 4 *8 kernel: 3 x 3 stride: 2 x 2
Swish Mul_59 15.23ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8
ConvolutionDepthWise Conv_117 16.90ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8 kernel: 3 x 3 stride: 1 x 1
Swish Mul_119 15.24ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8
Split splitncnn_0 0.00ms |
Pooling GlobalAveragePool_120 11.28ms | feature_map: 384 x 384 inch: 4 *8 outch: 1 *8
InnerProduct Conv_178 4.42ms | feature_map: 4 x 1 inch: 1 *8 outch: 1 *8
Swish Mul_180 0.07ms | feature_map: 1 x 1 inch: 1 *8 outch: 1 *8
Convolution Conv_238 0.01ms | feature_map: 1 x 1 inch: 1 *8 outch: 1 *8 kernel: 1 x 1 stride: 1 x 1
Float16:
Convolution Conv_57 8.51ms | feature_map: 768 x 768 inch: 3 *1 outch: 4 *8 kernel: 3 x 3 stride: 2 x 2
Swish Mul_59 13.39ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8
ConvolutionDepthWise Conv_117 24.67ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8 kernel: 3 x 3 stride: 1 x 1
Swish Mul_119 17.21ms | feature_map: 384 x 384 inch: 4 *8 outch: 4 *8
Split splitncnn_0 0.00ms |
Pooling GlobalAveragePool_120 3.94ms | feature_map: 384 x 384 inch: 4 *8 outch: 1 *8
InnerProduct Conv_178 0.00ms | feature_map: 4 x 1 inch: 1 *8 outch: 1 *8
Swish Mul_180 0.00ms | feature_map: 1 x 1 inch: 1 *8 outch: 1 *8
Convolution Conv_238 0.01ms | feature_map: 1 x 1 inch: 1 *8 outch: 1 *8 kernel: 1 x 1 stride: 1 x 1
Int8:
Convolution Conv_57 72.56ms | feature_map: 768 x 768 inch: 3 *1 outch: 32 *1 kernel: 3 x 3 stride: 2 x 2
Swish Mul_59 17.91ms | feature_map: 384 x 384 inch: 32 *1 outch: 32 *1
ConvolutionDepthWise Conv_117 23.34ms | feature_map: 384 x 384 inch: 32 *1 outch: 32 *1 kernel: 3 x 3 stride: 1 x 1
Swish Mul_119 13.83ms | feature_map: 384 x 384 inch: 32 *1 outch: 32 *1
Split splitncnn_0 0.00ms |
Pooling GlobalAveragePool_120 5.60ms | feature_map: 384 x 384 inch: 32 *1 outch: 1 *1
InnerProduct Conv_178 3.27ms | feature_map: 32 x 1 inch: 1 *1 outch: 1 *1
Swish Mul_180 0.00ms | feature_map: 8 x 1 inch: 1 *1 outch: 1 *1
Convolution Conv_238 0.01ms | feature_map: 8 x 1 inch: 1 *1 outch: 32 *1 kernel: 1 x 1 stride: 1 x 1
I can provide a sample to reproduce this issue.
https://drive.google.com/file/d/1AH9HqzQG1d1uBGCKVmUVRgFvN71BfOYJ/view?usp=sharing
Hope someone can help me with it. Thanks.
3 Answers
c90pui9n1#
The 1-dim and 2-dim shape output format was fixed in 4cf4c92.
i2loujxw2#
Thanks for the quick fix. But besides the shape output format, the feature map shapes are still wrong.
There is a similar issue, #1964, which was caused by an incorrect conversion of PyTorch's adaptive pooling.
For now I have fixed this issue by manually modifying the ncnn graph, and I hope it can be done automatically some day.
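As an alternative to editing the ncnn graph by hand, the adaptive-pooling pattern can be avoided on the PyTorch side before export. A rough sketch under stated assumptions (the module, file names, and channel sizes below are illustrative, not my actual model): replace nn.AdaptiveAvgPool2d(1) with an explicit spatial mean, which exports as a plain ReduceMean instead of an adaptive pooling op.

```python
import torch
import torch.nn as nn

class Head(nn.Module):
    """Illustrative classification head: AdaptiveAvgPool2d(1) may convert
    badly, so reduce over the spatial dims explicitly instead."""
    def __init__(self, channels, num_classes):
        super().__init__()
        # self.pool = nn.AdaptiveAvgPool2d(1)  # the pattern that converts badly
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, x):
        # Global average pooling written as an explicit mean over H and W,
        # exported as ReduceMean rather than an adaptive pooling op.
        x = x.mean(dim=(2, 3))
        return self.fc(x)

# Dummy sizes chosen to match the feature maps in the logs above.
model = Head(channels=32, num_classes=8).eval()
dummy = torch.randn(1, 32, 384, 384)
torch.onnx.export(model, dummy, "head.onnx", opset_version=11)
```

Whether this helps depends on how the original pooling was written; in my case I still had to patch the converted graph by hand.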
4uqofj5v3#
@zylo117 The int8 quantization conversion bug has been fixed, see #2637.