ncnn size of feature map is different after quantization

piv4azn7 · asked 2022-10-24

Hi, I tried to quantize my float32 ncnn model (batchnorm already fused, so no BN layers) to an int8 model with these commands:

./ncnn2table -b=fp32.bin -p=fp32.param -o=int8.table -m=128,128,128 -n=0.01,0.01,0.01 -s=768,768 -t=6 -i=imgs/
and
./ncnn2int8 fp32.param fp32.bin int8.param int8.bin int8.table

With the ncnn python binding on x64 the int8 model can forward an input and produce a result, but the output has the wrong shape.
Actually the computation of the int8 model is also wrong, and it fails to forward on Android: there is no way to multiply a (384 x 384) matrix by an (8 x 1) matrix and get an (8 x 1) result.
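
A minimal sketch of the forward pass with the python binding, for reference. The input/output blob names ("in0"/"out0") and the image file name are placeholders that have to be replaced with the real ones from the param file; the mean/norm values are the ones passed to ncnn2table above.

import cv2
import ncnn
import numpy as np

net = ncnn.Net()
net.load_param("int8.param")
net.load_model("int8.bin")

# same preprocessing as the calibration settings: 768x768, mean 128, norm 0.01
img = cv2.imread("imgs/sample.jpg")
mat_in = ncnn.Mat.from_pixels_resize(
    img, ncnn.Mat.PixelType.PIXEL_BGR, img.shape[1], img.shape[0], 768, 768)
mat_in.substract_mean_normalize([128.0, 128.0, 128.0], [0.01, 0.01, 0.01])

ex = net.create_extractor()
ex.input("in0", mat_in)            # placeholder input blob name
ret, mat_out = ex.extract("out0")  # placeholder output blob name
print(ret, np.array(mat_out).shape)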

After checking the per-layer feature map sizes, I found that almost every layer of the int8 model reports a different out_channels from the float32 and float16 models, which leads to the wrong shape of the final output.

For example,

Float32:

Convolution              Conv_57                           14.39ms    |    feature_map:  768 x 768     inch:    3 *1  outch:    4 *8     kernel: 3 x 3     stride: 2 x 2
Swish                    Mul_59                            15.23ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8
ConvolutionDepthWise     Conv_117                          16.90ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8     kernel: 3 x 3     stride: 1 x 1
Swish                    Mul_119                           15.24ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8
Split                    splitncnn_0                        0.00ms    |
Pooling                  GlobalAveragePool_120             11.28ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    1 *8
InnerProduct             Conv_178                           4.42ms    |    feature_map:    4 x 1       inch:    1 *8  outch:    1 *8
Swish                    Mul_180                            0.07ms    |    feature_map:    1 x 1       inch:    1 *8  outch:    1 *8
Convolution              Conv_238                           0.01ms    |    feature_map:    1 x 1       inch:    1 *8  outch:    1 *8     kernel: 1 x 1     stride: 1 x 1

Float16:

Convolution              Conv_57                            8.51ms    |    feature_map:  768 x 768     inch:    3 *1  outch:    4 *8     kernel: 3 x 3     stride: 2 x 2
Swish                    Mul_59                            13.39ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8
ConvolutionDepthWise     Conv_117                          24.67ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8     kernel: 3 x 3     stride: 1 x 1
Swish                    Mul_119                           17.21ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    4 *8
Split                    splitncnn_0                        0.00ms    |
Pooling                  GlobalAveragePool_120              3.94ms    |    feature_map:  384 x 384     inch:    4 *8  outch:    1 *8
InnerProduct             Conv_178                           0.00ms    |    feature_map:    4 x 1       inch:    1 *8  outch:    1 *8
Swish                    Mul_180                            0.00ms    |    feature_map:    1 x 1       inch:    1 *8  outch:    1 *8
Convolution              Conv_238                           0.01ms    |    feature_map:    1 x 1       inch:    1 *8  outch:    1 *8     kernel: 1 x 1     stride: 1 x 1

Int8:

Convolution              Conv_57                           72.56ms    |    feature_map:  768 x 768     inch:    3 *1  outch:   32 *1     kernel: 3 x 3     stride: 2 x 2
Swish                    Mul_59                            17.91ms    |    feature_map:  384 x 384     inch:   32 *1  outch:   32 *1
ConvolutionDepthWise     Conv_117                          23.34ms    |    feature_map:  384 x 384     inch:   32 *1  outch:   32 *1     kernel: 3 x 3     stride: 1 x 1
Swish                    Mul_119                           13.83ms    |    feature_map:  384 x 384     inch:   32 *1  outch:   32 *1
Split                    splitncnn_0                        0.00ms    |
Pooling                  GlobalAveragePool_120              5.60ms    |    feature_map:  384 x 384     inch:   32 *1  outch:    1 *1
InnerProduct             Conv_178                           3.27ms    |    feature_map:   32 x 1       inch:    1 *1  outch:    1 *1
Swish                    Mul_180                            0.00ms    |    feature_map:    8 x 1       inch:    1 *1  outch:    1 *1
Convolution              Conv_238                           0.01ms    |    feature_map:    8 x 1       inch:    1 *1  outch:   32 *1     kernel: 1 x 1     stride: 1 x 1
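
Besides the final output, the same intermediate blob can be extracted from the fp32 and the int8 model and compared directly, to check whether the per-layer shapes really diverge. A minimal sketch with the python binding; the input blob name "in0" and the intermediate blob name "conv_57_out" are placeholders that have to be looked up in the param files.

import ncnn
import numpy as np

def blob_shape(param_path, bin_path, input_name, blob_name, mat_in):
    net = ncnn.Net()
    net.load_param(param_path)
    net.load_model(bin_path)
    ex = net.create_extractor()
    ex.input(input_name, mat_in)      # placeholder input blob name
    ret, mat = ex.extract(blob_name)  # placeholder intermediate blob name
    return np.array(mat).shape

# a blank 768x768 image is enough for checking shapes
img = np.zeros((768, 768, 3), dtype=np.uint8)
mat_in = ncnn.Mat.from_pixels(img, ncnn.Mat.PixelType.PIXEL_BGR, 768, 768)
mat_in.substract_mean_normalize([128.0, 128.0, 128.0], [0.01, 0.01, 0.01])

print(blob_shape("fp32.param", "fp32.bin", "in0", "conv_57_out", mat_in))
print(blob_shape("int8.param", "int8.bin", "in0", "conv_57_out", mat_in))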

Here is a sample to reproduce the issue:
https://drive.google.com/file/d/1AH9HqzQG1d1uBGCKVmUVRgFvN71BfOYJ/view?usp=sharing

Hope someone can help me with it. Thanks.

c90pui9n 1#

1-dim and 2-dim shape output format fixed in 4cf4c92

i2loujxw 2#

Thanks for the quick fix. But besides the shape output format, the shape of the feature map is still wrong.

There is a similar issue: #1964
It was caused by wrong conversion of adaptive pooling from PyTorch.
For now I have fixed this by manually modifying the ncnn graph, and I hope it can be handled automatically some day.
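
For illustration, the manual fix is just a text edit of the .param file, e.g. rewriting the pooling layer's line so that it becomes a plain global average pooling (in the ncnn Pooling layer, 0=1 selects average pooling and 4=1 enables global pooling). The blob names below are placeholders:

Pooling    GlobalAveragePool_120    1 1 in_blob out_blob 0=1 4=1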

4uqofj5v 3#

@zylo117 The int8 quantization conversion bug is fixed, see #2637
