Malhar
Employee
151 Views

INT8 Inference Demo Example

Hi, I am looking for an INT8 inference demo example. I tried quantizing the ResNet-50 model from FP32 to INT8 using the calibration tool. Comparing the model files (.bin), I see that both are the same size. Can anybody share an example of converting an FP32 model to INT8 and then running it via the Inference Engine?

4 Replies
Shubha_R_Intel
Employee
151 Views

Hi Malhar. Have you tried running one of the samples under inference_engine/samples with your newly calibrated INT8 model?

Yuanyuan_L_Intel
Employee
151 Views

Hi Malhar, the calibration tool only calibrates the activations. Quantized weights are not saved into the .bin and .xml files. Instead, when the quantized model is loaded by the Inference Engine, the FP32 weights are normalized into INT8 at load time. So the converted .bin is the same size as the FP32 .bin.
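To illustrate why the file on disk can stay FP32 while inference runs in INT8, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an assumption-laden illustration only, not the Inference Engine's actual code: the function names `quantize_symmetric` and `dequantize` are hypothetical, and the real plugin may use a different scheme (for example per-channel scales).

```python
def quantize_symmetric(weights, num_bits=8):
    """Map FP32 values to signed integers using one per-tensor scale (hypothetical helper)."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for INT8
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate FP32 values from the integer representation."""
    return [q * scale for q in quantized]


# The stored weights stay FP32; the integer form is derived when the model is loaded.
fp32_weights = [0.5, -1.0, 0.25, 0.0]
int8_weights, scale = quantize_symmetric(fp32_weights)
restored = dequantize(int8_weights, scale)
```

Because the integer values are computed from the FP32 weights each time, nothing smaller has to be written to the .bin file, which matches the identical file sizes seen above.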

Shubha_R_Intel
Employee
151 Views

Hi Malhar, Yuanyuan answered your question, but this online document has all the details:

https://docs.openvinotoolkit.org/R5/_docs_IE_DG_Int8Inference.html

Thanks!

Shubha

Malhar
Employee
151 Views

Hi Shubha R,

I am running the "classification_sample" example, and with the -pc option I get the list of layers executed with JIT kernels.

Here are the details from the performance counts:

conv2d_1/Conv2D               EXECUTED       layerType: Convolution        realTime: 272        cpu: 272            execType: jit_avx512_I8
conv2d_c1_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 15         cpu: 15             execType: jit_avx512_1x1_I8
conv2d_c1_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 62         cpu: 62             execType: jit_avx512_I8
conv2d_c1_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 29         cpu: 29             execType: jit_avx512_1x1_I8
conv2d_c1_4/Conv2D            EXECUTED       layerType: Convolution        realTime: 34         cpu: 34             execType: jit_avx512_1x1_I8
conv2d_c2_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 46         cpu: 46             execType: jit_avx512_1x1_I8
conv2d_c2_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 69         cpu: 69             execType: jit_avx512_I8
conv2d_c2_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 37         cpu: 37             execType: jit_avx512_1x1_I8
conv2d_c2_4/Conv2D            EXECUTED       layerType: Convolution        realTime: 62         cpu: 62             execType: jit_avx512_I8
conv2d_c3_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 51         cpu: 51             execType: jit_avx512_1x1_I8
conv2d_c3_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 71         cpu: 71             execType: jit_avx512_I8
conv2d_c3_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 34         cpu: 34             execType: jit_avx512_1x1_I8
conv2d_c3_4/Conv2D            EXECUTED       layerType: Convolution        realTime: 63         cpu: 63             execType: jit_avx512_I8
conv2d_c4_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 64         cpu: 64             execType: jit_avx512_1x1_I8
conv2d_c4_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 111        cpu: 111            execType: jit_avx512_I8
conv2d_c4_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 41         cpu: 41             execType: jit_avx512_1x1_I8
conv2d_c4_4/Conv2D            EXECUTED       layerType: Convolution        realTime: 84         cpu: 84             execType: jit_avx512_I8
conv2d_i10_1/Conv2D           EXECUTED       layerType: Convolution        realTime: 42         cpu: 42             execType: jit_avx512_1x1_I8
conv2d_i10_2/Conv2D           EXECUTED       layerType: Convolution        realTime: 65         cpu: 65             execType: jit_avx512_I8
conv2d_i10_3/Conv2D           EXECUTED       layerType: Convolution        realTime: 39         cpu: 39             execType: jit_avx512_1x1_I8
conv2d_i11_1/Conv2D           EXECUTED       layerType: Convolution        realTime: 54         cpu: 54             execType: jit_avx512_1x1_I8
conv2d_i11_2/Conv2D           EXECUTED       layerType: Convolution        realTime: 104        cpu: 104            execType: jit_avx512_I8
conv2d_i11_3/Conv2D           EXECUTED       layerType: Convolution        realTime: 42         cpu: 42             execType: jit_avx512_1x1_I8
conv2d_i12_1/Conv2D           EXECUTED       layerType: Convolution        realTime: 51         cpu: 51             execType: jit_avx512_1x1_I8
conv2d_i12_2/Conv2D           EXECUTED       layerType: Convolution        realTime: 99         cpu: 99             execType: jit_avx512_I8
conv2d_i12_3/Conv2D           EXECUTED       layerType: Convolution        realTime: 44         cpu: 44             execType: jit_avx512_1x1_I8
conv2d_i12_3/Conv2D_ScaleR... EXECUTED       layerType: Reorder            realTime: 21         cpu: 21             execType: reorder_I8
conv2d_i1_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 30         cpu: 30             execType: jit_avx512_1x1_I8
conv2d_i1_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 58         cpu: 58             execType: jit_avx512_I8
conv2d_i1_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 29         cpu: 29             execType: jit_avx512_1x1_I8
conv2d_i2_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 27         cpu: 27             execType: jit_avx512_1x1_I8
conv2d_i2_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 58         cpu: 58             execType: jit_avx512_I8
conv2d_i2_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 30         cpu: 30             execType: jit_avx512_1x1_I8
conv2d_i3_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 34         cpu: 34             execType: jit_avx512_1x1_I8
conv2d_i3_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 53         cpu: 53             execType: jit_avx512_I8
conv2d_i3_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 34         cpu: 34             execType: jit_avx512_1x1_I8
conv2d_i4_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 31         cpu: 31             execType: jit_avx512_1x1_I8
conv2d_i4_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 55         cpu: 55             execType: jit_avx512_I8
conv2d_i4_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 35         cpu: 35             execType: jit_avx512_1x1_I8
conv2d_i5_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 31         cpu: 31             execType: jit_avx512_1x1_I8
conv2d_i5_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 53         cpu: 53             execType: jit_avx512_I8
conv2d_i5_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 36         cpu: 36             execType: jit_avx512_1x1_I8
conv2d_i6_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 43         cpu: 43             execType: jit_avx512_1x1_I8
conv2d_i6_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 69         cpu: 69             execType: jit_avx512_I8
conv2d_i6_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 40         cpu: 40             execType: jit_avx512_1x1_I8
conv2d_i7_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 45         cpu: 45             execType: jit_avx512_1x1_I8
conv2d_i7_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 67         cpu: 67             execType: jit_avx512_I8
conv2d_i7_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 37         cpu: 37             execType: jit_avx512_1x1_I8
conv2d_i8_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 43         cpu: 43             execType: jit_avx512_1x1_I8
conv2d_i8_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 70         cpu: 70             execType: jit_avx512_I8
conv2d_i8_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 39         cpu: 39             execType: jit_avx512_1x1_I8
conv2d_i9_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 46         cpu: 46             execType: jit_avx512_1x1_I8
conv2d_i9_2/Conv2D            EXECUTED       layerType: Convolution        realTime: 68         cpu: 68             execType: jit_avx512_I8
conv2d_i9_3/Conv2D            EXECUTED       layerType: Convolution        realTime: 39         cpu: 39             execType: jit_avx512_1x1_I8
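
A quick way to confirm which layers ran with INT8 kernels is to group the -pc output by execType. The following is a rough sketch that assumes the line layout shown above; the regular expression and the `summarize_exec_types` helper are hypothetical and not part of the Inference Engine samples.

```python
import re
from collections import defaultdict

# Matches one performance-count line in the layout printed above (assumed format).
LINE_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<status>\S+)\s+layerType:\s+(?P<layer_type>\S+)\s+"
    r"realTime:\s+(?P<real_time>\d+)\s+cpu:\s+(?P<cpu>\d+)\s+"
    r"execType:\s+(?P<exec_type>\S+)\s*$"
)


def summarize_exec_types(report):
    """Return {execType: (layer_count, total_realTime)} for all parsable lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines or lines in another format
        entry = totals[match["exec_type"]]
        entry[0] += 1
        entry[1] += int(match["real_time"])
    return {exec_type: tuple(values) for exec_type, values in totals.items()}


sample = """\
conv2d_1/Conv2D               EXECUTED       layerType: Convolution        realTime: 272        cpu: 272            execType: jit_avx512_I8
conv2d_c1_1/Conv2D            EXECUTED       layerType: Convolution        realTime: 15         cpu: 15             execType: jit_avx512_1x1_I8
conv2d_i12_3/Conv2D_ScaleR... EXECUTED       layerType: Reorder            realTime: 21         cpu: 21             execType: reorder_I8
"""
summary = summarize_exec_types(sample)
for exec_type, (count, total) in sorted(summary.items()):
    print(f"{exec_type}: {count} layer(s), realTime total {total}")
```

Any execType ending in _I8 in the summary indicates that the layer ran with an INT8 kernel.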