Hi, I am looking for an INT8 demo example for inference. I quantized the ResNet-50 model from FP32 to INT8 using the calibration tool, but when I check the size of the model files (.bin), both files are the same size. Can anybody share an example where they have converted an FP32 model to INT8 and then run it via the Inference Engine?
Hi Malhar. Did you try one of the samples under inference_engine/samples with your newly calibrated INT8 model?
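For reference, here is a minimal sketch of loading a calibrated IR and running one inference with the Inference Engine Python API of that era (IENetwork/IEPlugin). The file names (resnet50_i8.xml/.bin, cat.jpg) are placeholders, not files from this thread, and the input shape is assumed to be the standard ResNet-50 1x3x224x224.

```python
# Sketch: run a calibrated INT8 IR through the Inference Engine (CPU plugin).
# File names are placeholders; assumes the IENetwork/IEPlugin Python API.
import cv2
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

model_xml = "resnet50_i8.xml"   # IR produced by the calibration tool
model_bin = "resnet50_i8.bin"

net = IENetwork(model=model_xml, weights=model_bin)
plugin = IEPlugin(device="CPU")          # INT8 execution is a CPU-plugin feature
exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
n, c, h, w = 1, 3, 224, 224              # assumed ResNet-50 input shape

image = cv2.imread("cat.jpg")
image = cv2.resize(image, (w, h)).transpose((2, 0, 1))     # HWC -> CHW
result = exec_net.infer(inputs={input_blob: image.reshape(n, c, h, w)})
print(np.argmax(result[out_blob]))       # top-1 class index
```

The classification_sample shipped with the toolkit does essentially the same thing in C++, so either path should exercise the INT8 kernels.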
Hi Malhar, The calibration tool performs calibration for the activations. Quantized weights are not saved into the .bin and .xml; instead, when the calibrated model is loaded by the Inference Engine, the FP32 weights are normalized to INT8 at load time. That is why the converted .bin is the same size as the FP32 .bin.
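A quick way to see this for yourself is sketched below; the file names are placeholders, and the check assumes the calibration tool appends its activation statistics to the IR .xml (so the .xml grows while the .bin stays the same size).

```python
# Sanity check (illustrative sketch): calibrated .bin matches the FP32 .bin in size,
# while the calibrated .xml gains a statistics section with activation ranges.
import os
import xml.etree.ElementTree as ET

fp32_bin = "resnet50_fp32.bin"   # placeholder names
i8_xml, i8_bin = "resnet50_i8.xml", "resnet50_i8.bin"

print("FP32 bin:", os.path.getsize(fp32_bin), "bytes")
print("INT8 bin:", os.path.getsize(i8_bin), "bytes")   # expected: same size

root = ET.parse(i8_xml).getroot()
print("statistics section present:", root.find("statistics") is not None)
```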
Hi Malhar, Yuanyuan answered your question, but this online document has all the details:
https://docs.openvinotoolkit.org/R5/_docs_IE_DG_Int8Inference.html
Thanks !
Shubha
Hi Shubha R,
I am running the "classification_sample" example, and by providing -pc I get the list of layers executed with JIT kernels.
Adding the details from the performance counts:
conv2d_1/Conv2D EXECUTED layerType: Convolution realTime: 272 cpu: 272 execType: jit_avx512_I8
conv2d_c1_1/Conv2D EXECUTED layerType: Convolution realTime: 15 cpu: 15 execType: jit_avx512_1x1_I8
conv2d_c1_2/Conv2D EXECUTED layerType: Convolution realTime: 62 cpu: 62 execType: jit_avx512_I8
conv2d_c1_3/Conv2D EXECUTED layerType: Convolution realTime: 29 cpu: 29 execType: jit_avx512_1x1_I8
conv2d_c1_4/Conv2D EXECUTED layerType: Convolution realTime: 34 cpu: 34 execType: jit_avx512_1x1_I8
conv2d_c2_1/Conv2D EXECUTED layerType: Convolution realTime: 46 cpu: 46 execType: jit_avx512_1x1_I8
conv2d_c2_2/Conv2D EXECUTED layerType: Convolution realTime: 69 cpu: 69 execType: jit_avx512_I8
conv2d_c2_3/Conv2D EXECUTED layerType: Convolution realTime: 37 cpu: 37 execType: jit_avx512_1x1_I8
conv2d_c2_4/Conv2D EXECUTED layerType: Convolution realTime: 62 cpu: 62 execType: jit_avx512_I8
conv2d_c3_1/Conv2D EXECUTED layerType: Convolution realTime: 51 cpu: 51 execType: jit_avx512_1x1_I8
conv2d_c3_2/Conv2D EXECUTED layerType: Convolution realTime: 71 cpu: 71 execType: jit_avx512_I8
conv2d_c3_3/Conv2D EXECUTED layerType: Convolution realTime: 34 cpu: 34 execType: jit_avx512_1x1_I8
conv2d_c3_4/Conv2D EXECUTED layerType: Convolution realTime: 63 cpu: 63 execType: jit_avx512_I8
conv2d_c4_1/Conv2D EXECUTED layerType: Convolution realTime: 64 cpu: 64 execType: jit_avx512_1x1_I8
conv2d_c4_2/Conv2D EXECUTED layerType: Convolution realTime: 111 cpu: 111 execType: jit_avx512_I8
conv2d_c4_3/Conv2D EXECUTED layerType: Convolution realTime: 41 cpu: 41 execType: jit_avx512_1x1_I8
conv2d_c4_4/Conv2D EXECUTED layerType: Convolution realTime: 84 cpu: 84 execType: jit_avx512_I8
conv2d_i10_1/Conv2D EXECUTED layerType: Convolution realTime: 42 cpu: 42 execType: jit_avx512_1x1_I8
conv2d_i10_2/Conv2D EXECUTED layerType: Convolution realTime: 65 cpu: 65 execType: jit_avx512_I8
conv2d_i10_3/Conv2D EXECUTED layerType: Convolution realTime: 39 cpu: 39 execType: jit_avx512_1x1_I8
conv2d_i11_1/Conv2D EXECUTED layerType: Convolution realTime: 54 cpu: 54 execType: jit_avx512_1x1_I8
conv2d_i11_2/Conv2D EXECUTED layerType: Convolution realTime: 104 cpu: 104 execType: jit_avx512_I8
conv2d_i11_3/Conv2D EXECUTED layerType: Convolution realTime: 42 cpu: 42 execType: jit_avx512_1x1_I8
conv2d_i12_1/Conv2D EXECUTED layerType: Convolution realTime: 51 cpu: 51 execType: jit_avx512_1x1_I8
conv2d_i12_2/Conv2D EXECUTED layerType: Convolution realTime: 99 cpu: 99 execType: jit_avx512_I8
conv2d_i12_3/Conv2D EXECUTED layerType: Convolution realTime: 44 cpu: 44 execType: jit_avx512_1x1_I8
conv2d_i12_3/Conv2D_ScaleR... EXECUTED layerType: Reorder realTime: 21 cpu: 21 execType: reorder_I8
conv2d_i1_1/Conv2D EXECUTED layerType: Convolution realTime: 30 cpu: 30 execType: jit_avx512_1x1_I8
conv2d_i1_2/Conv2D EXECUTED layerType: Convolution realTime: 58 cpu: 58 execType: jit_avx512_I8
conv2d_i1_3/Conv2D EXECUTED layerType: Convolution realTime: 29 cpu: 29 execType: jit_avx512_1x1_I8
conv2d_i2_1/Conv2D EXECUTED layerType: Convolution realTime: 27 cpu: 27 execType: jit_avx512_1x1_I8
conv2d_i2_2/Conv2D EXECUTED layerType: Convolution realTime: 58 cpu: 58 execType: jit_avx512_I8
conv2d_i2_3/Conv2D EXECUTED layerType: Convolution realTime: 30 cpu: 30 execType: jit_avx512_1x1_I8
conv2d_i3_1/Conv2D EXECUTED layerType: Convolution realTime: 34 cpu: 34 execType: jit_avx512_1x1_I8
conv2d_i3_2/Conv2D EXECUTED layerType: Convolution realTime: 53 cpu: 53 execType: jit_avx512_I8
conv2d_i3_3/Conv2D EXECUTED layerType: Convolution realTime: 34 cpu: 34 execType: jit_avx512_1x1_I8
conv2d_i4_1/Conv2D EXECUTED layerType: Convolution realTime: 31 cpu: 31 execType: jit_avx512_1x1_I8
conv2d_i4_2/Conv2D EXECUTED layerType: Convolution realTime: 55 cpu: 55 execType: jit_avx512_I8
conv2d_i4_3/Conv2D EXECUTED layerType: Convolution realTime: 35 cpu: 35 execType: jit_avx512_1x1_I8
conv2d_i5_1/Conv2D EXECUTED layerType: Convolution realTime: 31 cpu: 31 execType: jit_avx512_1x1_I8
conv2d_i5_2/Conv2D EXECUTED layerType: Convolution realTime: 53 cpu: 53 execType: jit_avx512_I8
conv2d_i5_3/Conv2D EXECUTED layerType: Convolution realTime: 36 cpu: 36 execType: jit_avx512_1x1_I8
conv2d_i6_1/Conv2D EXECUTED layerType: Convolution realTime: 43 cpu: 43 execType: jit_avx512_1x1_I8
conv2d_i6_2/Conv2D EXECUTED layerType: Convolution realTime: 69 cpu: 69 execType: jit_avx512_I8
conv2d_i6_3/Conv2D EXECUTED layerType: Convolution realTime: 40 cpu: 40 execType: jit_avx512_1x1_I8
conv2d_i7_1/Conv2D EXECUTED layerType: Convolution realTime: 45 cpu: 45 execType: jit_avx512_1x1_I8
conv2d_i7_2/Conv2D EXECUTED layerType: Convolution realTime: 67 cpu: 67 execType: jit_avx512_I8
conv2d_i7_3/Conv2D EXECUTED layerType: Convolution realTime: 37 cpu: 37 execType: jit_avx512_1x1_I8
conv2d_i8_1/Conv2D EXECUTED layerType: Convolution realTime: 43 cpu: 43 execType: jit_avx512_1x1_I8
conv2d_i8_2/Conv2D EXECUTED layerType: Convolution realTime: 70 cpu: 70 execType: jit_avx512_I8
conv2d_i8_3/Conv2D EXECUTED layerType: Convolution realTime: 39 cpu: 39 execType: jit_avx512_1x1_I8
conv2d_i9_1/Conv2D EXECUTED layerType: Convolution realTime: 46 cpu: 46 execType: jit_avx512_1x1_I8
conv2d_i9_2/Conv2D EXECUTED layerType: Convolution realTime: 68 cpu: 68 execType: jit_avx512_I8
conv2d_i9_3/Conv2D EXECUTED layerType: Convolution realTime: 39 cpu: 39 execType: jit_avx512_1x1_I8
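The exec_type values ending in _I8 show those convolutions ran with the INT8 JIT kernels. For anyone who wants the same per-layer numbers without the sample's -pc flag, here is a hedged sketch using get_perf_counts() from the Inference Engine Python API; the model file names and dummy input shape are placeholders.

```python
# Sketch: retrieve per-layer performance counters programmatically,
# equivalent to the -pc output above. File names are placeholders.
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

net = IENetwork(model="resnet50_i8.xml", weights="resnet50_i8.bin")
plugin = IEPlugin(device="CPU")
exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))
exec_net.infer(inputs={input_blob: np.zeros((1, 3, 224, 224), dtype=np.float32)})

perf = exec_net.requests[0].get_perf_counts()
for layer, stats in perf.items():
    # an exec_type ending in "_I8" means the layer executed with an INT8 kernel
    print(f"{layer:40s} {stats['status']:12s} {stats['exec_type']:22s} "
          f"{stats['real_time']} us")
```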