Hi!
I'm trying to quantize the FaceMesh model with the POT tool using the following config (based on the default config example):
{
    /* Model parameters */
    "model": {
        "model_name": "facemesh",   // Model name
        "model": "./facemesh.xml",  // Path to model (.xml format)
        "weights": "./facemesh.bin" // Path to weights (.bin format)
    },

    /* Parameters of the engine used for model inference */
    "engine": {
        /* Simplified mode */
        "type": "simplified",
        "data_source": "./data"
    },

    /* Optimization hyperparameters */
    "compression": {
        "target_device": "CPU",
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300,
                    "shuffle_data": false
                }
            }
        ]
    }
}
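For context, the quantization itself was run with the standard POT command line, roughly as follows (the config file name is just a placeholder for the JSON above):

pot -c facemesh_config.json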
The quantized model becomes ~4 times smaller; however, its inference time increases by ~37%.
Unquantized model benchmark log:
[Step 1/11] Parsing and validating input arguments
/opt/intel/openvino_2020.4.287/python/python3.6/openvino/tools/benchmark/main.py:29: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(" -nstreams default value is determined automatically for a device. "
[ WARNING ] -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
API version............. 2.1.2020.4.0-359-21e092122f4-releases/2020/4
[ INFO ] Device info
CPU
MKLDNNPlugin............ version 2.1
Build................... 2020.4.0-359-21e092122f4-releases/2020/4
[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading the Intermediate Representation network
[ INFO ] Read network took 31.38 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 199.60 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'image' precision U8, dimensions (NCHW): 1 3 192 192
/opt/intel/openvino_2020.4.287/python/python3.6/openvino/tools/benchmark/utils/inputs_filling.py:71: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn("No input files were given: all inputs will be filled with random values!")
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asyncronously, 4 inference requests using 4 streams for CPU, limits: 60000 ms duration)
[Step 11/11] Dumping statistics report
Count: 64424 iterations
Duration: 60006.06 ms
Latency: 3.60 ms
Throughput: 1073.62 FPS
Quantized model benchmark log:
[Step 1/11] Parsing and validating input arguments
/opt/intel/openvino_2020.4.287/python/python3.6/openvino/tools/benchmark/main.py:29: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(" -nstreams default value is determined automatically for a device. "
[ WARNING ] -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
API version............. 2.1.2020.4.0-359-21e092122f4-releases/2020/4
[ INFO ] Device info
CPU
MKLDNNPlugin............ version 2.1
Build................... 2020.4.0-359-21e092122f4-releases/2020/4
[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading the Intermediate Representation network
[ INFO ] Read network took 67.49 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 294.29 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'image' precision U8, dimensions (NCHW): 1 3 192 192
/opt/intel/openvino_2020.4.287/python/python3.6/openvino/tools/benchmark/utils/inputs_filling.py:71: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn("No input files were given: all inputs will be filled with random values!")
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'image' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asyncronously, 4 inference requests using 4 streams for CPU, limits: 60000 ms duration)
[Step 11/11] Dumping statistics report
Count: 48160 iterations
Duration: 60007.22 ms
Latency: 4.93 ms
Throughput: 802.57 FPS
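Both logs above come from the stock benchmark tool with default settings (async mode, automatic stream count, 60 s run), invoked roughly like this (the model path is a placeholder):

python3 benchmark_app.py -m facemesh.xml -d CPU

Comparing the two runs: latency goes from 3.60 ms to 4.93 ms (4.93 / 3.60 ≈ 1.37, i.e. ~37% slower), and throughput drops from 1073.62 FPS to 802.57 FPS (802.57 / 1073.62 ≈ 0.75, i.e. ~25% lower).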
Could you please check whether this is the expected result for such a model?
BR,
Alexey.
Hi Alexey,
Thanks for reaching out.
I tested your XML files for both the quantized and unquantized models, and I am getting the same results as you.
OpenVINO quantization performance depends on the specific libraries and devices involved. The slowdown is most likely due to layers in your model that are not supported in 8-bit integer computation mode.
You can refer here for more details: https://github.com/intel/webml-polyfill/issues/1239
Also, please check the topologies that have been validated for the 8-bit inference feature here.
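One way to check which layers fall back to FP32 (a suggestion only, not something verified on this particular model) is to run the benchmark tool with per-layer performance counters and look at the execution type reported for each layer, for example (the quantized model path is a placeholder):

python3 benchmark_app.py -m facemesh_int8.xml -d CPU -pc

Layers whose reported execution type is still an FP32 primitive are the ones that could not be executed in 8-bit mode; the exact execution-type names depend on the CPU plugin build.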
Regards,
Aznie
Hi!
I'm having the same issue with exactly the same config file.
Waiting for an answer from Intel.
Hi Alexey,
This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.
Best Regards,
Aznie