I quantized a model with the following parameters:
"compression": {
"target_device": "CPU",
"algorithms": [
{
"name": "AccuracyAwareQuantization",
"params": {
"metric_subset_ratio": 1,
"ranking_subset_size": 300,
"max_iter_num": 500,
"maximal_drop": 0.01,
"drop_type": "relative",
"base_algorithm": "DefaultQuantization",
"use_prev_if_drop_increase": true,
"range_estimator": {
"preset": "default"
}
}
}
]
}
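For completeness, this "compression" section is only part of the full POT configuration; the config also has "model" and "engine" sections. A minimal sketch of how the fragment above fits into a complete config file (the model paths and Accuracy Checker config name here are placeholders, not my actual files):

```python
import json

# Sketch of a full POT config around the "compression" section above.
# The "model" and "engine" values are hypothetical placeholders; the engine
# points at an Accuracy Checker config, which matches the ac_engine lines
# in the log below.
pot_config = {
    "model": {
        "model_name": "model",             # placeholder
        "model": "model.xml",              # placeholder IR path
        "weights": "model.bin",            # placeholder IR path
    },
    "engine": {
        "config": "accuracy_checker.yml",  # placeholder AC config path
    },
    "compression": {
        "target_device": "CPU",
        "algorithms": [
            {
                "name": "AccuracyAwareQuantization",
                "params": {
                    "metric_subset_ratio": 1,
                    "ranking_subset_size": 300,
                    "max_iter_num": 500,
                    "maximal_drop": 0.01,
                    "drop_type": "relative",
                    "base_algorithm": "DefaultQuantization",
                    "use_prev_if_drop_increase": True,
                    "range_estimator": {"preset": "default"},
                },
            }
        ],
    },
}

with open("pot_config.json", "w") as f:
    json.dump(pot_config, f, indent=4)
# The file is then passed to the POT CLI: pot -c pot_config.json
```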
The quantized model .bin works fine, but during inference it uses at most one core, no matter how many cores are visible and free.
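For reference, this is the kind of minimal synchronous loop where the single-core behaviour shows up (a sketch with placeholder paths and dummy input; CPU_THREADS_NUM is the documented CPU plugin key for the thread-pool size):

```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
# "0" asks the CPU plugin to size the thread pool to all available cores
# (this should already be the default).
ie.set_config({"CPU_THREADS_NUM": "0"}, "CPU")

net = ie.read_network(model="model.xml", weights="model.bin")  # placeholder paths
input_name = next(iter(net.input_info))
exec_net = ie.load_network(network=net, device_name="CPU")

# Dummy input matching the network input shape, just to drive the loop.
shape = net.input_info[input_name].input_data.shape
dummy = np.zeros(shape, dtype=np.float32)

for _ in range(100):
    exec_net.infer({input_name: dummy})  # CPU utilisation stays at ~1 core
```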
IE version: 2.1.2020.4.0-359-21e092122f4-releases/2020/4
Loaded CPU plugin version:
CPU - MKLDNNPlugin: 2.1.2020.4.0-359-21e092122f4-releases/2020/4
INFO:compression.statistics.collector:Start computing statistics for algorithms : AccuracyAwareQuantization
INFO:compression.statistics.collector:Computing statistics finished
INFO:compression.pipeline.pipeline:Start algorithm: AccuracyAwareQuantization
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Start original model inference
INFO:compression.engines.ac_engine:Start inference of 5642 images
Total dataset size: 5642
1000 / 5642 processed in 64.319s
2000 / 5642 processed in 63.766s
3000 / 5642 processed in 64.391s
4000 / 5642 processed in 66.509s
5000 / 5642 processed in 64.553s
5642 objects processed in 364.530 seconds
INFO:compression.engines.ac_engine:Inference finished
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Baseline metrics: {'map': 0.45369710716845546}
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Start quantization
INFO:compression.statistics.collector:Start computing statistics for algorithms : ActivationChannelAlignment
INFO:compression.statistics.collector:Computing statistics finished
INFO:compression.statistics.collector:Start computing statistics for algorithms : MinMaxQuantization,FastBiasCorrection
INFO:compression.statistics.collector:Computing statistics finished
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Start compressed model inference
INFO:compression.engines.ac_engine:Start inference of 5642 images
Total dataset size: 5642
1000 / 5642 processed in 845.572s
2000 / 5642 processed in 843.301s
3000 / 5642 processed in 843.223s
4000 / 5642 processed in 843.403s
5000 / 5642 processed in 843.912s
5642 objects processed in 4761.327 seconds
INFO:compression.engines.ac_engine:Inference finished
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Fully quantized metrics: {'map': 0.4520465728234177}
INFO:compression.algorithms.quantization.accuracy_aware.algorithm:Accuracy drop: {'map': 0.0016505343450377574}
INFO:compression.pipeline.pipeline:Finished: AccuracyAwareQuantization
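A side note on the accuracy numbers, since they confirm the quantized model itself is fine: as I read drop_type "relative", the tolerated drop scales with the baseline metric, so this run finishes well inside its budget after a single pass. A quick check against the values above (my reading of the criterion, not POT's actual code):

```python
baseline = 0.45369710716845546   # baseline mAP from the log above
quantized = 0.4520465728234177   # fully quantized mAP from the log above

absolute_drop = baseline - quantized   # ~0.0016505, matches the logged "Accuracy drop"
relative_budget = 0.01 * baseline      # maximal_drop=0.01 with drop_type="relative"

# ~0.00165 < ~0.00454, so the first pass is already within budget and no
# extra re-quantization iterations are needed.
assert absolute_drop < relative_budget
```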
===========================================================================
Are there any solutions to this?
Thanks
===========================================================================
This topic seems to be a duplicate of https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/Model-inference-time-increases-drastically-after-quantization/m-p/1193741
===========================================================================
Yes, this post was rejected at first, so I created the other one; when this one came back up, I couldn't delete it. Sorry about that.
But that thread doesn't have a solution either.