Hi,
We have quantized our trained model in a way that it has both INT8 and FP16 weights.
For inference time, we are using OpenVINO's Inference engine to load the model from memory (not from a file path) using the following method:
CNNNetwork ReadNetwork(const std::string& model, const Blob::CPtr& weights) const;
Now, this method requires a weights Blob, so we create an InferenceEngine::TensorDesc describing the weights and pass it to the InferenceEngine::make_shared_blob method. The problem here is that TensorDesc does not support InferenceEngine::Precision::MIXED.
Here is the error message that is thrown when executing the program:
Cannot make shared blob! The blob type cannot be used to store objects of current precision
So, how should we proceed in order to read a network where the weights are mixed-precision?
Here is a code snippet of how we are loading the model. Note that poBin refers to the binary content of the weights in memory.
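Roughly, the loading code has the following shape (a simplified sketch rather than the exact code; poXml, poBinSize, and loadFromMemory are placeholder names used here for illustration):

#include <inference_engine.hpp>
#include <cstring>

InferenceEngine::CNNNetwork loadFromMemory(const std::string& poXml,
                                           const uint8_t* poBin,
                                           size_t poBinSize)
{
    InferenceEngine::Core core;

    // This TensorDesc is where the precision question arises: describing the raw
    // weight bytes as a flat U8 buffer works, but Precision::MIXED is rejected by
    // make_shared_blob with the error quoted above.
    InferenceEngine::TensorDesc desc(InferenceEngine::Precision::U8,
                                     {poBinSize},
                                     InferenceEngine::Layout::C);

    auto weightsBlob = InferenceEngine::make_shared_blob<uint8_t>(desc);
    weightsBlob->allocate();
    std::memcpy(weightsBlob->buffer().as<uint8_t*>(), poBin, poBinSize);

    return core.ReadNetwork(poXml, weightsBlob);
}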
Thank you in advance
Hello Svutran.
Thank you for reaching out to us.
Please share your mixed-precision model, your script, and any other relevant information with us for further investigation.
Also, which OpenVINO version did you use to run the model?
Sincerely,
Zulkifli
Sorry for the late reply, but putting my code here would be a little complicated.
Actually, my question is: if we are working in FP16, or in mixed precision that includes FP16, do we need to set the input and output precision with a call like the following:
InferenceEngine::InputsDataMap inputs_info = network.getInputsInfo();
for (auto& input : inputs_info)
    input.second->setPrecision(InferenceEngine::Precision::FP16);
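And the output side would presumably look analogous (a sketch, assuming the same network object):
InferenceEngine::OutputsDataMap outputs_info = network.getOutputsInfo();
for (auto& output : outputs_info)
    output.second->setPrecision(InferenceEngine::Precision::FP16);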
Thanks
Hello Svutran,
If you are working with FP16 or mixed precision, in particular an INT8-FP16 mixed-precision network, the input precision can be set as below:
for (auto& input : inputs_info)
    input.second->setPrecision(InferenceEngine::Precision::FP16);
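Once the precisions are set, the network is loaded onto the target device as usual. A short sketch, assuming core and network are the Core and CNNNetwork objects from your ReadNetwork call, and with the device name as a placeholder for your actual hardware:
InferenceEngine::ExecutableNetwork exec_network = core.LoadNetwork(network, "GPU");
InferenceEngine::InferRequest infer_request = exec_network.CreateInferRequest();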
Sincerely,
Zulkifli
Hello Svutran,
Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.
Sincerely,
Zulkifli