Hi,
If I "downscale" my FP32 model with the
--data_type = FP16
option, according to the docs the model weights and biases for "intermediate" layers are quantized to FP16. I presume "intermediate" here refers to all the layers of the model other than the input and output layers which I presume will stay at FP32.
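For context, the conversion command I am running looks roughly like this (the model file name and output directory are just placeholders):
    python3 mo.py --input_model my_model.pb --data_type FP16 --output_dir ./ir_fp16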
So, is my FP32 input (and the output of each layer) automatically downcast to FP16 as it passes through the network, and then cast back up to FP32 at the output layer?
How does specifying the data type of the input and output layers in code, e.g.
auto network = network_reader.getNetwork();

/** Taking information about all topology inputs **/
InferenceEngine::InputsDataMap input_info(network.getInputsInfo());

/** Taking information about all topology outputs **/
InferenceEngine::OutputsDataMap output_info(network.getOutputsInfo());

/** Iterating over all input info **/
for (auto &item : input_info) {
    auto input_data = item.second;
    input_data->setPrecision(Precision::FP16);
}
interact with this pipeline? In that case, is there no automatic conversion of the input layer?
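For completeness, I assume the analogous call on the output side would look something like this (just my guess at keeping the outputs in FP32, not something taken from the docs):

/** Presumably the outputs can be handled the same way, e.g. left at FP32 **/
for (auto &item : output_info) {
    auto output_data = item.second;
    output_data->setPrecision(Precision::FP32);
}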
Thanks.