<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic FP16 overflow in Intel® Distribution of OpenVINO™ Toolkit</title>
    <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151350#M12494</link>
    <description>&lt;P&gt;A Caffe GoogleNet model converted to FP16 with the Model Optimizer shows overflow in some layers during inference; the replies explain why and recommend re-training with scaled input values.&lt;/P&gt;</description>
    <pubDate>Thu, 14 Feb 2019 11:32:16 GMT</pubDate>
    <dc:creator>Kiyoshi_F_Intel</dc:creator>
    <dc:date>2019-02-14T11:32:16Z</dc:date>
    <item>
      <title>FP16 overflow</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151348#M12492</link>
      <description>&lt;P&gt;Hello, everyone.&lt;/P&gt;&lt;P&gt;My customer is now developing a product using OpenVINO R4 with his own dataset.&lt;/P&gt;&lt;P&gt;During training in Caffe, he used no scale value and no mean value.&lt;/P&gt;&lt;P&gt;He converted the trained model from FP32 to FP16 using the mo.py script.&lt;/P&gt;&lt;P&gt;However, he encountered a lot of inference errors when using FP16.&lt;/P&gt;&lt;P&gt;He used the cross-check tool included in the OpenVINO package to compare the results between FP32 and FP16, and found overflow in some layers in FP16 mode.&lt;/P&gt;&lt;P&gt;The overflow is not surprising, because the FP16 range is much smaller than that of FP32.&lt;/P&gt;&lt;P&gt;In this case, could you let me know what my customer has to do?&lt;/P&gt;
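&lt;P&gt;For reference, the conversion step was roughly the following mo.py invocation (the file names here are placeholders, not his actual paths):&lt;/P&gt;&lt;PRE&gt;# Convert the Caffe model to an FP16 IR with the Model Optimizer
python3 mo.py --input_model googlenet.caffemodel \
              --input_proto deploy.prototxt \
              --data_type FP16 \
              --output_dir ./ir_fp16&lt;/PRE&gt;</description>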
      <pubDate>Tue, 12 Feb 2019 11:03:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151348#M12492</guid>
      <dc:creator>Kiyoshi_F_Intel</dc:creator>
      <dc:date>2019-02-12T11:03:38Z</dc:date>
    </item>
    <item>
      <title>Re: FP16 overflow</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151349#M12493</link>
      <description>&lt;P&gt;&amp;gt;&amp;nbsp;errors when using FP16.&lt;/P&gt;&lt;P&gt;Assuming FP32 is on the CPU, may I ask what inference device is being used for FP16? Also, what topology?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Feb 2019 17:29:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151349#M12493</guid>
      <dc:creator>nikos1</dc:creator>
      <dc:date>2019-02-13T17:29:06Z</dc:date>
    </item>
    <item>
      <title>FP32 inference is on GPU.</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151350#M12494</link>
      <description>&lt;P&gt;FP32 inference is on the GPU.&lt;/P&gt;&lt;P&gt;The FP16 inference device is an Arria 10 on the Mustang-F100-A10.&lt;/P&gt;&lt;P&gt;The topology is GoogleNet.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 11:32:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151350#M12494</guid>
      <dc:creator>Kiyoshi_F_Intel</dc:creator>
      <dc:date>2019-02-14T11:32:16Z</dc:date>
    </item>
    <item>
      <title>Re: FP16 overflow</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151351#M12495</link>
      <description>&lt;P&gt;When MO (Model Optimizer) converts the weights of a model from FP32 to FP16, it checks for maximum-value overflow (in fact, MO uses the numpy function &lt;A href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.astype.html"&gt;astype&lt;/A&gt;, which performs the value conversion).&lt;/P&gt;&lt;P&gt;If a value overflow occurs, the following error is printed (however, the IR is still generated):&lt;/P&gt;&lt;P&gt;[ ERROR ]&amp;nbsp; 83 elements of 189 were clipped to infinity while converting a blob for node [['conv2d_transpose']] to &amp;lt;class 'numpy.float16'&amp;gt;.&lt;/P&gt;&lt;P&gt;But MO &lt;STRONG&gt;cannot&lt;/STRONG&gt; guarantee that overflow will not occur &lt;STRONG&gt;during inference&lt;/STRONG&gt;. For example, you can create a network that sums two values: even though each of them is below the float16 maximum, their sum can exceed the limit.&lt;/P&gt;&lt;P&gt;It is not possible to normalize the weight values before converting, because that would significantly degrade prediction results (or, most probably, completely break the topology), so there is no such feature in MO.&lt;/P&gt;&lt;P&gt;The recommendation to the customer would be to re-train the model with input values scaled to, for example, the [0, 1] or [-1, 1] range.&lt;/P&gt;
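&lt;P&gt;Here is a minimal numpy sketch of both failure modes (the values are made up for illustration): conversion-time clipping in astype, and an inference-time sum that overflows even though each operand fits in FP16.&lt;/P&gt;&lt;PRE&gt;import numpy as np

# FP16 can represent magnitudes only up to ~65504.
print(np.finfo(np.float16).max)          # 65504.0

# Conversion-time overflow: astype clips out-of-range FP32 values to inf.
# This is the case MO detects and reports while generating the IR.
w = np.array([70000.0, 1.0], dtype=np.float32)
print(w.astype(np.float16))              # [inf  1.]

# Inference-time overflow: each operand fits in FP16, but their sum does not.
# MO cannot detect this, because it depends on runtime data.
a = np.float16(40000.0)
b = np.float16(30000.0)
print(a + b)                             # inf (numpy warns about overflow)

# Inputs scaled to [0, 1] stay comfortably inside the FP16 range.
x = np.arange(256, dtype=np.float32) / 255.0
print(x.astype(np.float16).max())        # 1.0&lt;/PRE&gt;&lt;P&gt;Scaling the inputs (in Caffe, typically via a scale and, if needed, a mean_value in the data layer's transform_param) keeps intermediate activations proportionally smaller, which is why re-training with scaled inputs helps.&lt;/P&gt;</description>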
      <pubDate>Thu, 14 Feb 2019 15:38:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151351#M12495</guid>
      <dc:creator>Shubha_R_Intel</dc:creator>
      <dc:date>2019-02-14T15:38:00Z</dc:date>
    </item>
    <item>
      <title>Hello, Shubha</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151352#M12496</link>
      <description>&lt;P&gt;Hello, Shubha,&lt;/P&gt;&lt;P&gt;Thank you so much for your answer. It was helpful, and I have reported it to my customer.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Feb 2019 07:40:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151352#M12496</guid>
      <dc:creator>Kiyoshi_F_Intel</dc:creator>
      <dc:date>2019-02-18T07:40:59Z</dc:date>
    </item>
    <item>
      <title>Hello Shubha, please specify</title>
      <link>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151353#M12497</link>
      <description>&lt;P&gt;Hello Shubha, please specify how to "re-train the model with scaled input values".&lt;/P&gt;</description>
      <pubDate>Mon, 08 Apr 2019 21:12:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/FP16-overflow/m-p/1151353#M12497</guid>
      <dc:creator>Leini__Mikk</dc:creator>
      <dc:date>2019-04-08T21:12:20Z</dc:date>
    </item>
  </channel>
</rss>

