I followed the documentation ( https://docs.openvinotoolkit.org/latest/_inference_engine_samples_benchmark_app_README.html ) to understand the steps.
I ran both the Model Optimizer and then the benchmark app:
python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
./benchmark_app -m <ir_dir>/googlenet-v1.xml -d CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true
Then I ran the Model Optimizer with --data_type FP16 and ran the benchmark again. The CPU is a Xeon W-2133.
My question is that the benchmark reports the same performance for FP32 and FP16. Why? I expected FP16 to be at least slightly faster, but it is not. So what is the optimizer doing here, and why is it useful?
I apologize for the late reply.
The CPU plugin now accepts FP16 models, but it internally upconverts them to FP32 for execution. So when you compare FP16 and FP32 performance on the CPU, you get the same result, because both are actually executed in FP32.
You would see a difference in performance if you ran the same comparison on a GPU or a VPU, since FP16 is the preferred precision for GPU/VPU targets.
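To illustrate the point with a minimal sketch (this is plain NumPy, not OpenVINO internals, and the "weights" array is hypothetical): an FP16 IR halves the storage of the weights, but if the execution path upconverts them to FP32 before computing, the arithmetic is identical to the FP32 model, so the throughput cannot improve.

```python
import numpy as np

# Hypothetical weights standing in for a model's parameters.
w_fp32 = np.random.rand(256, 256).astype(np.float32)

# An FP16 IR stores the same weights at half the size...
w_fp16 = w_fp32.astype(np.float16)
assert w_fp16.nbytes == w_fp32.nbytes // 2

# ...but a plugin that upconverts executes in FP32 anyway,
# so the actual math (and hence the speed) is unchanged.
w_restored = w_fp16.astype(np.float32)
x = np.random.rand(256, 1).astype(np.float32)
y_fp32 = w_fp32 @ x
y_from_fp16 = w_restored @ x

# The only difference is the small FP16 quantization error in the weights.
print(float(np.max(np.abs(y_fp32 - y_from_fp16))))
```

The printed value is tiny relative to the outputs, which is why accuracy is essentially preserved even though the CPU sees no speedup.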
I hope this is helpful!