Solved: Slow fully connected layer performance

sbsky · ‎12-21-2020

I'm attempting to convert a network built from linear/dense/fully connected layers to OpenVINO. The conversion works, but running the model with OpenVINO is no faster than running it in PyTorch. This seems strange to me, because for CNNs I've found OpenVINO to be 10-15x faster than PyTorch.

I did some profiling, and it seems that OpenVINO isn't using AVX. This is an excerpt from the profiling log:

Add_2262 EXECUTED layerType: FullyConnected realTime: 8992 cpu: 8992 execType: jit_gemm_FP32
Mul_2283 EXECUTED layerType: Eltwise realTime: 128 cpu: 128 execType: ref_FP32
Add_2284 EXECUTED layerType: Eltwise realTime: 158 cpu: 158 execType: ref_FP32
Total time: 789103 microseconds
Total CPU time: 789103 microseconds
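
To see where the time goes in a log like this, it can help to total the per-layer times by execType. The following is a minimal sketch, assuming every per-layer line has the same field layout as the excerpt above and that realTime is reported in microseconds (as the "Total time" line suggests). The function name and the regular expression are illustrative and are not part of OpenVINO.

```python
import re
from collections import defaultdict

# Field layout assumed from the log excerpt in this post only.
LINE_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<status>\S+)\s+layerType:\s*(?P<layer_type>\S+)\s+"
    r"realTime:\s*(?P<real_time>\d+)\s+cpu:\s*(?P<cpu_time>\d+)\s+"
    r"execType:\s*(?P<exec_type>\S+)\s*$"
)


def summarize_by_exec_type(log_text):
    """Sum realTime per execType; lines that do not match are skipped."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "Total time" summary lines
        totals[match.group("exec_type")] += int(match.group("real_time"))
    return dict(totals)


SAMPLE = """\
Add_2262 EXECUTED layerType: FullyConnected realTime: 8992 cpu: 8992 execType: jit_gemm_FP32
Mul_2283 EXECUTED layerType: Eltwise realTime: 128 cpu: 128 execType: ref_FP32
Add_2284 EXECUTED layerType: Eltwise realTime: 158 cpu: 158 execType: ref_FP32
Total time: 789103 microseconds
"""

if __name__ == "__main__":
    totals = summarize_by_exec_type(SAMPLE)
    # Largest contributors first.
    for exec_type, micros in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{exec_type}: {micros}")
```

Run over the full log rather than the three-line excerpt, a summary like this would show whether the ref_FP32 layers or the jit_gemm_FP32 layers account for most of the total.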

How do I enable the AVX operators for linear layers?

Adli · ‎12-22-2020

Hi sbsky,

Thank you for reaching out to us. There are functions for checking which AVX instruction sets the CPU supports. Please refer to the following link:

https://docs.openvinotoolkit.org/latest/ie_plugin_api/group__ie__dev__api__system__conf.html

If the CPU supports AVX instructions, they are used automatically wherever a given layer allows it; there is no need to enable them.
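
As a quick check outside OpenVINO, the CPU feature flags can also be read from the operating system. The sketch below assumes a Linux x86 machine, where /proc/cpuinfo has a "flags" line listing names such as avx, avx2 and avx512f. It does not call the OpenVINO functions from the link above, and the helper names are illustrative.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx_support(cpuinfo_text):
    """Report which of a few SIMD instruction sets appear in the flags."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("sse4_2", "avx", "avx2", "avx512f")}


if __name__ == "__main__":
    try:
        # Assumption: Linux; this file does not exist on Windows or macOS.
        with open("/proc/cpuinfo") as handle:
            print(avx_support(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

This only tells you what the CPU offers. Whether a particular layer actually runs a vectorized kernel is still best read from the execType column of the profiling log.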

Please also refer to the CPU optimization topic at the following link: https://docs.openvinotoolkit.org/latest/openvino_docs_optimization_guide_dldt_optimization_guide.html#cpu-checklist

In addition, you can check the approximate performance values for certain topologies, including ones converted from PyTorch to IR. Please refer to the following link: https://docs.openvinotoolkit.org/downloads/benchmark_files/OV-2021.2-Download-Excel.xlsx

Regards,

Adli

sbsky · ‎12-23-2020

Hi Adli, the performance values were useful, thank you. It would be great if they were displayed more prominently in the documentation.

Adli · ‎12-23-2020

Hi sbsky,

This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.

Regards,

Adli