Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision-related on Intel® platforms.

CPU inference, comparing old and new cpu

brian2
Beginner
994 Views

Hi Forum

 

I'm testing a production solution for inference on the edge.

(We have some simple CNN models which have been trained using TF2 and converted to OpenVINO via the Model Optimizer, mo.)

Everything works, and has for a long time, but my assumption that a newer Intel CPU would be faster has not held up.

 

On developer machine : i7-11700 inference result is 7ms.

On production machine : i5-13600K inference result is 9ms.

 

Results are mean results from many iterations of

request.set_input_tensor( wrapMat2Tensor( M ) );
request.start_async();
request.wait();
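A mean on its own can hide warm-up effects and outliers, so when comparing two machines it may help to record every iteration and look at the median, minimum and maximum as well. The following is a minimal sketch using only the C++ standard library; `LatencyStats`, `summarize` and `measure_latency` are hypothetical names, not part of OpenVINO, and the warm-up and iteration counts are arbitrary choices.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Summary of per-iteration latencies, in milliseconds.
struct LatencyStats {
    double mean_ms;
    double median_ms;
    double min_ms;
    double max_ms;
};

// Reduce a non-empty list of samples to mean / median / min / max.
inline LatencyStats summarize(std::vector<double> samples_ms) {
    assert(!samples_ms.empty());
    std::sort(samples_ms.begin(), samples_ms.end());
    const double sum = std::accumulate(samples_ms.begin(), samples_ms.end(), 0.0);
    const std::size_t n = samples_ms.size();
    const double median = (n % 2 == 1)
        ? samples_ms[n / 2]
        : 0.5 * (samples_ms[n / 2 - 1] + samples_ms[n / 2]);
    return {sum / static_cast<double>(n), median, samples_ms.front(), samples_ms.back()};
}

// Call `run_once` `warmup` times untimed, then time `iterations` further calls.
template <typename F>
LatencyStats measure_latency(F&& run_once, std::size_t warmup, std::size_t iterations) {
    assert(iterations > 0);
    for (std::size_t i = 0; i < warmup; ++i) run_once();

    std::vector<double> samples_ms;
    samples_ms.reserve(iterations);
    for (std::size_t i = 0; i < iterations; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        run_once();
        const auto t1 = std::chrono::steady_clock::now();
        samples_ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return summarize(std::move(samples_ms));
}
```

To use it with the loop above, the three request calls would go inside a lambda passed as `run_once`, for example with 10 warm-up rounds and 1000 timed rounds. If the median sits well below the mean on one machine only, that points at occasional slow iterations rather than a uniformly slower core.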

 

On both machines the

/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

is set to performance.
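The path above only covers cpu0. On Linux every logical CPU has its own cpufreq entry, and on a hybrid part such as the i5-13600K it may be worth confirming that the P-cores and E-cores all report the same governor. A minimal sketch, assuming the usual sysfs layout; `read_all_governors` and `all_governors_are` are hypothetical helper names, and the loop simply stops at the first CPU index whose governor file cannot be read.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Read the scaling governor of cpu0, cpu1, ... under `cpu_root`.
// Assumption: enumeration stops at the first index whose file cannot be read.
inline std::vector<std::string> read_all_governors(
        const std::string& cpu_root = "/sys/devices/system/cpu") {
    std::vector<std::string> governors;
    for (int cpu = 0;; ++cpu) {
        std::ifstream in(cpu_root + "/cpu" + std::to_string(cpu) +
                         "/cpufreq/scaling_governor");
        std::string name;
        if (!in || !(in >> name)) break;
        governors.push_back(name);
    }
    return governors;
}

// True only if at least one governor was read and every one equals `expected`.
inline bool all_governors_are(const std::vector<std::string>& governors,
                              const std::string& expected = "performance") {
    assert(!expected.empty());
    return !governors.empty() &&
           std::all_of(governors.begin(), governors.end(),
                       [&](const std::string& g) { return g == expected; });
}
```

Calling `all_governors_are(read_all_governors())` on each machine gives a quick yes/no answer before any inference timing is compared.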

 

I naively thought that I would get a few milliseconds off of the inference time. (newer core, more GHz, more cache, more more more .. ).

 

What can I do/test/verify/change?

- Is my 13th-gen i5 really slower for this kind of job than the older 11th-gen i7?

- Must I buy an i7 or i9 to see a real performance boost?

- Does OpenVINO (2023) use the performance cores by default? If not, could you direct me towards a C++ API that enables this?

- If I need the extra performance (AND YES I DO!), what hardware should I be looking at?

 

Hope to get a few pointers, Have a Good one!

/Brian

 

 

brian2
Beginner
980 Views

Regarding P-Cores and E-Cores:

I found the documentation at: https://docs.openvino.ai/2023.0/groupov_runtime_cpp_prop_api.html#detailed-documentation

Using

ov::CompiledModel compiled_model = core.compile_model(
    model, "CPU",
    ov::hint::performance_mode(ov::hint::PerformanceMode::LATENCY),
    ov::hint::scheduling_core_type(ov::hint::SchedulingCoreType::ECORE_ONLY));


I get 19ms,

and with

    ov::hint::scheduling_core_type(ov::hint::SchedulingCoreType::PCORE_ONLY));

I get the 9ms.

I.e. it is already using the P-cores in my default setup.

Aznie_Intel
Moderator
960 Views

Hi Brian2,

 

Thanks for reaching out.

 

Yes, you may set the specific core type based on your requirements. However, slow inference on an old CPU is expected given the system configuration and hardware specifications. Before upgrading your hardware, I would advise you to refer to the Intel® Distribution of OpenVINO™ toolkit Benchmark Results for inference performance on specific hardware configurations. It also contains a thorough explanation of the factors that affect IR model performance.

 

 

Regards,

Aznie


brian2
Beginner
954 Views
Hi Aznie,

Thanks for the reply.

Yes, I would also expect slow inference on an old CPU.

BUT my topic is the inverse.

I did not expect slower inference on a newer CPU (the 11th-gen i7 is faster than the 13th-gen i5).

I know that you cannot investigate my issue in detail without my system, BUT maybe you have some trick up your sleeve, or some pointers to help me locate why I see these results.

I am aiming at a simple production system with just a single CPU, but maybe I do need a top-of-the-line part for my purpose.

Aznie_Intel
Moderator
933 Views

Hi Brian2,

 

Sorry for the misunderstanding. Model Optimizer can produce an IR with different precisions. Which precision did you test?

Generally, deployment performance is measured with two key metrics: latency and throughput. You could try tuning for either one using OpenVINO performance hints, for example:

ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT)

Refer to the Performance Hints: Latency and Throughput documentation.

 

Hope this helps.



Regards,

Aznie


brian2
Beginner
918 Views
Hi Aznie
Yes, I have tried different settings and compared them on both machines. I have also tried different designs, running single inferences in sequence as well as constructing a set of requests and running them in parallel. The results are the same in every case.

I have tried several different build setups too: building from source and installing the runtime via apt.

Anyway, I have accepted your answer; it did point me to a resource which I had skipped.
Thanks, have a good one
/Brian
Aznie_Intel
Moderator
891 Views

Hi Brian2,


This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question. 



Regards,

Aznie

