Intel® Fortran Compiler

Performance on P-core vs E-core



I have a long history of compiling and running a large Fortran program. It calculates the kind of thing where, if some intermediate result is 0.9999, the overall result looks one way, and if the same intermediate result is 1.0001, the overall result looks completely different. This is completely expected and true to what we're trying to model, so don't despair. A little explanation goes a long way with our users.

We've tried to pick optimization flags (AVX2, etc.) and other settings that make the results repeatable on different cores. A lot of the time we succeed.

We have a simple job manager that submits jobs (runs of the Fortran executable) in batch on all the available CPUs on a laptop. (We count a hyper-threaded core as two cores: we want to use all of them, and we want all the jobs to finish at the same time.)
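
A job manager of this kind might look roughly like the following Python sketch. It is an illustration only: `run_one`, `run_batch`, and the commands passed to them are hypothetical names, not the actual tool. It does not pin jobs to particular cores, so the operating system decides where each process runs.

```python
import os
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one job and return (return code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command, stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
    return completed.returncode, time.perf_counter() - start


def run_batch(commands, workers=None):
    """Run jobs concurrently, at most `workers` at a time.

    Defaults to the number of logical processors, so a hyper-threaded
    core counts as two.
    """
    if workers is None:
        workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as `commands`.
        return list(pool.map(run_one, commands))
```

Recording the wall-clock time per job makes it easy to see afterwards whether identical jobs really did finish at about the same time.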

Some time ago I found out that there are laptops with an i7-1265U, which has a few P-cores and a few E-cores.


No, really, what.

What can I expect from my job manager submitting jobs? Will a job take a different amount of time depending on whether it lands on a P-core or an E-core?

Should I expect that the same job on a P-core versus an E-core on the same laptop will produce different results? (See terror stories about 0.9999 vs 1.0001 above).

1 Reply

>>Should I expect that the same job on a P-core versus an E-core on the same laptop will produce different results? 

Same code, same data (single threaded) should produce the same results, provided the code isn't sensitive to system calls whose results vary with time. For example, code that tunes itself based on the number of clock ticks its first pass took would be sensitive in this way.
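
One way to check this empirically is to run the same job twice and compare the output files byte for byte. The sketch below is a minimal illustration, assuming each job writes its results to a file; `file_digest` and `outputs_identical` are hypothetical helper names.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_identical(path_a, path_b):
    """True if the two output files are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)
```

Note that output containing timestamps or run-specific paths will differ even when the numerical results agree, so such fields would need to be excluded before comparing.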

Multi-threaded programs (single process, multiple threads) are often nondeterministic with respect to which thread gets which work, and how much of it.


*** However ***

P-cores support AVX512 (on CPUs with AVX512), whereas E-cores, from my understanding, do not.

If this is the case, compile your programs to target AVX2 alone.

The difference between the results when running AVX512 and AVX2 can come from the two different code paths, as well as from different instructions being supported (and taken advantage of).
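
To see why a different vector width alone can change a result, here is a toy Python model of a vectorized sum. It is only a sketch: `lane_sum` is a hypothetical function, and a real compiler may combine the lanes in a different order. The point is that floating-point addition is not associative, so splitting the same data across 4 lanes versus 8 lanes changes the order of the additions and therefore the rounding.

```python
def lane_sum(values, width):
    """Sum `values` the way a `width`-lane vector loop might.

    Lane i accumulates values[i], values[i + width], ...; the lanes are
    then added together left to right.
    """
    lanes = [0.0] * width
    for index, value in enumerate(values):
        lanes[index % width] += value
    total = 0.0
    for lane in lanes:  # explicit loop: plain left-to-right float addition
        total += lane
    return total


data = [1e16, 0.5, 0.5, 0.5, -1e16, 0.5, 0.5, 0.5]
print(lane_sum(data, 4))  # 4 lanes, like 4 doubles in a 256-bit register: 3.0
print(lane_sum(data, 8))  # 8 lanes, like 8 doubles in a 512-bit register: 1.5
```

The exact sum is 3.0. With 8 lanes, the 0.5 values added directly to 1e16 are lost to rounding, which is the kind of difference that can push an intermediate result from one side of a threshold to the other.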


Jim Dempsey
