Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Intel Fortran on Snapdragon chips

NC1
Beginner
1,187 Views

Hi all,

I mainly use Intel Fortran (I have a lot of legacy code that works fine with that compiler, and switching to another compiler seems like a massive risk). I run most of my codes on a cluster (mostly Xeon-type processors) and use my laptop (also an Intel chip) to run smaller codes and to debug (using VS Community + Intel oneAPI).

Now I am planning to buy a new laptop, and I have my eye on the new Surface laptops: https://www.microsoft.com/en-us/surface/devices/surface-laptop-7th-edition

Does anyone know whether the Intel Fortran compilers run well on Snapdragon chips, in particular for OpenMP-type codes?

Thank you.

5 Replies
Steve_Lionel
Honored Contributor III
1,134 Views

Intel compilers do not generate code for the ARM architecture. If you wish to use the Intel compiler, stick to x86, and for best results, Intel processors.

NC1
Beginner
1,064 Views

Thanks Steve. Are there any benchmarks comparing AMD and Intel processors? E.g. for a similar core count and clock speed, what are the differences in execution time using OpenMP?

Steve_Lionel
Honored Contributor III
1,034 Views

There are too many variables. You can find single-thread and multithread benchmarks online comparing Intel and AMD processors, though not ones built with Intel compilers. Keep in mind that some Intel compiler optimizations occur only when you tell the compiler you are using an Intel processor (or when auto-dispatch is being used and the program is running on an Intel processor).

John_Campbell
New Contributor II
1,000 Views

@NC1 

This is a good beginner question. As an experienced OpenMP user, I too would be interested in some responses.

OpenMP performance is affected by many different things, which vary depending on the calculation approach.

My experience is mostly limited to !$OMP PARALLEL DO usage for large vectors. While the processor has a lot of influence on performance, it does not matter if it is AMD or Intel. Either processor will give a workable solution.

I have only used low-cost hardware: i7 or Ryzen processors, both with dual-channel memory, where my bottleneck has been memory access speed and bandwidth. In my experience, both brands suffer from this limitation.
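A minimal sketch of the kind of test described above: a !$OMP PARALLEL DO over large vectors, timed with omp_get_wtime. The program name, the array size n, and the bandwidth estimate are assumptions for illustration, not anything taken from this thread. It needs OpenMP enabled at compile time (for example -qopenmp with ifx on Linux, /Qopenmp on Windows, or -fopenmp with gfortran).

```fortran
program triad_timing
  use omp_lib
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  ! Assumed size: 3 arrays x 10 million doubles = 240 MB, far larger than cache
  integer, parameter :: n = 10000000
  real(real64), allocatable :: a(:), b(:), c(:)
  real(real64) :: t0, t1, gbytes
  integer :: i

  allocate(a(n), b(n), c(n))

  ! Initialise in parallel with the same static schedule as the timed loop,
  ! so each thread first touches the pages it will later work on
  !$omp parallel do schedule(static)
  do i = 1, n
     a(i) = 0.0_real64
     b(i) = 1.0_real64
     c(i) = 2.0_real64
  end do
  !$omp end parallel do

  ! Timed loop: two loads and one store per element, very little arithmetic,
  ! so the run time is dominated by memory traffic rather than by the cores
  t0 = omp_get_wtime()
  !$omp parallel do schedule(static)
  do i = 1, n
     a(i) = b(i) + 3.0_real64 * c(i)
  end do
  !$omp end parallel do
  t1 = omp_get_wtime()

  ! Rough estimate only: counts 3 arrays of 8-byte values moved once each
  gbytes = 3.0_real64 * 8.0_real64 * real(n, real64) / 1.0e9_real64

  print '(a,i0)',      'max threads : ', omp_get_max_threads()
  print '(a,f10.5,a)', 'elapsed     : ', t1 - t0, ' s'
  print '(a,f10.2,a)', 'approx rate : ', gbytes / (t1 - t0), ' GB/s'
  print '(a,f5.1)',    'check a(n)  : ', a(n)   ! 1 + 3*2 = 7.0
end program triad_timing
```

Running this with OMP_NUM_THREADS set to 1, 2, 4 and so on shows whether the loop keeps speeding up as threads are added. On a memory-bound loop like this one, the rate would be expected to flatten once the memory channels are saturated, whatever the brand of processor.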

I would suggest that many-thread OMP solutions have not been helped by the chain of overclocking > overheating > fluctuating clock speeds, which leads to inefficient cache usage in my particular methods.

So, a balanced hardware configuration can be a greater issue than the brand of processor alone.

While others may (hopefully) disagree, this should show that OpenMP performance has a variety of influences, which are best clarified by experience with your particular problems.

jimdempseyatthecove
Honored Contributor III
921 Views

Standardized benchmarks only provide the relative performance for those specific benchmark programs. In other words, they do not necessarily show the relative performance to expect from your program. As @Steve_Lionel and @John_Campbell point out, there are too many factors to say in general which CPU and system configuration yields the best performance.

 

Generally, the machine code optimizations (CPU instruction selection and sequencing) of Intel compilers will be better for Intel CPUs. With non-Intel CPUs the compiler can only use "generic" instructions and sequencing. As @John_Campbell stated, code optimization is only one factor that will affect performance. The number of cores can be beneficial or detrimental. Too many cores on loops that are too small is generally detrimental (thread coordination overhead). When your working set fits in the L2 cache of the available cores, a high thread count may be best. When your application is heavily accessing RAM (non-cached data), using more cores than memory channels (or a small multiple of memory channels) might be detrimental. Other factors include:

 

CPU clock speeds (base and turbo boost)

IPC (instructions per clock)

Efficiency cores versus performance cores

size of caches

number of clock cycles to access each level of cache

memory speed (possibly width)

number of memory channels

code being amenable to vectorization

system cooling (which may affect throttling) 

 

Then, how your code interacts with the above.
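One way to act on the point about small loops and thread coordination overhead is the OpenMP IF clause, which runs a region on a single thread when the trip count is low. The following is only a sketch: the program and subroutine names are made up, and the threshold value is a hypothetical cut-off that would have to be tuned by measurement on your own code and hardware.

```fortran
program small_loop_guard
  use omp_lib
  implicit none

  ! Hypothetical cut-off below which threading is assumed not to pay off
  integer, parameter :: threshold = 100000

  call scale_vector(1000)       ! small loop: region runs on one thread
  call scale_vector(5000000)    ! large loop: region runs on the full team

contains

  subroutine scale_vector(n)
    integer, intent(in) :: n
    real, allocatable :: x(:)
    integer :: i, used

    allocate(x(n))
    x = 1.0
    used = 1

    ! IF clause: when the condition is false, the region is executed by a
    ! team of one thread, avoiding most of the coordination overhead
    !$omp parallel if(n > threshold) default(shared) private(i)

    ! Record the team size actually used for this call
    !$omp single
    used = omp_get_num_threads()
    !$omp end single

    !$omp do schedule(static)
    do i = 1, n
       x(i) = 2.0 * x(i)
    end do
    !$omp end do

    !$omp end parallel

    print '(a,i0,a,i0,a,f4.1)', 'n=', n, '  threads used=', used, &
          '  x(n)=', x(n)
  end subroutine scale_vector

end program small_loop_guard
```

Built with OpenMP enabled, the first call should report one thread and the second should report however many threads the runtime provides. The right threshold depends on the factors listed above, so it is something to measure rather than guess.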

 

Jim Dempsey
