Hi all,
I mainly use Intel Fortran (I have a lot of legacy code that works fine with that compiler, and switching to another compiler seems like a big risk). I run most of my codes on a cluster (mostly Xeon processors) and use my laptop (also an Intel chip) to run smaller codes and to debug (using VS Community + Intel oneAPI).
Now I am planning to buy a new laptop, and I have my eye on the new Surface Laptops: https://www.microsoft.com/en-us/surface/devices/surface-laptop-7th-edition
Does anyone know whether the Intel Fortran compilers run well on Snapdragon chips? In particular for OpenMP codes.
Thank you.
Intel compilers do not generate code for the ARM architecture. If you wish to use the Intel compiler, stick to x86, and for best results, Intel processors.
Thanks Steve. Is there any benchmark comparing AMD and Intel processors? E.g. for a similar core count and clock speed, what are the differences in execution time using OpenMP?
There are too many variables. You can find single- and multi-thread benchmarks online comparing Intel and AMD processors, though not ones built with Intel compilers. Keep in mind that some Intel compiler optimizations occur only when you tell the compiler you are using an Intel processor (or when auto-dispatch is used and the program is running on an Intel processor).
This is a good beginner question. As an experienced OpenMP user, I too would be interested in some responses.
OpenMP performance is affected by many different things, which vary depending on the calculation approach.
My experience is mostly limited to !$OMP PARALLEL DO usage for large vectors. While the processor has a lot of influence on performance, it does not matter if it is AMD or Intel. Either processor will give a workable solution.
I have only used low-cost hardware, i7 or Ryzen processors, both with dual-channel memory, where my bottleneck has been memory access speed and bandwidth. My experience is that both brands suffer from this limitation.
I would suggest that many-thread OpenMP solutions have not been helped by the chain of overclocking, overheating, and fluctuating clock speeds, which leads to inefficient cache usage in my particular methods.
So a balanced hardware configuration can be a bigger issue than the brand of processor.
While others may (hopefully) disagree, this should show the use of OpenMP has a variety of influences, which can be better clarified by experience with your particular problems.
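The memory-bandwidth limit described above can be checked on a given machine with a small test. The following is a minimal sketch, not a tuned benchmark: it runs a STREAM-style triad under `!$OMP PARALLEL DO` with 1, 2, 4, ... threads and prints an approximate bandwidth for each. The array size, repeat count, and program name are assumptions chosen for illustration, and the byte count ignores any extra write-allocate traffic.

```fortran
program triad_sweep
  use omp_lib
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  ! Assumed size: 3 arrays of 160 MB each, intended to be far larger than any cache
  integer, parameter :: n = 20000000
  integer, parameter :: nrep = 5          ! assumed repeat count; the best time is kept
  real(real64), allocatable :: a(:), b(:), c(:)
  real(real64) :: t0, t1, best, gbytes
  integer :: i, rep, nt, maxt

  allocate(a(n), b(n), c(n))
  maxt = omp_get_max_threads()

  ! Initialise in parallel so pages are first touched by the threads that use them
  !$omp parallel do
  do i = 1, n
     a(i) = 0.0_real64
     b(i) = 1.0_real64
     c(i) = 2.0_real64
  end do
  !$omp end parallel do

  ! Data moved per pass: two arrays read and one written, 8 bytes per element
  gbytes = 3.0_real64 * 8.0_real64 * real(n, real64) / 1.0e9_real64

  nt = 1
  do while (nt <= maxt)                   ! powers of two only; other counts are skipped
     best = huge(best)
     do rep = 1, nrep
        t0 = omp_get_wtime()
        !$omp parallel do num_threads(nt)
        do i = 1, n
           a(i) = b(i) + 3.0_real64 * c(i)
        end do
        !$omp end parallel do
        t1 = omp_get_wtime()
        best = min(best, t1 - t0)
     end do
     print '(a,i4,a,f8.2,a)', 'threads =', nt, '   approx. bandwidth =', gbytes / best, ' GB/s'
     nt = nt * 2
  end do

  print *, 'check value:', a(n)           ! uses the result so the loop is not optimised away
end program triad_sweep
```

Build with OpenMP enabled (for example `-qopenmp` with ifx on Linux, `/Qopenmp` on Windows, or `-fopenmp` with gfortran). If the reported bandwidth stops increasing after a few threads, the loop is memory-bound on that system, and adding cores will not help that kind of code regardless of processor brand.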
Standardized benchmarks only provide the relative performance for those specific benchmark programs, in other words, not necessarily the relative performance to expect from your program. As @Steve_Lionel and @John_Campbell point out, there are too many factors to say in general which CPU and system configuration yields the best performance.
Generally, the machine code optimizations (CPU instruction selection and sequencing) of Intel compilers will be better for Intel CPUs. With non-Intel CPUs the compiler can only use "generic" instructions and sequencing. As @John_Campbell stated, code optimization is only one factor that will affect performance. The number of cores can be beneficial or detrimental. Too many cores on loops that are too small is generally detrimental (thread coordination overhead). When your working set fits in the L2 cache of the available cores, a high thread count may be best. When your application is heavily accessing RAM (non-cached data), using more cores than memory channels (or a small multiple of the number of memory channels) might be detrimental. Other factors to consider:
- CPU clock speeds (base and turbo boost)
- IPC (instructions per clock)
- Efficiency cores versus performance cores
- Size of caches
- Number of clock cycles to access each level of cache
- Memory speed (possibly width)
- Number of memory channels
- Code being amenable to vectorization
- System cooling (which may affect throttling)
Then, how your code interacts with the above.
Jim Dempsey
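One way to act on the point about small loops and thread coordination overhead is the OpenMP `IF` clause, which keeps a loop on a single thread when the trip count is below a cut-off. The following is a minimal sketch under stated assumptions: the module name `scaled_add`, the routine `axpy`, and the threshold value are hypothetical, and a sensible cut-off has to be found by timing on the target machine.

```fortran
module scaled_add
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  ! Hypothetical cut-off: below this trip count the loop is not worth threading
  integer, parameter :: par_threshold = 100000
contains
  subroutine axpy(n, alpha, x, y)
    integer, intent(in) :: n
    real(real64), intent(in) :: alpha, x(n)
    real(real64), intent(inout) :: y(n)
    integer :: i
    ! When the IF condition is false, the region is executed by one thread only
    !$omp parallel do if(n > par_threshold)
    do i = 1, n
       y(i) = y(i) + alpha * x(i)
    end do
    !$omp end parallel do
  end subroutine axpy
end module scaled_add

program demo
  use scaled_add
  implicit none
  real(real64) :: xs(8), ys(8)
  real(real64), allocatable :: xl(:), yl(:)

  xs = 1.0_real64
  ys = 2.0_real64
  call axpy(size(xs), 0.5_real64, xs, ys)   ! small: stays on one thread

  allocate(xl(1000000), yl(1000000))
  xl = 1.0_real64
  yl = 2.0_real64
  call axpy(size(xl), 0.5_real64, xl, yl)   ! large: eligible for a full thread team

  print *, ys(1), yl(1)                     ! both paths compute 2.0 + 0.5 * 1.0
end program demo
```

The same routine then serves both small and large problems without paying the fork/join cost where it cannot be recovered. Whether the large case actually runs faster in parallel still depends on the factors listed above, memory bandwidth in particular.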
