Solved: Re: Intel Fortran on Snapdragon chips

NC1 · ‎07-18-2024

Hi all,

I mainly use intel fortran (I have a bunch of legacy code that works fine with that compiler and it seems a massive risk to use another compiler). I mainly run my codes on a cluster (with mainly Xeon type processors) and use my laptop (intel chip too) to sometimes run some smaller codes and debug code (using VS community + Intel OneAPI).

Now I am planning to buy a new laptop, and I have my eye on the new Surface laptops: https://www.microsoft.com/en-us/surface/devices/surface-laptop-7th-edition

Does anyone know whether the intel fortran compilers run well on snapdragon chips? In particular for OpenMP type codes.

Thank you.

mecej4 · ‎03-20-2026

Intel compilers will not run at all on Snapdragon chips. The code that Intel compilers generate will not run at all on Snapdragon chips.

Well, there is an extremely small chance that somebody will produce an X64 emulator for your Snapdragon Surface laptop.

Steve_Lionel · ‎07-18-2024

Intel compilers do not generate code for the ARM architecture. If you wish to use the Intel compiler, stick to x86, and for best results, Intel processors.

Sophia_R10 · ‎03-19-2026

Hello,

Does this mean Intel oneAPI Fortran is not supported on ARM64 (Snapdragon) systems?

I am getting VSIXInstaller errors during installation.

Thank you.

JohnNichols · ‎03-19-2026

Taking SL's long answer and mangling it in LISP I would say the answer is no.

Always deliver bad news in LISP; it is such an elegant language.

Fortran is your basic Sherman Tank language, after the 88 gets it.

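      program histogram

      implicit none
      real(8) :: array(1000)      ! samples, scaled to [0, 40)
      integer :: resultA(161)     ! counts for bins 0.00, 0.25, ..., 40.00
      integer :: i, j

      call random_seed()

      call random_number(array)   ! uniform values in [0, 1)
      resultA = 0
      array = 40.0 * array
      do i = 1, 1000
            ! nearest 0.25-wide bin: add 0.5, then truncate
            j = int((array(i) / 0.25) + 0.5)
            resultA(j+1) = resultA(j+1) + 1
            write(*,*) i, array(i), j
      end do
      write(*,*) "here"

      end program histogram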

This program replicates the Excel histogram in 0.25-unit steps. I am wondering: why the 0.5?

JohnNichols · ‎03-19-2026

Jim: This is the analysis of 500,000 FFTs for a beam in Washington, DC. It is the count of the highest FFT amplitude for each FFT.

It is interesting. Just thought I would share.

John

mecej4 · ‎03-20-2026

@JohnNichols : "Why the 0.5?" : The programmer probably added 0.5 before applying the int() function so that the result is rounded to the nearest integer instead of being truncated. You could also use j=nint(4*array(i)).

Please note that these replies are no longer relevant to the topic of the thread, which was a question about using Intel Fortran on some non-Intel CPUs.

witwald · ‎03-20-2026

The 0.50 is a little "trick" to implement rounding to the nearest integer via truncation. The INT() function truncates towards zero; it does not perform rounding. Adding 0.5 before truncating shifts the threshold. In Fortran 90's case, it would be simpler to just use NINT() (nearest integer) instead.
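A minimal sketch comparing the two approaches, assuming a handful of sample values chosen only for illustration (the program name and the values are not from the thread). For the non-negative values used in the histogram program, both expressions give the same bin. For negative inputs they differ: the add-0.5 trick yields 0 and -1 for -1.2 and -1.7, while NINT() yields -1 and -2.

```fortran
program rounding_demo
  implicit none
  ! Illustrative sample values (assumed, not from the original post)
  real, parameter :: x(4) = [1.2, 1.7, -1.2, -1.7]
  integer :: i

  do i = 1, size(x)
    ! int() truncates toward zero, so adding 0.5 first rounds
    ! non-negative values to nearest; nint() rounds to nearest directly
    write(*,'(f6.2,2i6)') x(i), int(x(i) + 0.5), nint(x(i))
  end do
end program rounding_demo
```

Each output line shows the value, then the int(x + 0.5) result, then the nint(x) result.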

NC1 · ‎07-19-2024

Thanks Steve. Is there any benchmark comparing AMD and Intel processors? E.g., for similar cores and core speeds, what are the differences in execution time using OpenMP?

Steve_Lionel · ‎07-20-2024

There are too many variables. You can find single and multithread benchmarks online comparing Intel and AMD processors, though not using Intel compilers. Keep in mind that some Intel compiler optimizations occur only when you tell the compiler you are using an Intel processor (or auto-dispatch is being used and the program is running on an Intel processor.)

John_Campbell · ‎07-20-2024

@NC1

This is a good beginner question. As an experienced OpenMP user, I too would be interested in some responses.

OpenMP performance is affected by many different things, which vary depending on the calculation approach.

My experience is mostly limited to !$OMP PARALLEL DO usage for large vectors. While the processor has a lot of influence on performance, it does not matter if it is AMD or Intel. Either processor will give a workable solution.
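For readers new to this pattern, here is a minimal sketch of a !$OMP PARALLEL DO over large vectors. The program name, vector length, and arithmetic are assumptions chosen for illustration, not code from this thread. It needs the compiler's OpenMP option (for example -qopenmp or /Qopenmp with the Intel compilers, -fopenmp with gfortran).

```fortran
program omp_vector_demo
  use omp_lib   ! provides omp_get_wtime and omp_get_max_threads
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10000000    ! assumed vector length
  real(dp), allocatable :: a(:), b(:), c(:)
  real(dp) :: t0, t1
  integer :: i

  allocate(a(n), b(n), c(n))
  b = 1.0_dp
  c = 2.0_dp

  t0 = omp_get_wtime()
  ! Loop iterations are divided among the threads; i is private by default
  !$omp parallel do
  do i = 1, n
    a(i) = b(i) + 3.0_dp * c(i)
  end do
  !$omp end parallel do
  t1 = omp_get_wtime()

  write(*,*) 'max threads:', omp_get_max_threads()
  write(*,*) 'seconds:    ', t1 - t0
  write(*,*) 'a(n):       ', a(n)
end program omp_vector_demo
```

A loop like this does very little arithmetic per byte loaded, so its timing tends to reflect memory bandwidth more than core count, whatever the processor brand.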

I have only used low cost hardware solutions; I7 or Ryzen processors, both with dual channel memory, where my bottleneck has been memory access speeds/bandwidth. My experience is both brands suffer from this limiting issue.

I would suggest that many-thread OMP solutions have not been helped by the chain of overclocking, overheating, and fluctuating clock speeds, which leads to cache usage inefficiency in my particular methods.

So, having a balanced hardware configuration can be a greater issue than the brand of processor.

While others may (hopefully) disagree, this should show the use of OpenMP has a variety of influences, which can be better clarified by experience with your particular problems.

jimdempseyatthecove · ‎07-22-2024

Standardized benchmarks only provide the relative performance for those specific benchmark programs; in other words, not necessarily the relative performance to expect of your program. As @Steve_Lionel and @John_Campbell are saying, there are too many factors to state which CPU and system configuration yields the best performance.

Generally, the machine code optimizations (CPU instruction selection and sequencing) of Intel compilers will be better for Intel CPUs. With non-Intel CPUs the compiler can only use "generic" instructions and sequencing. As @John_Campbell stated, code optimization is only one factor that will affect performance. The number of cores can be beneficial or detrimental. Using too many cores on loops that are too small is generally detrimental (thread coordination overhead). When your working data fits in the L2 cache of the available cores, a high thread count may be best. When your application is heavily accessing RAM (non-cached data), using more cores than memory channels (or a small multiple of memory channels) might be detrimental.

CPU clock speeds (base and turbo boost)

IPC (instructions per clock)

Efficiency cores versus performance cores

size of caches

number of clock cycles to access each level of cache

memory speed (possibly width)

number of memory channels

code being amenable to vectorization

system cooling (which may affect throttling)

Then, how your code interacts with the above.
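One way to act on the point about small loops is the OpenMP if clause, which runs the loop serially when the condition is false and so avoids thread coordination overhead. The sketch below is illustrative only: the program and subroutine names and the threshold value are assumptions, and a real cutoff should come from timing your own code on your own hardware.

```fortran
program omp_if_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Assumed cutoff for illustration; tune it by measurement
  integer, parameter :: threshold = 100000

  call scale_vector(1000)       ! below the threshold: loop runs serially
  call scale_vector(5000000)    ! above the threshold: loop runs in parallel

contains

  subroutine scale_vector(n)
    integer, intent(in) :: n
    real(dp), allocatable :: x(:)
    integer :: i

    allocate(x(n))
    x = 1.0_dp
    ! The if clause enables the parallel region only for large trip counts
    !$omp parallel do if(n > threshold)
    do i = 1, n
      x(i) = 2.0_dp * x(i)
    end do
    !$omp end parallel do
    write(*,*) n, sum(x)
  end subroutine scale_vector

end program omp_if_demo
```

The num_threads clause can be used in the same way to cap the thread count for loops that are limited by memory bandwidth.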

Jim Dempsey
