Legacy subroutine performing really badly in modern code

Hector_B_ · ‎11-03-2016

Hi Everyone,

I have a legacy Fortran program and a new modern Fortran program. Both of them use the same legacy subroutine. The only compilation flag added was -O3. I'm using the same compiler.

When I run them, the legacy program is 4.5 times faster than the new program.

I used Intel VTune Amplifier to run a basic hotspots analysis on both programs. The only difference in CPU usage was in that same legacy subroutine. The rest of the code uses pretty much the same CPU time.

Old program: [VTune hotspots screenshot]

New program: [VTune hotspots screenshot]

This legacy subroutine calls the pow and log intrinsic functions, but when it is used in the new program it uses far more CPU time.

Why does this happen? I'm using the same subroutine and the same flags, but the CPU time is very different. Is there any difference in the way Intel Fortran compiles legacy and modern Fortran code? Are the intrinsic functions the same? Is there any option I can use during compilation?

Thank you very much for your help,

Hector

jimdempseyatthecove · ‎11-05-2016

If it is the same compiler and the same legacy subroutine (different sources calling the same subroutine), then I suspect the old program has vectorized and/or inlined the subroutine, whereas the new "modern" code for some reason has not. This can easily be checked using the vectorization reports (or the Disassembly view in VTune).
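
As a rough sketch of how to get those reports with the classic Intel Fortran compiler (ifort) on Linux: the file name legacy_sub.f below is only a placeholder for the source that holds the legacy subroutine, and the flags assume a compiler version that writes the report to a .optrpt file next to the object file. On Windows the same options are spelled /Qopt-report and /Qopt-report-phase.

```shell
#!/bin/sh
# Placeholder name (an assumption): replace with the file containing the legacy subroutine.
LEGACY_SRC="legacy_sub.f"

# -qopt-report=5             : most detailed optimization report level
# -qopt-report-phase=vec,ipo : restrict the report to vectorization and inlining
REPORT_FLAGS="-qopt-report=5 -qopt-report-phase=vec,ipo"

if command -v ifort >/dev/null 2>&1 && [ -f "$LEGACY_SRC" ]; then
    # -c compiles without linking; the report is written to legacy_sub.optrpt
    ifort -O3 $REPORT_FLAGS -c "$LEGACY_SRC"
    # Show the lines that say whether loops were vectorized or calls were inlined
    grep -n -i -E "vectorized|inline" "${LEGACY_SRC%.*}.optrpt"
else
    echo "ifort or $LEGACY_SRC not found; nothing compiled"
fi
```

Doing this in both the old and the new build, and also for the source files that call the subroutine, then comparing the remarks for the loops containing the pow and log calls, should show whether one build vectorized or inlined them and the other did not.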

Jim Dempsey