I have a legacy Fortran program and a new modern Fortran program. Both of them use the same legacy subroutine. The only compilation flag added was -O3. I'm using the same compiler.
When I run them, the legacy program is 4.5 times faster than the new program.
I used Intel VTune Amplifier and ran a basic hotspots analysis on both. The only difference in CPU usage was in that same legacy subroutine. The rest of the code uses pretty much the same CPU time in both programs.
This legacy subroutine calls the pow and log intrinsic functions, but when it is called from the new program those calls use far more CPU time.
Why does this happen? I'm using the same subroutine and the same flags, but the CPU time is very different. Is there any difference in the way Intel Fortran compiles legacy and modern Fortran code? Are the intrinsic functions the same? Is there any option I can use during compilation?
Thank you very much for your help,
If it is the same compiler and the same legacy subroutine (different sources calling the same subroutine), then I suspect the old program has vectorized and/or inlined the subroutine, whereas the new "modern" code for some reason has not. This can easily be checked with the compiler's vectorization reports (or the Disassembly view in VTune).
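To make that concrete, here is a minimal sketch of the kind of situation I have in mind. Everything in it is hypothetical (the names kernels, legacy_kernel and demo, the array sizes, the formula), since the actual code was not posted. It shows one loop that calls log and a power, invoked once with a contiguous array and once through a strided pointer. In a case like the second call the compiler may have to create a temporary copy, and it may also make different inlining decisions at each call site, so comparing the reports for both programs is the way to confirm what actually happened. With recent Intel Fortran versions the report is requested with the -qopt-report option (older versions used a different flag name, so check your compiler's documentation).

```fortran
! Hypothetical sketch: all names, sizes and the formula are illustrative only.
module kernels
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Legacy-style interface: explicit-shape dummy arrays are contiguous,
  ! which usually lets the compiler vectorize the log and ** calls.
  subroutine legacy_kernel(n, x, y)
    integer, intent(in) :: n
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: y(n)
    integer :: i
    do i = 1, n
      y(i) = log(x(i)) + x(i)**1.5_dp
    end do
  end subroutine legacy_kernel
end module kernels

program demo
  use kernels
  implicit none
  integer, parameter :: n = 1000
  real(dp), target :: a(2*n)
  real(dp) :: b(n)
  real(dp), pointer :: p(:)
  integer :: i

  a = [(real(i, dp), i = 1, 2*n)]

  ! Call site 1: contiguous actual argument, passed directly.
  call legacy_kernel(n, a(1:n), b)
  print *, b(1), b(n)

  ! Call site 2: strided pointer section. The compiler may need to
  ! copy the data into a temporary before the call, and may inline
  ! or vectorize this call differently from the first one.
  p => a(1:2*n:2)
  call legacy_kernel(n, p, b)
  print *, b(1), b(n)
end program demo
```

If the report for the new program shows the loop in the subroutine was not vectorized (or the call was not inlined) while the legacy program's report shows it was, that would explain a slowdown of this size in the pow and log calls.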