Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Optimization problem under Linux and Xeon E5620 CPU

Yongfu_W_
Beginner
1,773 Views

I am trying to build a Fortran program with the Intel Fortran compiler (version 2013.3.163), using -O2 (or -O3) together with -xHost. The resulting program works fine under Ubuntu 12.04 (kernel 3.5.0-26) in a VMware virtual machine on my laptop (the CPU is an i7-3720QM): it runs 3.19 times faster than the unoptimized build (-O0 -g). However, when I run the program under Ubuntu 12.04 (kernel 3.5.0-26) on an HP Z800 workstation (with two Xeon E5620 CPUs), it runs at the same speed as the unoptimized build, i.e., the optimization has no effect on the E5620.

I have also tried an optimized build made with gfortran. It works fine on the E5620, although it is not as fast as the ifort build is on my laptop. I am really confused about why the optimization does not work on the E5620. I am a newbie with ifort and do not know how to debug and diagnose the problem. Can anyone give me some suggestions? Thanks a lot! My email address is: wyffrank@gmail.com.

0 Kudos
5 Replies
Yongfu_W_
Beginner

BTW, the program needs large local arrays, so I use ulimit -s unlimited to enlarge the stack size. I don't believe this affects the optimization, since it works fine on my laptop.

0 Kudos
jimdempseyatthecove
Honored Contributor III

>> I choose -O2 (or -O3), -xHost option ...my laptop (the CPU is i7-3720QM)...). I am really confused why the optimization does not work in E5620.

Are you saying you compiled the program on your laptop (CPU is i7-3720QM) with -xHost,
and then copied the executable to the E5620 machine?

If so, the host CPU at compile time is not the host CPU at run time.
To get a portable program, remove -xHost,
or recompile on the E5620 with -xHost.
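
One way to check for such a mismatch is to compare the SIMD extensions each machine reports. The following is a minimal shell sketch, not part of the original advice: pick_x_option is a hypothetical helper that maps the flags line of /proc/cpuinfo (Linux only) to the matching ifort -x option, roughly approximating what -xHost would choose on that machine. It covers only the SSSE3 through AVX levels, an assumption that fits the two CPUs discussed here.

```shell
#!/bin/sh
# pick_x_option (hypothetical helper): given a space-separated list of
# /proc/cpuinfo flags, print the ifort -x option for the highest SIMD
# level found. Falls back to -msse2 if none of the listed levels match.
pick_x_option() {
    case " $1 " in
        *" avx "*)    echo "-xAVX" ;;
        *" sse4_2 "*) echo "-xSSE4.2" ;;
        *" sse4_1 "*) echo "-xSSE4.1" ;;
        *" ssse3 "*)  echo "-xSSSE3" ;;
        *)            echo "-msse2" ;;
    esac
}

# Read the flags of the first logical CPU on this machine (Linux only).
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "Suggested option for this host: $(pick_x_option "$flags")"
fi
```

Running this on both the laptop and the workstation should show whether the two hosts resolve to different instruction sets (the i7-3720QM supports AVX, the E5620 supports up to SSE4.2). A virtual machine may hide some flags from the guest, so the laptop result should be checked inside the VM where the program is actually built.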

Jim Dempsey

0 Kudos
Yongfu_W_
Beginner

I compiled the program on the E5620 with -xHost too, but the optimized build is still no faster there.

0 Kudos
jimdempseyatthecove
Honored Contributor III

Is your compute-intensive parallel code making heavy use of atomics, mutexes/locks, critical sections, or library calls that contain them (e.g. RAN DRAN)?

Your notebook (one CPU) has a single Last Level Cache (LLC/L3) and a single memory system.
Your Z800 workstation (two CPUs) has two Last Level Caches (LLC/L3) and two memory systems (one per CPU).

Atomics, mutexes/locks, critical sections, and library calls that use them generally take much longer when the system has multiple LLCs and/or memory systems.

Do you have a profiler? If so, it might identify the section of code that is causing the bottleneck.
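
Short of a full profiler, a coarse first step is to time each build and, on the two-socket workstation, to repeat the run confined to one socket. A minimal sketch under stated assumptions: time_cmd is a hypothetical helper, the program names in the comments are placeholders, and numactl is assumed to be installed. The resolution is whole seconds, so this only suits runs that take a while.

```shell
#!/bin/sh
# time_cmd (hypothetical helper): run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example use (program names are placeholders, not from the thread):
#   time_cmd ./prog_O0
#   time_cmd ./prog_O2
#   time_cmd numactl --cpunodebind=0 --membind=0 ./prog_O2
```

If the run pinned to a single socket and its local memory is clearly faster than the unpinned run, that would be consistent with a cross-socket effect of the kind described above. It would not prove it.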

Jim Dempsey

0 Kudos