Intel® Fortran Compiler

contention when running multiple copies of the compiler

Chris_Payne
Beginner
Our codebase has ~10,500 source files, so our build process runs up to 24 copies of the compiler concurrently. The build server is a 12-core/24-thread E7540 with 160GB RAM, running 64-bit RHEL 5.5. Using compiler version 11.1.072, the elapsed time for the build is around 15 minutes. Using compiler version 12.0.3, the elapsed time is around 80 minutes. Overall CPU consumption is about the same in both cases - around 85 CPU minutes.
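
For anyone picturing the setup, the shape of it is roughly this (a simplified sketch, not our actual makefile; the flags and file names are only illustrative):

    # Simplified sketch of the parallel build. GNU make is allowed up to 24
    # simultaneous jobs; each job is an independent "ifort -c" invocation,
    # which in turn launches a fortcom process.
    make -j 24

    # Each per-file rule amounts to something like:
    #   ifort -c -O2 -o foo.o foo.f90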

Via top, I see no more than 2 copies of fortcom running when using 12.0.3; I see many more than that when using 11.1. It appears (emphasis on "appears") that ifort is synchronizing on something and limiting the number of fortcom instances that run concurrently.

Anyone have any ideas what that would be?

Thanks
jimdempseyatthecove
Honored Contributor III

Are the 24 copies running with the same current directory?
If so, see if you can configure your makefile to use different current directories.
Note that running in the same directory should not matter if the compiler is written properly. I am assuming there may be an issue with conflicting temporary file names (IPO and FPP may be creating temporary files with the same names); any temporary files the compiler creates ought to be given unique names.
I recall that back in the V8.n.m days on Windows there was a similar issue with temporary file names. Maybe an old bug crept back into the code base (or finally came to the surface).
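
Something along these lines would give each compile its own scratch directory (just a sketch; the file name and /tmp prefix are placeholders, not a recommendation for your layout):

    # Run each compile in a private scratch directory so that any temporary
    # files dropped into the current directory cannot collide between jobs.
    src=foo.f90
    scratch=$(mktemp -d /tmp/fcomp.XXXXXX)   # unique per-job directory
    ( cd "$scratch" && ifort -c -O2 -o "$OLDPWD/${src%.f90}.o" "$OLDPWD/$src" )
    rm -rf "$scratch"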

Jim Dempsey

TimP
Honored Contributor III
I have run into parallel build issues with the current compiler, but I assumed they were connected with increased memory/stack usage. It seems you would need at least 48GB RAM to run 24 copies of the compiler, even with -ipo disabled and careful use of the makefile to limit full optimization to where it's useful.
With the default configuration, the temporary files go into /tmp, so you need to be careful that you don't run out of disk space there.
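
If /tmp is a suspect, a quick check along these lines can rule it out (this assumes the compiler honors TMPDIR for its scratch files, which is worth verifying against the documentation for your version; /scratch/tmp is a placeholder path):

    # Watch disk and memory while the parallel build runs.
    df -h /tmp     # compiler temporaries land here by default
    free -g        # look for swapping with 24 concurrent fortcom processes

    # If /tmp is tight, point the temporaries at a larger local filesystem
    # before starting the build.
    export TMPDIR=/scratch/tmp
    mkdir -p "$TMPDIR"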
Ron_Green
Moderator
Another thought: where do you get your license - is it a local license file or a floating license? If it's a floating license, try moving the licenses out of /opt/intel/licenses, get an evaluation license, and put that single license in /opt/intel/licenses. See if that helps.

Cleaning up the license directory: make sure old and beta licenses are moved out of /opt/intel/licenses on the local server and on any FLEXlm floating license server.
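
Something like this shows what is actually being picked up (default paths shown; adjust if INTEL_LICENSE_FILE points elsewhere, and the *beta* pattern is only an example of what to move aside):

    # Check where licenses are searched for and what is installed.
    echo "$INTEL_LICENSE_FILE"
    ls -l /opt/intel/licenses/

    # Park old/beta licenses out of the way instead of deleting them.
    mkdir -p /opt/intel/licenses.old
    mv /opt/intel/licenses/*beta*.lic /opt/intel/licenses.old/ 2>/dev/null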

Filesystems: I would assume, but have to ask - are the sources stored on local disk, not on NFS or Samba shares? Remove that variable by copying them to local disk and trying the same exercise. And definitely keep sources OFF of Lustre, GFS, PFS, or other parallel filesystems - these are notoriously slow for inode updates and for small-block, random-access I/O. They're tuned for big sequential stream-in/stream-out, not the normal file operations a compiler does.
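
To take the filesystem out of the equation, a throwaway test like this is enough (/local/build and the source path are placeholders):

    # Copy the tree to local disk and time a clean parallel build from there.
    mkdir -p /local/build
    rsync -a /path/to/sources/ /local/build/
    cd /local/build && time make -j 24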

ron
Chris_Payne
Beginner
Today the behavior I posted about has gone away, with no change to the build processes. Thanks for the ideas. I'll run these by our Linux admins to see if something changed without my knowledge. The files are on an EMC frame - perhaps that is a clue.

Thanks to all that responded.
jimdempseyatthecove
Honored Contributor III
Another "guess" is the system admin may have had an environment variable (PATH, TMP, TEMP, LIB, etc...) including a referenceto a "slow" device path prior to the path(s) used by IVF. If they fixed the environment variable, or for example removed a CD where the CD was in one of the paths (missing CD may report error faster than installed CD witout file/folder of interest).

Jim Dempsey