Intel® Fortran Compiler

Question about resident memory (RES) of OpenMP codes

guangye_li
Beginner
674 Views
I am running a large OpenMP code compiled by Intel ifort 10.0 on a Harpertown node. I found that the RES size (from top -H) grows with the number of threads. For example, on one thread, RES=614m; on four threads, RES=778m for each thread; and on eight threads, RES=835m for each thread. I understand that OpenMP codes use more memory than sequential codes do. But why does the RES size per thread grow with the number of threads used?
3 Replies
TimP
Honored Contributor III
In applications I deal with, most of the growth is in threadprivate data, where each thread gets its own copy.
guangye_li
Beginner
Quoting - tim18
In applications I deal with, most of the growth is in threadprivate data, where each thread gets its own copy.

Yes, each thread gets its own copies of private and threadprivate arrays, so the total memory size should grow at most linearly with the number of threads. But my question is why the per-thread sizes also grow. From the numbers I provided, the total size grows superlinearly.
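To make the superlinear claim concrete, here is a small sketch that multiplies the posted RES values by the thread count. It takes the question's reading at face value, namely that each thread's RES figure from top -H is separate memory that adds up; that reading is an assumption, and all names in the snippet are illustrative.

```python
# RES values in MB, as reported by `top -H` in the original question.
res_per_thread_mb = {1: 614, 4: 778, 8: 835}


def total_res_mb(threads, per_thread_mb):
    """Total memory, ASSUMING the per-thread RES values simply add up."""
    return threads * per_thread_mb


# Totals implied by the posted numbers under that assumption.
totals = {n: total_res_mb(n, mb) for n, mb in res_per_thread_mb.items()}

# What purely linear growth from the one-thread baseline would give.
linear = {n: n * res_per_thread_mb[1] for n in res_per_thread_mb}

for n in sorted(totals):
    print(f"{n} thread(s): implied total {totals[n]} MB, linear {linear[n]} MB")
```

Under this reading, the implied totals at 4 and 8 threads (3112 MB and 6680 MB) exceed the linear extrapolations (2456 MB and 4912 MB), which is what "superlinear" refers to here.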
jimdempseyatthecove
Honored Contributor III

Is your one-thread run compiled as OpenMP (i.e. with the overhead of the base OpenMP code), or without?
Going from 4 to 8 threads shows a growth of 7.3% in the per-thread RES (835/778 ≈ 1.073),
whereas going from "1" to 4 threads shows a growth of 26.7% (778/614 ≈ 1.267).
I think your 1-thread size value was derived from compiling without OpenMP.

It would appear that adding threads to your application "leaks" about 7.3%. This is not necessarily a traditional memory leak, but it is possibly related to PRIVATE/SHARED/REDUCTION (of arrays)/NESTED or other.

Jim Dempsey
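
The percentages in the reply above can be checked with a few lines of arithmetic. This is only a sketch using the three RES values from the question; the helper name `growth_percent` is made up for illustration.

```python
# RES values in MB from the question, keyed by thread count.
res_mb = {1: 614, 4: 778, 8: 835}


def growth_percent(before, after):
    """Relative growth from `before` to `after`, in percent."""
    return (after / before - 1.0) * 100.0


g_1_to_4 = growth_percent(res_mb[1], res_mb[4])
g_4_to_8 = growth_percent(res_mb[4], res_mb[8])

print(f"1 -> 4 threads: {g_1_to_4:.1f}%")  # prints "1 -> 4 threads: 26.7%"
print(f"4 -> 8 threads: {g_4_to_8:.1f}%")  # prints "4 -> 8 threads: 7.3%"
```

The much larger jump between 1 and 4 threads than between 4 and 8 is what motivates the guess that the one-thread figure came from a build without OpenMP.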

