Intel® Fortran Compiler

Question about resident memory (RES) of OpenMP codes

guangye_li
Beginner
I am running a large OpenMP code compiled by Intel ifort 10.0 on a Harpertown node. I found that the RES size (from top -H) grows with the number of threads. For example, on one thread, RES=614m; on four threads, RES=778m for each thread; and on eight threads, RES=835m for each thread. I understand that OpenMP codes use more memory than sequential codes do. But why does the RES size per thread grow with the number of threads used?
TimP
Honored Contributor III
In applications I deal with, most of the growth is in threadprivate data, where each thread gets its own copy.
guangye_li
Beginner
Yes, each thread gets its own copies of private and threadprivate arrays, so the total memory size should grow at most linearly with the number of threads. But my question is why the per-thread size also grows. From the numbers I provided, the total size grows superlinearly.
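
As a rough check of the "superlinear" claim, the short Python sketch below multiplies the reported per-thread RES by the thread count and compares it with a simple linear model built from the one-thread figure. It takes the post's per-thread reading at face value; if top is reporting the process-wide figure on each thread row, these totals do not apply. The names `res_mb`, `totals`, and `linear` are illustrative only.

```python
# Reported RES in MB, keyed by thread count (figures from the original post).
res_mb = {1: 614, 4: 778, 8: 835}

# Assumption: each thread's RES is counted separately, as the post reads it.
totals = {n: n * r for n, r in res_mb.items()}

# A linear model scales the one-thread figure by the thread count.
linear = {n: n * res_mb[1] for n in res_mb}

for n in sorted(res_mb):
    print(f"{n} threads: total {totals[n]} MB, linear model {linear[n]} MB")
```

Under that assumption, the totals at four and eight threads exceed the linear model, which is the growth the question is about.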
jimdempseyatthecove
Honored Contributor III
Was your one-thread run compiled with OpenMP enabled (i.e. with the overhead of the base OpenMP code), or without?
Going from 4 to 8 threads, the reported per-thread RES grows by 7.3% (a factor of 1.073),
whereas going from "1" to 4 threads it grows by 26.7%.
I suspect your one-thread figure came from a build compiled without OpenMP.
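
The percentages above can be reproduced from the RES figures in the original post. This is a minimal Python sketch of that arithmetic; `res_mb` and `growth_percent` are hypothetical names chosen for illustration.

```python
# Reported RES in MB, keyed by thread count (figures from the original post).
res_mb = {1: 614, 4: 778, 8: 835}


def growth_percent(before, after):
    """Percentage increase of the reported RES figure between two runs."""
    return (res_mb[after] / res_mb[before] - 1.0) * 100.0


print(f"4 -> 8 threads: {growth_percent(4, 8):.1f}%")
print(f"1 -> 4 threads: {growth_percent(1, 4):.1f}%")
```

The step from one to four threads is several times larger than the step from four to eight, which is what prompts the question about how the one-thread build was compiled.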

It would appear that adding threads to your application "leaks" memory (the 7.3% above). This is not necessarily a traditional memory leak; it may be related to PRIVATE, SHARED, REDUCTION (of arrays), nested parallelism, or something else.

Jim Dempsey

