I am using a hybrid formulation (MPI + OpenMP). When I use MPI alone, with any number of processes, the computation runs fine and the output subroutine (where I print all my data) works well too. I use a series of gathers for my output/writing, which I know is not very efficient.
However, when I use OpenMP + MPI, the output task of my code stops working with an OOM (Out of Memory) message. Even if I set setenv OMP_NUM_THREADS 1, the problem persists.
I have also tried setenv OMP_NUM_THREADS 2, and inside the code I have tried changing the number of threads with OMP_SET_NUM_THREADS before starting the output subroutine, but it still did not work.
I was wondering whether there is an option or instruction to free the memory held by the threads. Even calling OMP_SET_NUM_THREADS(1) did not work.
What do you recommend?
First, does your program run correctly with 1 thread, 2 threads, ...?
OpenMP threads tend to default to a relatively small stack size (1MB to 4MB), though this should not result in OOM. Can you tell us more about your environment?
Are you targeting a 32-bit or 64-bit application? Number of hardware threads per node, number of nodes available, number of OpenMP threads requested, linker options, etc.
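Several of the details asked for above can be collected on a compute node with a small script. The following is only a minimal sketch: the function name describe_environment is hypothetical, the script assumes a Unix-like node with Python 3 available, and it reports the settings of the Python process itself, which only match the application's if it is launched from the same job script and environment. It does not cover the linker options.

```python
import os
import resource
import struct


def describe_environment():
    """Collect basic facts about the node and job environment into a dict."""
    soft_stack, _hard_stack = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        # Pointer width of this interpreter, a rough stand-in for 32- vs 64-bit userland.
        "pointer_bits": struct.calcsize("P") * 8,
        # Logical CPUs visible to the OS (may be None if it cannot be determined).
        "logical_cpus": os.cpu_count(),
        # Standard OpenMP environment variables; "(unset)" means the runtime default applies.
        "omp_num_threads": os.environ.get("OMP_NUM_THREADS", "(unset)"),
        "omp_stacksize": os.environ.get("OMP_STACKSIZE", "(unset)"),
        # Soft stack limit of the process (applies to the main thread's stack).
        "stack_limit_bytes": (
            "unlimited" if soft_stack == resource.RLIM_INFINITY else soft_stack
        ),
        # Set by Slurm inside a job; SLURM_CPUS_PER_TASK only if --cpus-per-task was given.
        "slurm_ntasks": os.environ.get("SLURM_NTASKS", "(unset)"),
        "slurm_cpus_per_task": os.environ.get("SLURM_CPUS_PER_TASK", "(unset)"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it from inside the same batch script that launches the application shows whether the OpenMP variables are actually spelled and exported the way the job expects.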
Hi Jim, thanks for your kindness!!!
OK, each node has 16 cores: 2 sockets per node and 8 cores per socket. I am using SLURM and I have not changed the configuration of the default script to build and submit jobs. I could use the task invocation controls, but I am not sure whether they will help. More info: https://slurm.schedmd.com/mc_support.html#srun_ntasks.
The implementation runs perfectly regardless of the number of threads; the problem arises in the output section. That is why I wanted to know whether there is an instruction to free the memory that the threads allocate.
The cluster specifications are these: the system consists of one head node for remote login and approximately 4 TeraBytes of memory (10^12 bytes), 30 TeraBytes of disk space, 6 CPU nodes with 32 eight-core Intel processors giving a total of 256 cores, plus 2 CPU/GPGPU nodes with a 10-core Intel processor and 4 K40 Tesla GPU accelerators giving approximately 12.4 TFlops performance, 64-bit.
I tried setenv OMP_NUM_THREADS 1 and the problem persists. The only way I can get through my runs successfully is by compiling my case without the -openmp flag, because even with one thread the problem pops up.
Many years ago (maybe 11), I had a simulation program that presented the OOM error condition. After a lengthy investigation that did not resolve the problem, I did a little more research into the triggers for this message. One of the causes is when your application runs out of memory (as the error message implies). This is a misnomer, because it typically means you ran out of page file capacity. The second reason for the OOM service to kill a job is when a significantly long compute section occurs, which makes it appear to the OOM service that your application is hung. For a fix, use Google and search for:
oom killer linux
Look at either the configuration information or how to create exclusions.
Note, your system admin may have to get involved with this.
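Before involving the admin, it may help to see how much memory and page file (swap) a compute node actually has. The following is a rough sketch, assuming a Linux node with Python 3; the helper names parse_meminfo and read_text are hypothetical, and the numbers describe the node and the script's own process, not the MPI job itself.

```python
from pathlib import Path


def parse_meminfo(text):
    """Turn /proc/meminfo text into {field: integer value}.

    Most values are in kB; the HugePages_* entries are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_text(path):
    """Return the file contents, or None if the file is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    meminfo = read_text("/proc/meminfo")
    if meminfo is None:
        print("/proc/meminfo not available (not a Linux node?)")
    else:
        info = parse_meminfo(meminfo)
        for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
            print(f"{key}: {info.get(key, 'n/a')} kB")
    # A higher oom_score means the kernel would pick this process sooner.
    score = read_text("/proc/self/oom_score")
    print("oom_score:", score.strip() if score else "n/a")
```

A SwapTotal of 0 kB, or a MemAvailable value far below what the output step needs on the gathering rank, would be consistent with the page file explanation above, though it does not prove it.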