Hi all,
I have a code that is parallelized across 24 processes using OpenMPI. Each process makes use of the MKL library.
If I compile my code against the multithreaded MKL library and set OMP_NUM_THREADS to, say, 4, does that actually mean that each process using MKL will use 4 threads? So I would need 24*4 = 96 threads in total?
I have many other questions, but I will start with this one, which is more basic.
Many thanks!
Rafael
6 Replies
Yes, that's what it means. The OpenMP library does not know that OpenMPI (or any other parallelization library) is also managing streams of execution, so you do run the risk of "oversubscribing" the processor.
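To make the arithmetic concrete, here is a minimal sketch that multiplies ranks by threads and compares the result with the cores on one machine. The rank and thread counts are the ones from the question; the variable names and the use of `getconf _NPROCESSORS_ONLN` to count cores are my own assumptions, and on a cluster you would compare against the cores of all allocated nodes rather than a single one.

```shell
#!/bin/sh
# Values from the question: 24 MPI ranks, 4 MKL/OpenMP threads per rank.
NUM_RANKS=24
OMP_NUM_THREADS=4

# Each rank starts its own OpenMP thread team, so the totals multiply.
TOTAL_THREADS=$((NUM_RANKS * OMP_NUM_THREADS))
echo "total threads: $TOTAL_THREADS"

# Cores online on this machine (assumption: getconf supports this key here;
# fall back to 1 if it does not).
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

if [ "$TOTAL_THREADS" -gt "$CORES" ]; then
  echo "running all ranks on this one machine ($CORES cores) would oversubscribe it"
fi
```

Oversubscription by itself usually shows up as poor performance rather than a crash, so it is worth checking but may not be the whole story.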
OK, that helps, thanks. That is probably why I was getting a segmentation fault message.
I am submitting a job with qsub. If I want the job to launch 24 processes, each using MKL with 4 threads, what should my PBS file look like?
===============================
#PBS -N Job_Estimation
#PBS -o PBS_out
#PBS -e PBS_error
#PBS -l walltime=00:10:00
#PBS -l nodes=24:ppn=4
setenv OMP_NUM_THREADS 4
mpirun -bynode -np 24 ./execname
================================
Does that look right? I am still getting a segmentation fault message, and I am not sure where to look.
I can't help you there - my MPI knowledge is quite limited. I'm sure someone else here will know.
Did you test a single MPI process with 4 threads to see if you set your stacks large enough?
The PBS settings look reasonable, if you're not using a development version of OpenMPI which supports multiple threaded ranks per node (as Intel MPI does).
Thank you, Tim.
Yes, with just one process it works. I am using 2011 versions of Intel OpenMPI and Intel Fortran Compiler. I tried setting OMP_NUM_THREADS to 1 directly at my prompt, without using qsub and a PBS file.
It worked when I launched it from one node, without using qsub:
mpirun -np 24 ./execname
When I tried another node, it gave a segmentation fault error message.
It also kept giving me a segmentation fault when I submitted with qsub and the PBS file, even with OMP_NUM_THREADS set to 1.
Do you have a hint of what I should be looking at?
Thanks!
Both the shell stack size (e.g. ulimit -s) and the thread stack size (OMP_STACKSIZE or KMP_STACKSIZE) limits come into play. Did you read the article about diagnosing segfaults which is posted at the top of this forum?
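As a rough sketch of what checking both limits could look like in an sh/bash session before a single-rank test run: the 512M value is an arbitrary illustration, not a recommendation, and `./execname` is simply the executable name used earlier in this thread.

```shell
#!/bin/sh
# Shell (main-thread) stack limit. Raising it can fail if the hard limit
# is lower, so report the current value in that case instead of aborting.
ulimit -s unlimited 2>/dev/null || echo "could not raise stack limit; current: $(ulimit -s)"

# Stack size for each OpenMP worker thread (illustrative value only).
# KMP_STACKSIZE is the Intel-specific counterpart of the standard OMP_STACKSIZE.
export OMP_STACKSIZE=512M
export OMP_NUM_THREADS=4

echo "shell stack limit: $(ulimit -s)"
echo "OMP_STACKSIZE=$OMP_STACKSIZE"

# Then try a single rank with 4 threads, for example:
#   mpirun -np 1 ./execname
```

Limits and environment variables set in the launching shell may not carry over to processes started on other nodes or inside a batch job, which could be one reason a run behaves differently from one node to another. In a csh-style script such as the PBS file above, the equivalents would be `limit stacksize` and `setenv`.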