Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

OpenMP code uses fewer processors than are available

vahid_a_1
Beginner
420 Views

Dear All,

I am running my Fortran 90 code on our university's supercomputing facility. The maximum number of available processors is 21, but my code uses only 5 of them. I set export OMP_NUM_THREADS=20, but I do not set the stack size after compilation.

Could you please tell me what I should investigate to understand the problem? The report below shows that the maximum number of available threads is 21 and that my code used only 5 of them.

Resource usage summary:

    CPU time :                                   2471026.00 sec.
    Max Memory :                                 351 MB
    Average Memory :                             74.98 MB
    Total Requested Memory :                     2560.00 MB
    Delta Memory :                               2209.00 MB
    Max Processes :                              5
    Max Threads :                                21

5 Replies
jimdempseyatthecove
Honored Contributor III

Your system administrator may have control over limiting the maximum number of threads any one process can use.

And for any one process, OMP_NUM_THREADS can be restricted by OMP_THREAD_LIMIT.

The report (my guess) indicates you may be running an MPI job with 5 processes (ranks), each process/rank using 21 threads. (5 * 21 threads in use).
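
One way to check how many threads a process is actually running, independent of the batch system's summary, is to read the kernel's per-process status on the compute node. The following is only a sketch: it assumes a Linux node with /proc mounted, and the function name count_threads is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the number of threads (lightweight processes)
# that the Linux kernel reports for a given PID. Assumes /proc is available.
count_threads() {
    local pid="$1"
    # /proc/<pid>/status contains a line of the form "Threads:<tab>N".
    awk '/^Threads:/ { print $2 }' "/proc/${pid}/status"
}

# Example: thread count of the current shell.
count_threads "$$"
```

While the job is running, something like count_threads "$(pgrep -n a.out)" on the compute node should show whether the executable has started roughly as many threads as OMP_NUM_THREADS requests (the runtime may add a helper thread or two).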

TimP do you have any comments on this?

Jim Dempsey

vahid_a_1
Beginner

Dear Jim, this is an OpenMP code, not MPI, so I guess there should be a difference. The administrator tells me that I can use 20 processors with OpenMP code. However, as I mentioned, the report shows that the code uses only 5. The job file that I use to submit my code is:

#BSUB -J pfm
#BSUB -o pfm.o%J
#BSUB -e pfm.e%J
#BSUB -n 20
#BSUB -R "span[ptile=20]"
#BSUB -M 128
#BSUB -R 'rusage[mem=128]'
#BSUB -W 96:00
#BSUB -L /bin/bash
#BSUB -u 
#### queue : ada
#
module load intel/2015A
##
ifort -openmp -o a.out huh014_thienBC.f90
##
export OMP_NUM_THREADS=20
##
export FORT_FMT_RECL=215
##
./a.out
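
A possible diagnostic to place just before ./a.out in a job file like the one above is sketched here. It prints what the job's shell actually sees, so a mistyped export or a scheduler-imposed CPU restriction becomes visible in the job's output file. The function name report_omp_env is hypothetical, and the Cpus_allowed_list lookup assumes a Linux node with /proc.

```shell
#!/bin/bash
# Hypothetical diagnostic: report the OpenMP-related environment and the
# CPUs this job is allowed to run on, as seen from inside the job script.
report_omp_env() {
    echo "host=$(uname -n)"
    echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-<unset>}"
    echo "OMP_THREAD_LIMIT=${OMP_THREAD_LIMIT:-<unset>}"
    # Cpus_allowed_list is the CPU affinity mask inherited from the parent
    # process; it can be narrower than the node total if the job is pinned.
    echo "cpus_allowed=$(awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status)"
}

export OMP_NUM_THREADS=20
report_omp_env
```

If OMP_NUM_THREADS prints as <unset>, or cpus_allowed lists fewer CPUs than requested with #BSUB -n, that narrows down where the limit is coming from.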

jimdempseyatthecove
Honored Contributor III

What does the following show?

CHARACTER(LEN=100) :: VAL

!$OMP PARALLEL
WRITE(*,*) OMP_GET_THREAD_NUM()
!$OMP END PARALLEL
WRITE(*,*) "max threads", OMP_GET_MAX_THREADS()
WRITE(*,*) "num procs", OMP_GET_NUM_PROCS()
CALL GET_ENVIRONMENT_VARIABLE("OMP_THREAD_LIMIT",VAL)
WRITE(*,*) "OMP_THREAD_LIMIT",VAL
CALL GET_ENVIRONMENT_VARIABLE("KMP_ALL_THREADS",VAL)
WRITE(*,*) "KMP_ALL_THREADS",VAL

Jim Dempsey

vahid_a_1
Beginner

 

The answer is:

max threads           1
num procs             20
OMP_THREAD_LIMIT      'blank'
KMP_ALL_THREADS       'blank'

 

jimdempseyatthecove
Honored Contributor III

I made a few edits.

Before running, set the environment variable

KMP_SETTINGS=true

This will display the OpenMP-related environment variables before the main program starts.
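
In a bash job script, the setting can be applied to a single run without changing the rest of the environment. The wrapper below is a minimal sketch; the function name run_with_omp_diagnostics is hypothetical. It also sets OMP_DISPLAY_ENV, the standard OpenMP 4.0 variable that asks the runtime to print its settings, in case the runtime in use does not recognize the Intel-specific KMP_SETTINGS.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with OpenMP runtime diagnostics enabled.
run_with_omp_diagnostics() {
    # KMP_SETTINGS is specific to the Intel OpenMP runtime;
    # OMP_DISPLAY_ENV is the standard OpenMP 4.0 counterpart.
    # Both apply only to the command started on this line.
    KMP_SETTINGS=true OMP_DISPLAY_ENV=true "$@"
}

# Usage in the job script (a.out is the executable built earlier):
#   run_with_omp_diagnostics ./a.out
```

The settings listing then appears in the job's output file ahead of the program's own output.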

Program

program NumThreads
    use omp_lib
    implicit none
    CHARACTER(LEN=100) :: VAL
    !$OMP PARALLEL
    !$OMP MASTER
    WRITE(*,*) "num threads",OMP_GET_NUM_THREADS()
    !$OMP END MASTER
    !$OMP BARRIER
    WRITE(*,*) OMP_GET_THREAD_NUM()
    !$OMP END PARALLEL
    WRITE(*,*) "max threads", OMP_GET_MAX_THREADS()
    WRITE(*,*) "num procs", OMP_GET_NUM_PROCS()
    CALL GET_ENVIRONMENT_VARIABLE("OMP_THREAD_LIMIT",VAL)
    WRITE(*,*) "OMP_THREAD_LIMIT",VAL
    CALL GET_ENVIRONMENT_VARIABLE("KMP_ALL_THREADS",VAL)
    WRITE(*,*) "KMP_ALL_THREADS",VAL
end program NumThreads

Results on my system

User settings:

   KMP_SETTINGS=true

Effective settings:

   KMP_ABORT_DELAY=0
   KMP_ABORT_IF_NO_IRML=false
   KMP_ADAPTIVE_LOCK_PROPS='1,1024'
   KMP_ALIGN_ALLOC=64
   KMP_ALL_THREADPRIVATE=128
   KMP_ALL_THREADS=32768
   KMP_ASAT_DEC=1
   KMP_ASAT_FAVOR=0
   KMP_ASAT_INC=4
   KMP_ASAT_INTERVAL=5
   KMP_ASAT_TRIGGER=5000
   KMP_ATOMIC_MODE=1
   KMP_BLOCKTIME=200
   KMP_CPUINFO_FILE: value is not defined
   KMP_DETERMINISTIC_REDUCTION=false
   KMP_DUPLICATE_LIB_OK=false
   KMP_FORCE_REDUCTION: value is not defined
   KMP_FOREIGN_THREADS_THREADPRIVATE=true
   KMP_FORKJOIN_BARRIER='2,2'
   KMP_FORKJOIN_BARRIER_PATTERN='linear,linear'
   KMP_FORKJOIN_FRAMES=true
   KMP_FORKJOIN_FRAMES_MODE=3
   KMP_GTID_MODE=2
   KMP_HANDLE_SIGNALS=false
   KMP_HOT_TEAMS_MAX_LEVEL=1
   KMP_HOT_TEAMS_MODE=0
   KMP_INIT_AT_FORK=true
   KMP_INIT_WAIT=2048
   KMP_ITT_PREPARE_DELAY=0
   KMP_LIBRARY=throughput
   KMP_LOCK_KIND=queuing
   KMP_MALLOC_POOL_INCR=1M
   KMP_MONITOR_STACKSIZE: value is not defined
   KMP_NEXT_WAIT=1024
   KMP_NUM_LOCKS_IN_BLOCK=1
   KMP_PLAIN_BARRIER='2,2'
   KMP_PLAIN_BARRIER_PATTERN='linear,linear'
   KMP_REDUCTION_BARRIER='1,1'
   KMP_REDUCTION_BARRIER_PATTERN='hyper,hyper'
   KMP_SCHEDULE='static,balanced;guided,iterative'
   KMP_SETTINGS=true
   KMP_STACKOFFSET=64
   KMP_STACKPAD=0
   KMP_STACKSIZE=2M
   KMP_STORAGE_MAP=false
   KMP_TASKING=2
   KMP_TASK_STEALING_CONSTRAINT=1
   KMP_USE_IRML=false
   KMP_VERSION=false
   KMP_WARNINGS=true
   OMP_CANCELLATION=false
   OMP_DISPLAY_ENV=false
   OMP_DYNAMIC=false
   OMP_MAX_ACTIVE_LEVELS=2147483647
   OMP_NESTED=false
   OMP_NUM_THREADS: value is not defined
   OMP_PLACES: value is not defined
   OMP_PROC_BIND='false'
   OMP_SCHEDULE='static'
   OMP_STACKSIZE=2M
   OMP_THREAD_LIMIT=32768
   OMP_WAIT_POLICY=PASSIVE
   KMP_AFFINITY='noverbose,warnings,respect,granularity=core,duplicates,none'

 num threads           8
           0
           6
           2
           7
           1
           3
           4
           5
 max threads           8
 num procs           8
 OMP_THREAD_LIMIT



 KMP_ALL_THREADS



Press any key to continue . . .

Jim Dempsey
