**** EDIT 2 **** Ignore this message. My mistake: I read the ":" in the place list as a range (like "-") instead of a count.
EDIT: Using Windows 10 Pro, Parallel Studio 2019 Update 4, MS VS 2019
On a KNL 7210 the following program generates a runtime error:
program bug_omp_get_place_num_procs
USE lapack95, ONLY: HEEVR
USE f95_precision, ONLY: WP => SP
USE omp_lib
use kernel32
use mkl_service
implicit none
call SETENVQQ("OMP_NUM_THREADS=4")
!KNL call SETENVQQ("OMP_PLACES={0:12},{13,15,17},{100:112},{113:145}")
!i72600K call SETENVQQ("OMP_PLACES={0},{1,2,3},{6,7},{5}")
call SETENVQQ("OMP_PLACES={0:12},{13,15,17},{100:112},{113:145}")
call SETENVQQ("OMP_PROC_BIND=")
call SETENVQQ("KMP_AFFINITY=")
call SETENVQQ("KMP_HW_SUBSET=")
print *,"kmp_get_affinity_max_proc()",kmp_get_affinity_max_proc()
!$omp parallel
!$omp critical
print *, omp_get_num_threads(), omp_get_thread_num(), OMP_GET_PLACE_NUM_PROCS(omp_get_thread_num())
print *
!$omp end critical
!$omp end parallel
print *,"done"
end program bug_omp_get_place_num_procs
OMP: Warning #123: Ignoring invalid OS proc ID 256.
kmp_get_affinity_max_proc() 256
4 0 12
4 1 3
OMP: Error #114: kmp_set_affinity: invalid mask.
OMP: Error #114: kmp_set_affinity: invalid mask.
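For reference, the OpenMP place-list syntax is {lower:length} (optionally {lower:length:stride}), so the number after the colon is a count, not an upper bound. Below is a minimal sketch in standard Fortran of what the KNL place list above would cover under that reading. The helper last_proc is hypothetical (it is not part of omp_lib), and the limit of 256 logical processors numbered 0..255 is an assumption taken from the kmp_get_affinity_max_proc() output.

```fortran
module place_interval_sketch
  implicit none
contains
  ! Last proc ID covered by an OpenMP interval {lower:length:stride}.
  ! (last_proc is a hypothetical helper, not part of omp_lib.)
  pure integer function last_proc(lower, length, stride)
    integer, intent(in) :: lower, length, stride
    last_proc = lower + (length - 1) * stride
  end function last_proc
end module place_interval_sketch

program show_place_intervals
  use place_interval_sketch
  implicit none
  integer, parameter :: max_procs = 256   ! assumption: proc IDs 0..255
  integer, parameter :: lower(3)  = [0, 100, 113]
  integer, parameter :: length(3) = [12, 112, 145]
  integer :: i, last
  do i = 1, size(lower)
    last = last_proc(lower(i), length(i), 1)
    print '(a,i0,a,i0,a,i0,a,i0)', "{", lower(i), ":", length(i), &
          "} covers procs ", lower(i), "..", last
    if (last >= max_procs) print '(a)', "  -> runs past the last proc ID"
  end do
end program show_place_intervals
```

Read this way, {0:12} covers 0..11 (12 processors, matching the first output line), {100:112} covers 100..211, and {113:145} would reach proc ID 257, which seems consistent with the warning about proc ID 256.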
On a Core i7-2600K there is no such error:
program bug_omp_get_place_num_procs
USE lapack95, ONLY: HEEVR
USE f95_precision, ONLY: WP => SP
USE omp_lib
use kernel32
use mkl_service
implicit none
call SETENVQQ("OMP_NUM_THREADS=4")
!KNL call SETENVQQ("OMP_PLACES={0:12},{13,15,17},{100:112},{113:145}")
!i72600K call SETENVQQ("OMP_PLACES={0},{1,2,3},{6,7},{5}")
call SETENVQQ("OMP_PLACES={0},{1,2,3},{6,7},{5}")
call SETENVQQ("OMP_PROC_BIND=")
call SETENVQQ("KMP_AFFINITY=")
call SETENVQQ("KMP_HW_SUBSET=")
print *,"kmp_get_affinity_max_proc()",kmp_get_affinity_max_proc()
!$omp parallel
!$omp critical
print *, omp_get_num_threads(), omp_get_thread_num(), OMP_GET_PLACE_NUM_PROCS(omp_get_thread_num())
print *
!$omp end critical
!$omp end parallel
print *,"done"
end program bug_omp_get_place_num_procs
kmp_get_affinity_max_proc() 8
4 0 1
4 1 3
4 2 2
4 3 1
done
Could you check into this?
Jim Dempsey
EDIT:
Further testing... (removing the call to omp_get_place_num_procs)
The issue is not due to omp_get_place_num_procs, but rather to OMP_PLACES when used on a system with more than one processor group (Windows), and (I assume) where a place resides outside the main thread's processor group.
Jim Dempsey
Hi Jim,
I'm dusting off my KNL systems. I can take a look tomorrow. So do you think this is a bug on KNL?
Ronald,
There is no problem. I found my error. See EDIT on first post.
I do have a minor issue on my KNL using Windows 10.
I installed a retail copy of Windows 10 Pro on the KNL.
I am experiencing what appear to be significant issues with the keyboard, both USB wired and USB Bluetooth. Both respond to key presses with a noticeable delay, and there is about a 25% chance of a double strike. It takes several attempts to get logged in. I have been unable to find a solution by Googling.
This behavior does not appear when I remote desktop into the system, which is what I usually do, as I have a 4K 28" monitor on the remote system.
If you get around to dusting off your KNL (and install Windows 10), see whether you have similar issues, or whether you know of this issue. I did install the "current" drivers from the Supermicro site (K1SPE motherboard in that special Ninja offering from years back).
I did not have this problem using CentOS 7.2 on that same system.
Using Remote Desktop, I do experience some delays. For example, a build of an Intel IVF project in MS VS 2019 will sometimes hang for 10-20 seconds; other times it builds right away. I am aware that the CPU is a poky 1.2GHz/core, but its pokiness should be consistent.
Merry Christmas
Jim
Ron,
IMHO there is an issue with the Windows implementation of OpenMP on manycore systems. This may be documented as a limitation, but I think it can be corrected.
Windows uses the concept of Processor Groups, and each group is limited to 64 hardware threads. For example, the KNL system has 256 HW threads, so Windows creates four Processor Groups (64 threads each).
The problem I encountered appears when using the MKL threaded library from an OpenMP threaded application.
When OMP_PLACES={...},{...},... is used to specify the main program's OpenMP places, each main OpenMP thread is bound to one place.
For example, with 16 places of 8 threads each, each of the 16 MKL instances gets the 8 threads of the place given to its main thread.
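As an illustration of the 16-places-of-8-threads layout, here is a minimal sketch in standard Fortran that builds an explicit place list in the start:count form. The function make_places is a hypothetical helper, and it assumes the processors are numbered contiguously from 0.

```fortran
module places_string_sketch
  implicit none
contains
  ! Build an explicit place list such as {0:8},{8:8},... (start:count form).
  ! (make_places is a hypothetical helper, not part of any library.)
  function make_places(n_places, procs_per_place) result(places)
    integer, intent(in) :: n_places, procs_per_place
    character(len=:), allocatable :: places
    character(len=32) :: item
    integer :: i
    places = ""
    do i = 0, n_places - 1
      ! Each place starts where the previous one ended.
      write (item, '(a,i0,a,i0,a)') "{", i * procs_per_place, ":", &
            procs_per_place, "}"
      if (i > 0) places = places // ","
      places = places // trim(item)
    end do
  end function make_places
end module places_string_sketch

program show_places_string
  use places_string_sketch
  implicit none
  print '(a)', "OMP_PLACES=" // make_places(16, 8)
end program show_places_string
```

The OpenMP specification also allows the abbreviated form {0:8}:16:8 (16 places, each shifted by 8) for the same layout. With 8 processors per place and 64 per processor group, no place in this layout crosses a group boundary.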
The problem is that one cannot specify a place that spans processor groups.
OMP_PLACES={0:128},{128:128}
errors out (place spanning processor group)
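To check a place list before handing it to the runtime, one can test whether a place crosses a processor-group boundary. The sketch below, in standard Fortran, assumes that OpenMP proc IDs map to Windows processor groups as ID/64 (consistent with 256 HW threads in four groups, but an assumption nonetheless). The names group_of and spans_groups are hypothetical.

```fortran
module group_span_sketch
  implicit none
  integer, parameter :: group_size = 64   ! Windows limit per processor group
contains
  ! Assumed mapping: proc IDs 0..63 -> group 0, 64..127 -> group 1, ...
  pure integer function group_of(proc_id)
    integer, intent(in) :: proc_id
    group_of = proc_id / group_size
  end function group_of

  ! True if the place {lower:length} touches more than one group.
  pure logical function spans_groups(lower, length)
    integer, intent(in) :: lower, length
    spans_groups = group_of(lower) /= group_of(lower + length - 1)
  end function spans_groups
end module group_span_sketch

program show_group_span
  use group_span_sketch
  implicit none
  print *, "{0:128}   spans groups: ", spans_groups(0, 128)
  print *, "{128:128} spans groups: ", spans_groups(128, 128)
  print *, "{0:64}    spans groups: ", spans_groups(0, 64)
end program show_group_span
```

Under this assumed mapping, both {0:128} and {128:128} cross a group boundary, while {0:64} stays inside group 0.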
While MKL is capable of using multiple processor groups (e.g. with no places specified and one main thread), you (we) cannot use OMP_PLACES to specify a larger area for MKL to use.
It would be acceptable for the first-level OpenMP runtime to bind the main thread to one of the processor groups that the place spans, while leaving the full place available for MKL to use.
I hope this makes sense to you.
Jim Dempsey
