Hi,
I am trying to offload computation to two Xeon Phi coprocessors using code similar to the following:
!$omp parallel do num_threads(16) ....
do i = 1,n
  some computations
  if (some condition true only once) then
    offload to phi 1
  else if (a different condition true only once) then
    offload to phi 2
  end if
end do
The above code is executed for several timesteps (with two offloads per timestep). In all the offloading I have done until now, only the first offload to each Phi incurred startup overhead; subsequent offloads took similar times for similar, regular computation. Previously, the details of thread placement were printed only during the first offload (e.g., OMP: Info #242: KMP_AFFINITY: pid 55645 thread 1919 bound to OS proc set {240}).
In the above code, however, most (not all) of the offloads include this startup overhead, and the thread placement details are printed for them as well (i.e., most of them behave like startup offloads).
Any hints as to why this might be happening?
Thanks,
Amlesh
Please note that, generally, the thread placement message used to be printed only once (in the first offload to each MIC), with a message like: OMP: Info #242: KMP_AFFINITY: pid 55645 thread 239 bound to OS proc set {240}
The highest thread id would be 239 (i.e., 240 threads). In the above case, however, the message is not only printed multiple times (i.e., for almost all offloads) but also shows thread ids up to 1919, e.g.: OMP: Info #242: KMP_AFFINITY: pid 55645 thread 1919 bound to OS proc set {240}
Can someone please point me towards the most likely causes of this issue?
Thanks,
Amlesh
You say "the thread placement message used to be printed only once" and that you are now seeing different behavior.
What did you change between the two scenarios?
Can you create a minimal test-case that demonstrates the problem?
The offload library will create a new process on each MIC device only once.
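A minimal test case of the kind asked for above might look like the following sketch. It is not the original poster's code: the trigger conditions (`i == 1`, `i == n`), the device numbers `mic:0` and `mic:1`, the loop sizes, and the harmonic-sum workload are all placeholders chosen for illustration. The `!dir$ offload` lines follow the Intel compiler's offload directive syntax as best recalled and should be checked against the compiler documentation; other compilers treat them as comments, so the program then simply runs on the host.

```fortran
program offload_repro
  use omp_lib
  implicit none
  ! Hypothetical sizes: n host iterations, nsteps timesteps, m work items.
  integer, parameter :: n = 64, nsteps = 5, m = 1000000
  integer :: step, i, j
  double precision :: s0, s1, t0, t1, tstart

  do step = 1, nsteps
     s0 = 0.0d0
     s1 = 0.0d0
     t0 = 0.0d0
     t1 = 0.0d0

     !$omp parallel do num_threads(16) private(j, tstart)
     do i = 1, n
        if (i == 1) then
           ! Placeholder for "some condition true only once".
           tstart = omp_get_wtime()
           !dir$ offload begin target(mic:0) inout(s0)
           !$omp parallel do reduction(+:s0)
           do j = 1, m
              s0 = s0 + 1.0d0 / dble(j)
           end do
           !$omp end parallel do
           !dir$ end offload
           t0 = omp_get_wtime() - tstart
        else if (i == n) then
           ! Placeholder for "a different condition true only once".
           tstart = omp_get_wtime()
           !dir$ offload begin target(mic:1) inout(s1)
           !$omp parallel do reduction(+:s1)
           do j = 1, m
              s1 = s1 + 1.0d0 / dble(j)
           end do
           !$omp end parallel do
           !dir$ end offload
           t1 = omp_get_wtime() - tstart
        end if
     end do
     !$omp end parallel do

     ! Per-timestep timings: a startup-like offload should stand out here.
     print '(a,i0,a,es10.3,a,es10.3,a,f8.4)', 'step ', step, &
           ': offload 0 took ', t0, ' s, offload 1 took ', t1, ' s, sum ', s0
  end do
end program offload_repro
```

If the first timestep is slow and later ones are uniformly fast, the startup cost is being paid once per device as expected. Timings that stay high in later timesteps would reproduce the behavior described in the question.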
That you are seeing thread numbers as high as 1919 hints that you have somehow enabled nested parallelism on MIC and are running several parallel teams of 240 threads simultaneously (ids up to 1919 imply 1920 threads, i.e., eight teams of 240).
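One way to test the nested-parallelism hypothesis is to ask the OpenMP runtime directly. The sketch below is a generic check, not taken from the thread: the team sizes (2 outer, 3 inner) are arbitrary small values so the output stays readable, and it uses only standard `omp_lib` routines. The same calls could be placed inside the offloaded region to see what the coprocessor's runtime reports.

```fortran
program nested_check
  use omp_lib
  implicit none

  ! Runtime settings that decide whether inner regions get their own teams.
  print '(a,i0)', 'max active levels: ', omp_get_max_active_levels()
  print '(a,i0)', 'max threads:       ', omp_get_max_threads()

  !$omp parallel num_threads(2)
  ! Inner region: with nesting enabled, each outer thread forks its own team.
  !$omp parallel num_threads(3)
  !$omp single
  !$omp critical
  print '(a,i0,a,i0,a,i0)', 'outer thread ', omp_get_ancestor_thread_num(1), &
        ': inner team size ', omp_get_num_threads(), &
        ', active level ', omp_get_active_level()
  !$omp end critical
  !$omp end single
  !$omp end parallel
  !$omp end parallel
end program nested_check
```

An inner team size of 1 means nesting is effectively off. A larger inner team size means every outer thread spawns its own team, so the total thread count is the product of the two team sizes, which is how thread ids can climb well past the 240 expected on one coprocessor.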