I am fairly new to Intel Xeon Phi programming and would appreciate some clarifications that are probably basic for those of you with more experience on this topic.
After ssh-ing to a Xeon Phi (mic0), I ran "top".
To check whether my offload test code was running on the 60 cores of a Phi 5110P, I pressed "f" in "top" and selected "j", which adds the "P" column ("Last Used cpu (SMP)") to the top window.
Before running the code, I could see current and past activity on several "CPUs" (P = 72, 69, 239, 227, 214, 4, etc.).
Therefore, I suspect that P does not show the CPU (core) IDs, which should run from 0 to 59, but rather the thread IDs (from 0 to 239). Is that right?
When I run my test code, two new processes show up under micuser: offload_main and coi_daemon.
Over many runs, coi_daemon takes "P=237, 238, 239" (the last thread IDs of mic0, I suppose).
However, offload_main, the code I am running, always takes "P=9".
If this is really a thread ID, as I suspect, it makes sense, since I set the environment variable that makes MKL single-threaded, as recommended in the MKL FFT documentation.
My goal is to run close to 60 (say 55) instances of single-threaded FFTs in parallel, leaving the remaining cores for "management".
If I am able to do that, I assume I will see several instances of offload_main, with P ranging from 0 through 219 (55 x 4 threads), for example. Is that right?
Now, to achieve this FFT parallelization, I tried adding a "#pragma omp" inside the offload section (#pragma offload) of my code and got "Offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)".
The same code (including the omp section) runs well in host mode (with the #pragma offload disabled), so I am wondering whether OpenMP is allowed inside an offload section. I searched blogs on the subject but could not find a single MKL FFT code example with OpenMP inside an offload section.
Could someone please point me to an answer?
Thanks.
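On the OpenMP question: as far as I know, an OpenMP parallel region is allowed inside a "#pragma offload" block with the Intel compiler, so the SIGSEGV probably has another cause. One common cause (only a guess here, since the failing code is not shown) is a pointer used inside the offload region without an in/out clause that gives its length. Below is a minimal sketch of the pattern in C. The names transform_one, transform_batch and ON_MIC are hypothetical, and transform_one is only a stand-in (it sums the samples) marking where a single-threaded MKL FFT call would go; it is not MKL code.

```c
#include <assert.h>
#include <stddef.h>

/* Functions called inside an offload region must also be compiled for the
 * coprocessor. __INTEL_OFFLOAD is defined by the Intel compiler when offload
 * is enabled; other compilers get an empty macro. */
#ifdef __INTEL_OFFLOAD
#define ON_MIC __attribute__((target(mic)))
#else
#define ON_MIC
#endif

/* Hypothetical stand-in for one single-threaded transform of an n-sample
 * signal. It only sums the samples; a real code would call the FFT here. */
ON_MIC static float transform_one(const float *signal, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += signal[i];
    return sum;
}

/* data holds `batch` signals of n samples each, stored back to back.
 * out receives one result per signal. */
void transform_batch(float *data, float *out, int batch, int n)
{
    assert(batch > 0 && n > 0);

    /* Pointers need explicit lengths so the runtime knows how much to copy
     * to and from the coprocessor. */
    #pragma offload target(mic) in(data : length(batch * n)) out(out : length(batch))
    {
        /* One independent, single-threaded transform per loop iteration. */
        #pragma omp parallel for
        for (int b = 0; b < batch; ++b)
            out[b] = transform_one(data + (size_t)b * n, n);
    }
}
```

A compiler without offload support ignores the offload pragma and runs the loop on the host, which corresponds to the "host mode" test described above. To get about 55 workers on the coprocessor, the thread count can be passed through the environment, for example with MIC_ENV_PREFIX=MIC and MIC_OMP_NUM_THREADS=55; treat the exact values as something to tune rather than a recommendation.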
When you say "visualize," one would have thought you had in mind micsmc-gui (run on the host side of your installation). To see a bar chart of the load on each core (100% corresponds to 4 busy threads per core), select your coprocessor (e.g. mic0) in the upper left box, and open the display from the upper right box.
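On reading the "P" column in top: it shows logical CPU numbers (0 to 239 on a 60-core card with 4 hardware threads per core), not core numbers. The sketch below converts a P value to a core index. It assumes the numbering usually described for this coprocessor generation, where logical CPU 0 and the three highest-numbered CPUs share the last core and CPUs 1 to 4 sit on core 0, and so on. That layout is an assumption worth checking against /proc/cpuinfo on the card, and logical_cpu_to_core is a hypothetical helper name.

```c
#include <assert.h>

/* Assumption: 4 hardware threads per core, as on the Xeon Phi 5110P. */
enum { THREADS_PER_CORE = 4 };

/* Map a logical CPU number (the "P" column in top) to a core index,
 * assuming logical CPU 0 shares the last core with the three
 * highest-numbered CPUs, and CPUs 1..4 are core 0, 5..8 core 1, etc. */
int logical_cpu_to_core(int cpu, int num_cores)
{
    int num_cpus = num_cores * THREADS_PER_CORE;
    assert(cpu >= 0 && cpu < num_cpus);

    if (cpu == 0)
        return num_cores - 1;            /* CPU 0 lives on the last core */
    return (cpu - 1) / THREADS_PER_CORE; /* the rest are grouped in fours */
}
```

Under this assumption, logical_cpu_to_core(9, 60) gives core 2, and 237, 238 and 239 all map to core 59, which would fit coi_daemon staying on the last core.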