Hi,
Is there any way to use VTune under the Slurm queuing system? I cross-posted with the HPC thread as well.
Thanks,
Matt
5 Replies
Hi Matt,
I'm not familiar with the Slurm queuing system; I guess it is a resource-management utility for cluster systems.
VTune Amplifier XE can only collect performance data on one node of a cluster system; the user may install the product again on other nodes.
If you run VTune with an MPI job, please refer to this article.
Here is another article for your reference, on installing the product on a cluster system.
Regards, Peter
I read that article. It was informative; now I know why it isn't working. Basically it says:
Users can also view results via the GUI by using the command "amplxe-gui". You will find that only the process "python" is displayed, for example:
Root cause:
mpiexec doesn't run the MPI program directly; it connects to MPI's mpd daemon via a socket and passes all the parameters, so the program is not a child process of mpiexec.
So now I don't know what to do.
I tried
srun -n 1 amplxe-cl -V
and that works.
Then I tried
srun -n 1 amplxe-cl -collect hotspots -r r0002hs -- mycode
and mycode doesn't run. The reason, I think, is that the parameters are not getting passed to my code, which srun creates.
Any other ideas?
I tried the following in my code.
char cmd[64];
int pid = getpid();
sprintf(cmd, "amplxe-cl -collect hotspots -target-pid %d", pid);
system(cmd);
and I get an error: this analysis doesn't support system-wide profiling or attaching to a process.
But the examples in the docs on Intel's site suggest that it does, from
http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/lin/ug_docs/olh/cli_ref/target-pid.html#target-pid
$ amplxe-cl -collect hotspots -target-pid 1234
I am using version 11 Update 1.
Thanks
Final comment for a while. It looks like one can use lightweight-hotspots collection, however, this requires the kernel modules to be installed which, of course, they are not. I am guessing all the target-pid collectors require this module. I have put in a help ticket on our big iron system, however, they tend not to address these sorts of issues.
Matt
The attach-to-process (-target-pid) functionality on Linux was added in Update 3. Also, the command to perform the attach doesn't return until the target process finishes - you would need to use fork/exec rather than 'system' to perform a self-attach.
When running the original command you listed ('srun -n 1 amplxe-cl -collect hotspots -r r002hs myapp') - did it produce the result directory? If so, look in the data.0 subdirectory for error messages. If present, that may give some indication why your application is failing to run.
Note that the issue may not be just running under Slurm, but some issue with any of the technologies involved. (Slurm -> MPI -> Linux node )
Mark