
Safe way to terminate VTune analysis with MIC?

Pramod_K_
Beginner

Dear All,

I am using VTune (2015.1.1.380310) to run bandwidth analysis on MIC. Sometimes I end up with a large amount of profile data, and then VTune takes a long time to finish. I have two questions:

  • Today I ran VTune with 120 threads on a single MIC card for a simulation time of 160 seconds. After the program ended, I saw no progress or messages from VTune for three hours. Is this normal, or has VTune hung? I can see the amplxe-runss and amplxe-python processes running on the host. How can I check whether it has hung?
  • This is important: if I try to kill, suspend, or cancel the VTune analysis with Ctrl+C or by killing the process, the sep server has to be restarted, and I have to ask the sysadmins to do this every time.

Are these known issues? Is there any workaround to safely cancel a VTune analysis with MIC? (I am familiar with the pause/resume API and other ways to reduce the size of the profile collection.)

 

9 Replies
David_A_Intel1
Employee

Hi Pramod:

Are you using the command line or the GUI to start the collection? If GUI, try pressing the "Cancel" button. If command line, try 'amplxe-cl -C cancel -r <resultsdir>'. Canceling will lose the results, but may terminate gracefully. You could also try the 'stop' command.

I don't think three hours without any progress is normal.  Are you writing your project and result files to a local drive or an NFS drive?  Performance of the NFS drive can impact results finalization time.
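
As a rough sketch of the stop and cancel commands mentioned above, the helpers below wrap them so that a stop is tried first and a cancel is only the fallback. The function names and the VTUNE_CL variable are hypothetical; the amplxe-cl options are the ones quoted in this thread, and the sketch assumes that stop keeps the data collected so far while cancel discards it. The result directory passed in must be the one used by the running collection.

```shell
# Sketch only: function names and VTUNE_CL are hypothetical helpers;
# the amplxe-cl options are the ones quoted in this thread.
VTUNE_CL="${VTUNE_CL:-amplxe-cl}"

# Ask a running collection to stop (assumed to keep the data gathered so far).
vtune_stop() {
    "$VTUNE_CL" -command stop -r "$1"
}

# Cancel a running collection; its results are lost.
vtune_cancel() {
    "$VTUNE_CL" -command cancel -r "$1"
}

# Prefer a stop, and cancel only if the stop command itself fails.
vtune_end_collection() {
    vtune_stop "$1" || vtune_cancel "$1"
}

# Example (from a second console):
#   vtune_end_collection <resultsdir>
```

Trying stop before cancel is a design choice for this sketch, not documented behaviour: it simply avoids throwing data away when a graceful stop is enough.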

Pramod_K_
Beginner

Thanks for the quick response! I am writing to a local drive, which is sufficiently fast. Actually, the analysis with 60 threads finishes quickly with 300 MB of profile data, but the run with 120 threads is stuck.

I am using the command line. I tried your command but got the following error:

amplxe-cl -C cancel -r vtune_paper_hpcopt_120t_120f_bandwidth/

amplxe: Fatal error: The specified result directory "...path../vtune_paper_hpcopt_120t_120f_bandwidth" does not provide a path to the running collection. Please specify a valid path.

I can see that VTune is still running with that result directory. These are the running processes:

$ ps -aux | grep amplx

kumbhar  22846  0.0  0.0 3590036 40832 pts/4   Sl+  14:49   0:00 amplxe-cl -data-limit=0 -resume-after=120 -collect bandwidth -r vtune_paper_hpcopt_120t_120f_bandwidth --target-system=mic-native:0 --search-dir=. -- KMP_AFFINITY=verbose,balanced KMP_PLACE_THREADS=60c,2t OMP_NUM_THREADS=120 LD_LIBRARY_PATH=/opt/..../lib /opt/intel/impi/4.1.2.040/mic/bin/mpiexec.hydra -n 1 hpcopt.prof-none.mic_linux -e 2

kumbhar  22863  0.1  0.0 1292072 23100 pts/4   Sl+  14:49   0:24 /opt/intel/vtune_amplifier_xe_2015.1.1.380310/bin64/amplxe-python /opt/intel/vtune_amplifier_xe_2015.1.1.380310/bin64/amplxe-runss.py --target-system=mic-native:0 --no-modules --log-folder=/tmp/amplxe-log-kumbhar/2015-04-13-Mon-14-49-41-446011.amplxe-cl/ --ui-output-format xml --option-file /home/k......./vtune_paper_hpcopt_120t_120f_bandwidth/config/runsa.options

kumbhar  22867  0.0  0.0 11173328 31724 pts/4  Sl+  14:49   0:15 /opt/intel/vtune_amplifier_xe_2015.1.1.380310/bin64/../bin64/amplxe-runss --ui-output-format xml --result-dir /home/kumbhar/workarena/systems/jknc/repos/bbp/coreneuron/paper/results/vtune_manual/vtune_paper_hpcopt_120t_120f_bandwidth --option-file /home/kum............/vtune_paper_hpcopt_120t_120f_bandwidth/config/runsa.options

 

Any suggestions?

Pramod_K_
Beginner

The result directory contains:

$ ls vtune_paper_hpcopt_120t_120f_bandwidth/
config  data.0  runtool.22867.ipc  sqlite-db  vtune_paper_hpcopt_120t_120f_bandwidth.amplxe

 

TimP
Honored Contributor III

It looks like you should reduce the sampling rate when increasing the number of threads, either by increasing the sample-after values or by increasing the expected run time in the advanced section of the GUI menu.

I too have had to ring up the sysadmin after hanging VTune.

David_A_Intel1
Employee

The command line also supports the "estimated duration" setting. See '-target-duration-type' and its possible values. The default is 'short', meaning one to 15 minutes.

-target-duration-type=veryshort | short | medium | long

Pramod said he already knew how to "limit data collection", so I focused on his request to "stop data collection."

Ah!  But, I see Pramod has *disabled* the data limit ('-data-limit=0')!!   Pramod, this is highly discouraged exactly for the reason you are experiencing!! You can try *raising* the limit, but if you remove the limit and collect a lot of data, bad things will happen! :(

What happens if you don't remove the limit (i.e., let it default to the 500 MB limit) and profile your app?  Does it work?  Does it not collect data for the entire run?  What "elapsed time" is reported by VTune Amplifier?  Do you *need* to profile the entire run, or is there initialization processing that can be skipped?  You can control when data collection starts from the GUI or with command-line options, as well as the API, which you already mentioned.

Pramod_K_
Beginner

Thanks again for all the info! Our application has an initialisation phase, which was taking 140 seconds, and then a solver phase of 10 seconds. Initially I wasn't able to see the solver in the profile, so I set -data-limit=0 (i.e. unlimited) and -resume-after=120. This is where I made a mistake! I wanted to resume after 120 seconds and not 120 milliseconds.

I could easily try the options suggested above or add the pause/resume API, but the problem is that I can't kill VTune for the aforementioned reason: if I kill the VTune analysis, I have to wait until tomorrow for the sysadmins to restart the sep server :)

Has VTune collected too much data for the above run, and is that the reason it's slow or not responding? I'm not sure, though, as I see only a few MB in the result directory after 7 hours:

$ du -h vtune_paper_hpcopt_120t_120f_bandwidth
1.6M	vtune_paper_hpcopt_120t_120f_bandwidth/sqlite-db
524K	vtune_paper_hpcopt_120t_120f_bandwidth/data.0
68K	vtune_paper_hpcopt_120t_120f_bandwidth/config
2.2M	vtune_paper_hpcopt_120t_120f_bandwidth
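
One rough way to tell a slow finalization from a hang is to sample the result directory size and the collector's CPU time twice and compare. This is only a sketch under assumptions: the function names are hypothetical, and "no change in either value" is taken as a hint of a stall, not proof. The PID would be that of the amplxe-runss process from the ps listing.

```shell
# Sketch only: all function names are hypothetical.

# Print the size of a directory in KiB.
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Print the accumulated CPU time of a process as reported by ps.
cpu_time_of() {
    ps -o time= -p "$1" | tr -d ' '
}

# usage: check_progress RESULT_DIR PID [INTERVAL_SECONDS]
# Samples both values twice and prints "progress" if either one changed,
# otherwise "no progress".
check_progress() {
    dir=$1
    pid=$2
    interval=${3:-60}
    size1=$(dir_size_kb "$dir")
    cpu1=$(cpu_time_of "$pid")
    sleep "$interval"
    size2=$(dir_size_kb "$dir")
    cpu2=$(cpu_time_of "$pid")
    if [ "$size1" != "$size2" ] || [ "$cpu1" != "$cpu2" ]; then
        echo "progress"
    else
        echo "no progress"
    fi
}

# Example: compare two samples taken five minutes apart.
#   check_progress vtune_paper_hpcopt_120t_120f_bandwidth 22867 300
```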

In short, it would be great to have a clean way to terminate a VTune analysis.

Thank you for all your quick help!

TimP
Honored Contributor III

In principle you can start paused and resume after 120 seconds, but if you need to resume at a repeatable point, adding the VTune API call is better. As David A. said, you should limit the size of the collected data set to avoid a hang.

Pramod_K_
Beginner

Thank you all! Today I removed -data-limit=0 and added -resume-after to skip the first 140 seconds of the initialisation phase. I was able to collect the profiling results with 120 and 240 threads.
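
For reference, a sketch of what such a command line might look like, based on the original invocation shown earlier in the thread with -data-limit=0 dropped. The function name and the VTUNE_CL variable are hypothetical. The unit of -resume-after is deliberately left to the caller: the earlier post suggests this version reads it as milliseconds, so it is worth checking 'amplxe-cl -help collect' before choosing a value.

```shell
# Sketch only: vtune_bandwidth_mic and VTUNE_CL are hypothetical helpers;
# the amplxe-cl options are taken from the command line shown in this thread.
VTUNE_CL="${VTUNE_CL:-amplxe-cl}"

# usage: vtune_bandwidth_mic RESULT_DIR RESUME_AFTER APP [ARGS...]
vtune_bandwidth_mic() {
    result_dir=$1
    resume_after=$2   # unit depends on the VTune version; verify before use
    shift 2
    # No -data-limit option is passed, so the default limit stays in effect.
    "$VTUNE_CL" -collect bandwidth \
        -resume-after="$resume_after" \
        -r "$result_dir" \
        --target-system=mic-native:0 \
        --search-dir=. \
        -- "$@"
}

# Example (the delay value and application are placeholders):
#   vtune_bandwidth_mic my_result "$delay" ./my_app
```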

Peter_W_Intel
Employee

Tim Prince wrote:

In principle you can start paused and resume after 120 seconds, but if you need to resume at a repeatable point, adding the VTune API call is better. As David A. said, you should limit the size of the collected data set to avoid a hang.

Tim is right! Using the VTune pause/resume API is the precise way to collect exactly the data you want.

Another approach is to start the collection in paused mode ("-start-paused") in a first console, then open a second console and run "amplxe-cl -command resume -r r000?". You need to specify the right VTune result directory, the one generated in the first console. The benefit is that you don't need to insert VTune API calls in your code.
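
The two-console approach can be sketched as follows. The function names and the VTUNE_CL variable are hypothetical; the options are the ones named in this thread, plus '-command pause', which is assumed here to be the counterpart of resume.

```shell
# Sketch only: function names and VTUNE_CL are hypothetical helpers.
VTUNE_CL="${VTUNE_CL:-amplxe-cl}"

# Console 1: launch the application with data collection paused.
# usage: vtune_start_paused ANALYSIS RESULT_DIR APP [ARGS...]
vtune_start_paused() {
    analysis=$1
    result_dir=$2
    shift 2
    "$VTUNE_CL" -collect "$analysis" -start-paused -r "$result_dir" -- "$@"
}

# Console 2: resume collection for the result directory used in console 1.
vtune_resume() {
    "$VTUNE_CL" -command resume -r "$1"
}

# Console 2: pause collection again once the region of interest is over.
vtune_pause() {
    "$VTUNE_CL" -command pause -r "$1"
}

# Example:
#   console 1: vtune_start_paused bandwidth my_result ./my_app
#   console 2: vtune_resume my_result
```

Because the resume is triggered by hand, the point at which collection starts will vary from run to run; the API calls remain the better choice when it has to be repeatable.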
