
Sampling API Error: resume sampling collection failed.

slohn
Beginner
Hi,

I am currently struggling with the error mentioned above.

One of our programmers integrated performance analysis into our framework so that we can run a hotspot analysis (VTune Amplifier XE Update 8), which works quite nicely. But when I change the hotspot analysis to user-defined hardware event counting like this:

$AMPLXECMD -collect-with runsa -knob event-config="$HWEVENTS" -start-paused -follow-child \
-target-duration-type=medium -no-allow-multiple-runs -no-analyze-system \
-data-limit=500 -slow-frames-threshold=40 -fast-frames-threshold=100 \
-r=$OUTPUTDIR/r@@@ -- $APPLICATION

Then I receive the error message. Reducing the command did not help. The idea is to profile specific algorithms, which then appear in VTune as tasks. When an algorithm is started, a "before" hook runs some custom code, where we put:

taskId = __itt_event_create(typeName.c_str(), typeName.size());
__itt_event_start(state.event);

and if started:

__itt_event_end(state.parent_event);

just before the start, and so on. Between algorithms the profiling is paused and then resumed, which means pause and resume are called with high frequency. Is this a problem? How could I fix it?

After browsing the web, I did not find any solution. Does anybody have an idea?

Thanks,
Stefan
Peter_W_Intel
Employee
I don't know why you used paused mode. Did you resume sampling in your code?
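For context on that question: a collection launched with -start-paused records nothing until it is resumed, for example by the application calling __itt_resume(), and __itt_pause() suspends it again. Below is a minimal sketch of that pattern. The wrapper names, the USE_ITT build switch, the call counters, and run_algorithm() are hypothetical and exist only so the sketch compiles without the VTune headers; only __itt_resume() and __itt_pause() come from ittnotify.h.

```c
#include <assert.h>

/* Build with -DUSE_ITT and link libittnotify.a to drive a real collector.
   Without it, the wrappers below only count calls (hypothetical stand-ins). */
#ifdef USE_ITT
#include "ittnotify.h"
#endif

static int resume_calls = 0;  /* how often collection was resumed */
static int pause_calls  = 0;  /* how often collection was paused  */

/* Hypothetical wrapper: resume data collection for the region of interest. */
static void collection_resume(void)
{
#ifdef USE_ITT
    __itt_resume();
#endif
    ++resume_calls;
}

/* Hypothetical wrapper: pause data collection after the region of interest. */
static void collection_pause(void)
{
    /* Sanity check: every pause must follow a matching resume. */
    assert(resume_calls > pause_calls);
#ifdef USE_ITT
    __itt_pause();
#endif
    ++pause_calls;
}

/* Hypothetical algorithm: sums 1..n, collecting only while it runs. */
static long run_algorithm(int n)
{
    long sum = 0;
    int i;

    collection_resume();      /* with -start-paused, recording starts here */
    for (i = 1; i <= n; ++i)
        sum += i;
    collection_pause();       /* recording stops again */
    return sum;
}
```

Calling run_algorithm() once per algorithm gives one resume/pause pair per call, which is the high-frequency pattern described in the question.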

Secondly, event start/end is only marked in the timeline report, and it applies to user-mode sampling (Hotspots, Concurrency, and Locks and Waits analyses), NOT to PMU event-based sampling.

Here is a simple example, matrix1.c:
[cpp]
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "ittnotify.h"

#define NUM 512

double a[NUM][NUM], b[NUM][NUM], c[NUM][NUM];
__itt_event event_matrix;

void multiply()
{
    unsigned int i, j, k;

    /* mark the start of the region in the timeline */
    __itt_event_start(event_matrix);
    for (i = 0; i < NUM; i++) {
        for (j = 0; j < NUM; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < NUM; k++) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    /* mark the end of the region in the timeline */
    __itt_event_end(event_matrix);
}

int main(void)
{
    clock_t start, stop;

    /* second argument is the length of the event name */
    event_matrix = __itt_event_create("Mark matrix event", 17);

    // start timing the matrix multiply code
    start = clock();
    multiply();
    stop = clock();

    // print elapsed time
    printf("Elapsed time = %lf seconds\n", ((double)(stop - start)) / CLOCKS_PER_SEC);
    return 0;
}
[/cpp]

gcc -g matrix1.c -I/opt/intel/vtune_amplifier_xe_2011/include /opt/intel/vtune_amplifier_xe_2011/lib64/libittnotify.a -lpthread -ldl -o matrix1

# amplxe-cl -collect hotspots -- ./matrix1
Elapsed time = 0.740000 seconds
Using result path `/home/peter/problem_report/r001hs'
Executing actions 75 % Generating a report
Summary
-------

Elapsed Time: 0.761
CPU Time: 0.750
Executing actions 100 % done

Open the result in amplxe-gui and note the "User Task" mark in the timeline report.

slohn
Beginner
Thanks,

I think this clarifies why it is not running.

The only point is that using tasks gives me another way to group processing time, right? But if I use frames, I can do the same, and user-defined event sampling is covered as well, isn't it? So what is the difference between events and frames?

Thanks,
Stefan
Peter_W_Intel
Employee
Using __itt_frame is another approach, suited to cases where you do the same (or similar) work in a loop, so that all performance data is classified per iteration. Please see this article.

__itt_event provides APIs to mark "event start/end" in the timeline report where you run critical code. You would usually use "zoom-in/filter on selection" to focus on this time range when reviewing the result.
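To make the contrast concrete, here is a rough sketch of the frame approach, where each loop iteration is wrapped in its own frame. It assumes the __itt_domain_create(), __itt_frame_begin_v3(), and __itt_frame_end_v3() calls from ittnotify.h, whose availability in a given VTune Amplifier XE release should be checked. The domain name, the USE_ITT build switch, the counters, and process_iterations() are hypothetical and only let the sketch compile without the VTune headers.

```c
#include <assert.h>
#include <stddef.h>

/* Build with -DUSE_ITT and link libittnotify.a to emit real frames.
   Without it, the wrappers below only count calls (hypothetical stand-ins). */
#ifdef USE_ITT
#include "ittnotify.h"
static __itt_domain *domain = NULL;
#endif

static int frames_begun = 0;
static int frames_ended = 0;

/* Hypothetical wrapper: open one frame for the current iteration. */
static void frame_begin(void)
{
#ifdef USE_ITT
    if (domain == NULL)
        domain = __itt_domain_create("example.algorithms"); /* assumed name */
    __itt_frame_begin_v3(domain, NULL);
#endif
    ++frames_begun;
}

/* Hypothetical wrapper: close the frame opened by frame_begin(). */
static void frame_end(void)
{
    /* Sanity check: a frame must be open before it can be closed. */
    assert(frames_begun > frames_ended);
#ifdef USE_ITT
    __itt_frame_end_v3(domain, NULL);
#endif
    ++frames_ended;
}

/* Hypothetical workload: each iteration sums 1..work inside its own frame. */
static long process_iterations(int iterations, int work)
{
    long total = 0;
    int it, i;

    for (it = 0; it < iterations; ++it) {
        frame_begin();
        for (i = 1; i <= work; ++i)
            total += i;
        frame_end();
    }
    return total;
}
```

With events, the single start/end pair marks one time range to zoom into; with frames, every iteration becomes a separate unit that the tool can compare against the others.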

Regards, Peter