Hi,
I am currently struggling with the error mentioned above.
One of our programmers integrated performance analysis into our framework so that we can run a hotspot analysis (VTune Amplifier XE Upd. 8), which works quite nicely. But when I change the hotspot analysis to user-defined hardware event counting like this:
$AMPLXECMD -collect-with runsa -knob event-config="$HWEVENTS" -start-paused -follow-child \
-target-duration-type=medium -no-allow-multiple-runs -no-analyze-system \
-data-limit=500 -slow-frames-threshold=40 -fast-frames-threshold=100 \
-r=$OUTPUTDIR/r@@@ -- $APPLICATION
then I receive the error message. Reducing the command did not improve anything. The idea is to profile specific algorithms, which then appear in VTune as tasks. When an algorithm is started, a "before" hook is called that can run customized code, where we put:
taskId = __itt_event_create(typeName.c_str(), typeName.size());
__itt_event_start(state.event);
and if started:
__itt_event_end(state.parent_event);
just before the start, and so on. Between algorithms the profiling is paused and then resumed, which means these calls happen at high frequency. Is this a problem? How could I fix it?
After browsing the web, I did not find any solution. Does anybody have an idea?
Thanks,
Stefan
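The hook pattern described in the question can be sketched in C roughly as follows. This is only an illustration, not the framework's actual code and not a fix for the error: `algorithm_probe`, `before_algorithm`, `after_algorithm`, the `PROF_*` macros, and the `USE_ITTNOTIFY` build switch are hypothetical names. Only `__itt_event_create`, `__itt_event_start`, `__itt_event_end`, `__itt_pause`, and `__itt_resume` come from ittnotify.h. The sketch creates each event handle once instead of on every hook call, and it resumes or pauses collection only when the outermost algorithm starts or ends, so nested algorithms do not add extra pause/resume calls.

```c
#include <assert.h>
#include <string.h>

#ifdef USE_ITTNOTIFY                 /* hypothetical build switch: -DUSE_ITTNOTIFY */
#include "ittnotify.h"
typedef __itt_event prof_event;
#define PROF_EVENT_CREATE(name, len) __itt_event_create((name), (len))
#define PROF_EVENT_START(e)          __itt_event_start(e)
#define PROF_EVENT_END(e)            __itt_event_end(e)
#define PROF_RESUME()                __itt_resume()
#define PROF_PAUSE()                 __itt_pause()
#else                                /* no-op stand-ins so the sketch builds without VTune */
typedef int prof_event;
#define PROF_EVENT_CREATE(name, len) ((void)(name), (void)(len), 0)
#define PROF_EVENT_START(e)          ((void)(e))
#define PROF_EVENT_END(e)            ((void)(e))
#define PROF_RESUME()                ((void)0)
#define PROF_PAUSE()                 ((void)0)
#endif

/* Hypothetical per-algorithm record that caches the event handle. */
typedef struct {
    const char *name;     /* label shown for the event */
    prof_event  event;    /* handle, valid once created != 0 */
    int         created;  /* 0 until the handle has been created */
} algorithm_probe;

int active_depth = 0;     /* number of algorithms currently running */
int completed_runs = 0;   /* bookkeeping only, used for checking the sketch */

/* "Before" hook: create the handle on first use, resume collection
 * when the outermost algorithm starts, then mark the event start. */
void before_algorithm(algorithm_probe *p)
{
    if (!p->created) {
        p->event = PROF_EVENT_CREATE(p->name, (int)strlen(p->name));
        p->created = 1;
    }
    if (active_depth++ == 0)
        PROF_RESUME();
    PROF_EVENT_START(p->event);
}

/* "After" hook: mark the event end and pause collection again
 * once the outermost algorithm has finished. */
void after_algorithm(algorithm_probe *p)
{
    assert(p->created && active_depth > 0);
    PROF_EVENT_END(p->event);
    if (--active_depth == 0)
        PROF_PAUSE();
    completed_runs++;
}
```

A framework would call `before_algorithm(&probe)` and `after_algorithm(&probe)` around each algorithm, with one `algorithm_probe` per algorithm type. Without `-DUSE_ITTNOTIFY` the macros expand to no-ops, so the hooks can stay in the code when VTune is not installed.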
3 Replies
I don't know why you used paused mode; did you resume sampling in your code?
Second, event start/end is only marked in the timeline report, and it works only with user-mode sampling (Hotspots, Concurrency Analysis, LocksAndWaits Analysis), NOT with PMU event-based sampling.
Here is a simple example, matrix1.c:
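[cpp]
#include <stdio.h>
#include <time.h>
#include "ittnotify.h"

#define NUM 512

double a[NUM][NUM], b[NUM][NUM], c[NUM][NUM];
__itt_event event_matrix;

void multiply(void)
{
    unsigned int i, j, k;

    /* mark the start of the region in the timeline */
    __itt_event_start(event_matrix);
    for (i = 0; i < NUM; i++) {
        for (j = 0; j < NUM; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < NUM; k++) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    /* mark the end of the region in the timeline */
    __itt_event_end(event_matrix);
}

int main(void)
{
    clock_t start, stop;

    /* second argument is the length of the event name (17 characters) */
    event_matrix = __itt_event_create("Mark matrix event", 17);

    /* start timing the matrix multiply code */
    start = clock();
    multiply();
    stop = clock();

    /* print elapsed time */
    printf("Elapsed time = %lf seconds\n",
           ((double)(stop - start)) / CLOCKS_PER_SEC);
    return 0;
}
[/cpp]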
gcc -g matrix1.c -I/opt/intel/vtune_amplifier_xe_2011/include /opt/intel/vtune_amplifier_xe_2011/lib64/libittnotify.a -lpthread -ldl -o matrix1
# amplxe-cl -collect hotspots -- ./matrix1
Elapsed time = 0.740000 seconds
Using result path `/home/peter/problem_report/r001hs'
Executing actions 75 % Generating a report
Summary
-------
Elapsed Time: 0.761
CPU Time: 0.750
Executing actions 100 % done
Open the result in amplxe-gui and note the "User Task" mark in the timeline report.
Thanks,
I think this clarifies why it is not working.
One remaining point: using tasks gives me another way to group processing time, right? But if I use frames, I can do the same, and hardware event sampling is covered as well, isn't it? So what is the difference between events and frames?
Thanks,
Stefan
Using __itt_frame is another approach for when you do the same (or similar) work in a loop, so that all performance data are classified per iteration; please see this article.
__itt_event provides APIs to mark "event start/end" in the timeline report around the region where you run critical code. You would usually use "zoom-in/filter on selection" to focus on this time range when reviewing the result.
Regards, Peter
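To make the frame approach concrete, here is a minimal sketch in C that treats each loop iteration as one frame. It assumes the ittnotify.h in use provides the domain-based calls `__itt_domain_create`, `__itt_frame_begin_v3`, and `__itt_frame_end_v3`; the frame interface may differ in other VTune releases, so check the header you have. The `FRAME_*` macros, the `USE_ITTNOTIFY` build switch, the domain name "matrix.frames", and the functions `init_matrices`, `multiply`, and `run_frames` are hypothetical names made up for this illustration.

```c
#include <stddef.h>

#ifdef USE_ITTNOTIFY                 /* hypothetical build switch: -DUSE_ITTNOTIFY */
#include "ittnotify.h"
static __itt_domain *frame_domain;
#define FRAME_INIT(name) (frame_domain = __itt_domain_create(name))
#define FRAME_BEGIN()    __itt_frame_begin_v3(frame_domain, NULL)
#define FRAME_END()      __itt_frame_end_v3(frame_domain, NULL)
#else                                /* no-op stand-ins so the sketch builds without VTune */
#define FRAME_INIT(name) ((void)(name))
#define FRAME_BEGIN()    ((void)0)
#define FRAME_END()      ((void)0)
#endif

#define N 64

static double a[N][N], b[N][N], c[N][N];

/* Fill the inputs with constants so the result is easy to check. */
static void init_matrices(void)
{
    int i, j;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            a[i][j] = 1.0;
            b[i][j] = 2.0;
        }
    }
}

/* Plain matrix multiply: the repeated work that each frame wraps. */
static void multiply(void)
{
    int i, j, k;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
    }
}

/* Run the multiply `iterations` times, one frame per iteration.
 * Returns the number of frames that were opened and closed. */
int run_frames(int iterations)
{
    int it, frames = 0;

    init_matrices();
    FRAME_INIT("matrix.frames");
    for (it = 0; it < iterations; it++) {
        FRAME_BEGIN();
        multiply();
        FRAME_END();
        frames++;
    }
    return frames;
}
```

The contrast with the event example is in where the markers sit: the event brackets one critical region so it can be located in the timeline, while the frame calls bracket every iteration of repeated work so the iterations can be compared with each other.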