My context:
- A self-written DLL containing the functions that should be sampled
- The parent application that loads the DLL is not of interest
Approach: Use the ITT API to specify what to sample
Solution:
- #include <ittnotify.h>
- Use __itt_resume() and __itt_pause() to bracket code sections of interest
- No domain or handle setup; keep it dead simple and expect the function names to show up later in the results
Problem: The guarded function now seems to run endlessly, much slower than when sampling the whole app.
Any idea how this can happen? I would expect sampling only this section to be much faster, and to see only the sampled code section in the viewer afterwards.
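For readers following along, the bracketing described above can be sketched as follows. This is a minimal illustration, not the poster's actual code: `USE_ITT` is a hypothetical build flag, `COLLECT_RESUME`/`COLLECT_PAUSE` are hypothetical wrapper macros, and `sum_of_squares` is a made-up stand-in for the long-running DLL function. Only `__itt_resume()` and `__itt_pause()` come from `ittnotify.h`.

```c
#include <stddef.h>

/* Assumption: USE_ITT is a hypothetical build flag, defined only when
   ittnotify.h and the ittnotify library are available to the build. */
#ifdef USE_ITT
#include <ittnotify.h>
#define COLLECT_RESUME() __itt_resume()
#define COLLECT_PAUSE()  __itt_pause()
#else
/* Without ITT the wrappers compile to nothing. */
#define COLLECT_RESUME() ((void)0)
#define COLLECT_PAUSE()  ((void)0)
#endif

/* Hypothetical stand-in for the long-running function in the DLL. */
double sum_of_squares(const double *x, size_t n)
{
    double acc = 0.0;
    size_t i;

    COLLECT_RESUME();              /* collection starts here */
    for (i = 0; i < n; i++)
        acc += x[i] * x[i];
    COLLECT_PAUSE();               /* collection stops here */

    return acc;
}
```

With the flag defined, the build would also need to link against the ittnotify library shipped with the profiler; without it, the function behaves exactly like the unbracketed version.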
I suppose the __itt call might suppress optimization, particularly if that optimization depends on IPO. In that case, those calls might need to be moved further out in the function nest.
Tim P. wrote:
I suppose the __itt call might suppress optimization, particularly if that optimization depends on IPO. In that case, those calls might need to be moved further out in the function nest.
The __itt* calls are outside any loops etc. The calls guard a function that has a very long run time, which I want to measure & optimize.
BTW: When adding any of the "domain" or "handle" calls, the program crashes. Hence I only use the "resume" and "pause" calls.
Can you provide a source snippet of your ITT API usage? You can strip it down to the ITT calls only. Are you only enabling/disabling collection via the ITT API, or do you also have your app instrumented with the ITT API (e.g. tasks, frames, etc.)?
Also, how many times is the __itt_resume/__itt_pause pair called?
I also assume you're using the "Start Paused" button to start profiling.
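The question about the number of pause/resume pairs can be made concrete with a small sketch. It assumes, without having measured it, that each pause/resume transition has some cost, so bracketing the call site once may behave differently from bracketing inside a function that is called many times. All names here (`USE_ITT`, `collect_resume`, `collect_pause`, `kernel`, `run_bracket_each_call`, `run_bracket_once`, `resume_call_count`) are hypothetical; only `__itt_resume()` and `__itt_pause()` are ITT API calls.

```c
#include <stddef.h>

/* Assumption: USE_ITT is a hypothetical build flag for builds with ITT. */
#ifdef USE_ITT
#include <ittnotify.h>
#endif

/* Counter used only to illustrate how often collection is toggled. */
static int resume_calls = 0;

static void collect_resume(void)
{
    resume_calls++;
#ifdef USE_ITT
    __itt_resume();
#endif
}

static void collect_pause(void)
{
#ifdef USE_ITT
    __itt_pause();
#endif
}

/* Hypothetical stand-in for the function of interest. */
static double kernel(const double *x, size_t n)
{
    double acc = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Variant 1: one pause/resume pair per call of the function. */
double run_bracket_each_call(const double *x, size_t n, int calls)
{
    double total = 0.0;
    int c;
    for (c = 0; c < calls; c++) {
        collect_resume();
        total += kernel(x, n);
        collect_pause();
    }
    return total;
}

/* Variant 2: a single pair around the whole batch of calls. */
double run_bracket_once(const double *x, size_t n, int calls)
{
    double total = 0.0;
    int c;
    collect_resume();
    for (c = 0; c < calls; c++)
        total += kernel(x, n);
    collect_pause();
    return total;
}

int resume_call_count(void)
{
    return resume_calls;
}
```

Both variants compute the same result; they differ only in how many times collection is toggled. The second variant is only an option if the caller of the function can be modified, which may not be the case when the parent application is outside the poster's control.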
This is the overall code structure. I commented out the domain stuff because it crashes the program.

    // #include <ittnotify.h>
    #include "abc.h"

    /*
    __itt_domain* domain_qrdll;
    __itt_string_handle* handle_qrdll;

    void itt_initialize()
    {
        // Create a domain that is visible globally
        __itt_domain* domain_qrdll = __itt_domain_create("qrdll");
        // Create string handles which associates with the "qrdll" task.
        __itt_string_handle* handle_qrdll = __itt_string_handle_create("qrdll");
    }
    */

    void my_dll_function(
        /* long running computation */
        const int n_ind,
        const real_t x[/* n_ind */]
    )
    {
        // __itt_task_begin(domain_qrdll, __itt_null, __itt_null, handle_qrdll);
        // __itt_resume();
        for(j = 0; j < n_ind; j++)
            df= (real_t) 0.0;
        for(i = 0; i < n_obs; i++) {
            ri = - (real_t) dep;
            for(j = 0; j < n_ind; j++)
                df += ind[i * n_ind + j];
        }
        // __itt_pause();
        // __itt_task_end(domain_qrdll);
    }
I am only trying to enable/disable collection; nothing fancy is required. The function can be called a couple of hundred times. And yes, I start the application in the "Start paused" state.
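One detail in the commented-out setup may be worth checking in connection with the reported crash, although this is a guess and not a confirmed diagnosis. Inside `itt_initialize()` the variables `domain_qrdll` and `handle_qrdll` are declared again with their types, which creates locals that shadow the globals, so the globals would stay NULL. If the task calls are then handed a NULL domain, a crash seems plausible, depending on how the ITT task macros treat a NULL domain. The sketch below shows the shadowing effect in isolation. `USE_ITT`, `domain_t`, `handle_t`, `DOMAIN_CREATE`, `HANDLE_CREATE`, and the stub definitions are hypothetical stand-ins so that the example compiles without the ITT headers; they are not the real ITT definitions.

```c
#include <stddef.h>

/* Assumption: USE_ITT is a hypothetical build flag for builds with ITT. */
#ifdef USE_ITT
#include <ittnotify.h>
typedef __itt_domain domain_t;
typedef __itt_string_handle handle_t;
#define DOMAIN_CREATE(name) __itt_domain_create(name)
#define HANDLE_CREATE(name) __itt_string_handle_create(name)
#else
/* Stand-ins only; NOT the real ITT types or functions. */
typedef struct { int unused; } domain_t;
typedef struct { int unused; } handle_t;
static domain_t stub_domain;
static handle_t stub_handle;
static domain_t *stub_domain_create(const char *name) { (void)name; return &stub_domain; }
static handle_t *stub_handle_create(const char *name) { (void)name; return &stub_handle; }
#define DOMAIN_CREATE(name) stub_domain_create(name)
#define HANDLE_CREATE(name) stub_handle_create(name)
#endif

/* Globals intended to be visible to the instrumented functions. */
static domain_t *domain_qrdll = NULL;
static handle_t *handle_qrdll = NULL;

/* Same pattern as the posted setup: the type in front of each name
   declares a new local, so the globals are never assigned. */
void itt_initialize_shadowed(void)
{
    domain_t *domain_qrdll = DOMAIN_CREATE("qrdll");
    handle_t *handle_qrdll = HANDLE_CREATE("qrdll");
    (void)domain_qrdll;
    (void)handle_qrdll;
}

/* Plain assignments store the results in the globals. */
void itt_initialize_fixed(void)
{
    domain_qrdll = DOMAIN_CREATE("qrdll");
    handle_qrdll = HANDLE_CREATE("qrdll");
}

/* Returns nonzero once both globals have been set. */
int itt_handles_ready(void)
{
    return domain_qrdll != NULL && handle_qrdll != NULL;
}
```

The posted snippet also does not show where `itt_initialize()` is called. If it is never called before the first task call, the globals would be NULL for that reason as well.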
