Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

ITT task events from a user application do not seem to be tracked in "Profile System" mode

yidu
Novice

Hi,

I'm trying to optimize my application's performance on a FreeBSD system. I've installed VTune 2024.2.0 on my Ubuntu laptop and I'm using remote SSH to analyze the performance. I successfully performed hotspot analysis using hardware event-based sampling.

I created a sample application with ITT API enabled, deployed it to a remote FreeBSD machine, and added the INTEL_LIBITTNOTIFY64 environment variable in the startup script. When I use the "Launch Application" feature with this startup script, everything works fine, and I can see the user task events in the results. However, when I use the "Profile System" feature to instrument and manually run the same script, I can't see these user events at all.

Is this behavior by design, or is there something wrong with my environment?

1 Solution
yuzhang3_intel
Moderator

Please ignore #4. I hope the ITT support on FreeBSD is helpful to you.

39 Replies
yuzhang3_intel
Moderator

Please post your VTune command line. Thanks.

yidu
Novice

Hi @yuzhang3_intel, I run my script manually over an SSH connection while sampling is in progress. This is my VTune command line:

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots -knob sampling-mode=hw -knob sampling-interval=6 -finalization-mode=full --duration unlimited

yuzhang3_intel
Moderator

Ok. 

Can you tell me which ITT API you used? Event API, right?

yidu
Novice

Something very much like the sample code provided with VTune:

// Note: the file-scope ITT handles below are initialized with function calls,
// which is valid in C++ but not in C.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>

#include <ittnotify.h>

__itt_domain* domain = __itt_domain_create("Example.Domain.Global");
__itt_string_handle* handle_main = __itt_string_handle_create("main");
__itt_string_handle* handle_createthread = __itt_string_handle_create("CreateThread");

// Forward declaration of the thread function.
void* workerthread(void*);
bool g_done = false;

int main(int argc, char* argv[])
{
    // Report whether the collector library variable is visible to this process.
    if (sizeof(void*) > 4) {
        char* lib_name_64 = getenv("INTEL_LIBITTNOTIFY64");
        if (lib_name_64)
            printf("got env INTEL_LIBITTNOTIFY64: %s\n", lib_name_64);
        else
            printf("env INTEL_LIBITTNOTIFY64 not found\n");
    } else {
        char* lib_name_32 = getenv("INTEL_LIBITTNOTIFY32");
        if (lib_name_32)
            printf("got env INTEL_LIBITTNOTIFY32: %s\n", lib_name_32);
        else
            printf("env INTEL_LIBITTNOTIFY32 not found\n");
    }

    // Create a task associated with the "main" routine.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_main);
    printf("Main task begins\n");

    int av = __itt_collection_state();
    printf("collector state %d\n", av);

    // Now we'll create 4 worker threads.
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
        printf("Creating worker thread %d\n", i);
        // Pass the index through intptr_t to avoid an int-to-pointer size mismatch.
        int ret = pthread_create(&threads[i], NULL, workerthread, (void*)(intptr_t)i);
        if (ret != 0)
        {
            printf("Failed to create worker thread %d\n", i);
            exit(1);
        }
        __itt_task_end(domain);
    }

    // Wait a while, then tell the workers to stop.
    sleep(30);
    g_done = true;

    // Mark the end of the main task.
    __itt_task_end(domain);
    printf("Main task ends\n");
    return 0;
}

__itt_string_handle* handle_work = __itt_string_handle_create("work");

void* workerthread(void* data)
{
    // Build a readable name for this thread to use in the log output.
    char threadname[32];
    sprintf(threadname, "Worker Thread %p", data);
    printf("%s begins\n", threadname);

    // Each worker thread does some number of "work" tasks.
    while (!g_done)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
        printf("%s is working\n", threadname);
        usleep(150000);
        __itt_task_end(domain);
    }

    printf("%s ends\n", threadname);
    return NULL;
}

yuzhang3_intel
Moderator

So your concern is that the tasks shown below do not appear in the VTune result when profiling the system, right?

yuzhang3_intel_0-1721285732848.png

 

yidu
Novice

Yes, that's right. In my environment, I only get this result when I use the "Launch Application" option.

yuzhang3_intel
Moderator

Ok, let me check internally and then give you an update.

yuzhang3_intel
Moderator

I have confirmed internally: the ITT API is not supported in VTune's Profile System mode, but it should be supported in launch and attach modes.

yidu
Novice

OK, so this seems to be by design.

My application cannot be launched directly through an SSH command because it depends on GTK and other resources on my system.

Is there any other solution, perhaps an experimental one in an upcoming version, that could help me with this?

Thanks.

yuzhang3_intel
Moderator

Got it.

You can profile a single process on the target device using VTune's remote access mode, even though you can't launch your application directly over SSH. I am not sure whether profiling a single process is useful for your application's performance analysis.

yidu
Novice

I just tried what you suggested: I started my application first, then attached to its process ID to profile that single process. However, I still could not find the task events in the result. My VTune command line is below; please take a look at what might be going wrong.

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots-0 --target-pid 96100

yuzhang3_intel
Moderator

It may be that the __itt_xxxx() functions had already been called by the time VTune attached to the running process, so VTune could not capture the tasks because collection started later. You need to make sure the __itt_xxxx() functions are called after VTune attaches to the process.

yidu
Novice

In my scenario, this code may run outside the collection time:

__itt_domain* domain = __itt_domain_create("Example.Domain.Global");
__itt_string_handle* handle_main = __itt_string_handle_create("main");
__itt_string_handle* handle_createthread = __itt_string_handle_create("CreateThread");

__itt_string_handle* handle_work = __itt_string_handle_create("work");

But this code must have a chance to run within the collection time:

__itt_task_begin(domain, __itt_null, __itt_null, handle_work);
printf("%s is working\n", threadname);
usleep(150000);
__itt_task_end(domain);

Sure, I will figure out how to make sure all of the code runs within the collection time, and I will report back with the results.

Thanks, @yuzhang3_intel

yidu
Novice

I made some changes to my code: it now waits for a key press (getchar) at startup when any extra argument is passed.

The same code behaves differently in launch mode and in attach-by-PID mode: the result contains task events with Launch Application but none in attach mode.

Please help me with this issue.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>

#include <ittnotify.h>

__itt_domain* domain = NULL;
__itt_string_handle* handle_main = NULL;
__itt_string_handle* handle_createthread = NULL;

// Forward declaration of the thread function.
void* workerthread(void*);
bool g_done = false;

// Create the ITT domain and handles at run time, after the optional pause.
void init_itt(void) {
    domain = __itt_domain_create("Example.Domain.Global");
    handle_main = __itt_string_handle_create("main");
    handle_createthread = __itt_string_handle_create("CreateThread");
}

int main(int argc, char* argv[])
{
    // With any extra argument, pause so the profiler can attach first.
    if (argc > 1) {
        printf("Press Enter to continue\n");
        getchar();
    }

    init_itt();

    // Report whether the collector library variable is visible to this process.
    if (sizeof(void*) > 4) {
        char* lib_name_64 = getenv("INTEL_LIBITTNOTIFY64");
        if (lib_name_64)
            printf("got env INTEL_LIBITTNOTIFY64: %s\n", lib_name_64);
        else
            printf("env INTEL_LIBITTNOTIFY64 not found\n");
    } else {
        char* lib_name_32 = getenv("INTEL_LIBITTNOTIFY32");
        if (lib_name_32)
            printf("got env INTEL_LIBITTNOTIFY32: %s\n", lib_name_32);
        else
            printf("env INTEL_LIBITTNOTIFY32 not found\n");
    }

    // Create a task associated with the "main" routine.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_main);
    printf("Main task begins\n");

    int av = __itt_collection_state();
    printf("collector state %d\n", av);

    // Now we'll create 4 worker threads.
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
        printf("Creating worker thread %d\n", i);
        // Pass the index through intptr_t to avoid an int-to-pointer size mismatch.
        int ret = pthread_create(&threads[i], NULL, workerthread, (void*)(intptr_t)i);
        if (ret != 0)
        {
            printf("Failed to create worker thread %d\n", i);
            exit(1);
        }
        __itt_task_end(domain);
    }

    // Wait a while, then tell the workers to stop.
    sleep(10);
    g_done = true;

    // Mark the end of the main task.
    __itt_task_end(domain);
    printf("Main task ends\n");
    return 0;
}

// Note: this handle is still created by a file-scope initializer (valid in
// C++ only), so it runs before main() and before the pause above.
__itt_string_handle* handle_work = __itt_string_handle_create("work");

void* workerthread(void* data)
{
    char threadname[32];
    sprintf(threadname, "Worker Thread %p", data);
    printf("%s begins\n", threadname);

    // Each worker thread does some number of "work" tasks.
    while (!g_done)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
        printf("%s is working\n", threadname);
        usleep(150000);
        __itt_task_end(domain);
    }

    printf("%s ends\n", threadname);
    return NULL;
}

 

yuzhang3_intel
Moderator

I tried your sample code, and VTune can profile the process with ITT information.

yuzhang3_intel_0-1721620619367.png

 

Application console window:

$ ./main 11

Do not press Enter until the VTune profiling command below is running.

 

Profiling console window:

$ vtune -collect hotspots -knob sampling-mode=hw --target-process main

yidu
Novice

Hi @yuzhang3_intel,

It seems that the issue isn't dependent on the application code, correct?

Are there any other ways to debug the problem I'm encountering, such as VTune debug logs?

I really hope the attach mode will work to resolve the instrumentation problem. I've had similar issues that depend on it.

For context, my remote system is FreeBSD. I'm able to perform deep debugging at the system level, but I need some guidance on how ITT works during instrumentation.

Thanks a lot.
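
As a rough mental model of the instrumentation side, offered as an assumption based on how the public ittnotify static library is generally understood to behave and not as Intel's actual code: each __itt_xxxx() call goes through a function pointer, and the first call decides once whether a collector library named by INTEL_LIBITTNOTIFY64 is available. The toy sketch below uses only the C standard library; `model_task_begin`, `model_recorded_tasks`, and `model_reset` are hypothetical names, and no real library is loaded.

```c
/* Toy model of lazy collector binding; NOT the real ittnotify source. */
#include <stdlib.h>

#define MODEL_ENV_VAR "INTEL_LIBITTNOTIFY64"

typedef void (*task_begin_fn)(const char *name);

static int recorded_tasks = 0; /* stands in for data sent to a collector */

static void task_begin_first_call(const char *name);

/* Every call is dispatched through this pointer. */
static task_begin_fn task_begin_ptr = task_begin_first_call;

/* Stand-in for a function resolved from the collector library. */
static void collector_task_begin(const char *name)
{
    (void)name;
    recorded_tasks++;
}

/* Used when no collector was found: the call records nothing. */
static void noop_task_begin(const char *name)
{
    (void)name;
}

/* First call: read the environment once, then rebind the pointer.
   A real implementation would load the named library at this point. */
static void task_begin_first_call(const char *name)
{
    const char *lib = getenv(MODEL_ENV_VAR);
    if (lib != NULL && lib[0] != '\0')
        task_begin_ptr = collector_task_begin;
    else
        task_begin_ptr = noop_task_begin;
    task_begin_ptr(name);
}

void model_task_begin(const char *name) { task_begin_ptr(name); }

int model_recorded_tasks(void) { return recorded_tasks; }

/* Return the model to its initial, unbound state. */
void model_reset(void)
{
    task_begin_ptr = task_begin_first_call;
    recorded_tasks = 0;
}
```

In this model, a process that makes its first call without the variable set stays bound to the no-op path even if the variable is set afterwards, which is one reason to confirm the variable in the environment of the process itself before it starts.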

yuzhang3_intel
Moderator

I am not sure why ITT does not work in attach mode in your environment. The same sample code works fine on my side.

Could you attach your VTune data?

yidu
Novice

Sure, @yuzhang3_intel,

This is the packaged result folder from the VTune collection.

I hope we can find something in it.

Thanks.

yuzhang3_intel
Moderator

The VTune data shows that no ITT information was collected.

Could you clarify how you profile the process on the target?

yidu
Novice

Sure,

  1. Start the ittmain process with the startup script from a remote SSH connection.
  2. Note the process ID of ittmain.
  3. Double-check the value of the INTEL_LIBITTNOTIFY64 environment variable in the program output.
  4. Enter the process ID and start VTune profiling from the Linux side.
  5. Press any key to let ittmain continue.
  6. VTune automatically collects the logs after a few seconds (it should be about 10 seconds).
  7. Check the result.
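
For step 3, printing the variable only shows that it is set; it does not show that the path exists on the FreeBSD target. A small sketch of a stricter check, assuming POSIX access() is available; `check_collector_env` is a hypothetical helper name:

```c
#include <stdlib.h>
#include <unistd.h>

/* Returns 0 if the variable names a readable path,
   -1 if it is unset or empty, and -2 if the path is not readable. */
int check_collector_env(const char *var_name)
{
    const char *path = getenv(var_name);
    if (path == NULL || path[0] == '\0')
        return -1;
    if (access(path, R_OK) != 0)
        return -2;
    return 0;
}
```

Calling check_collector_env("INTEL_LIBITTNOTIFY64") at the top of main() and printing the result would separate "variable missing" from "variable set but pointing at a path that is not there".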

This is the VTune command line:

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots-0 --target-pid 66690

 

 
