Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)
5253 Discussions

ITT task events in a user application do not seem to be tracked in "Profile System" mode

yidu
Beginner
6,703 Views

Hi,

I'm trying to optimize my application's performance on a FreeBSD system. I've installed VTune 2024.2.0 on my Ubuntu laptop and I'm using remote SSH to analyze the performance. I successfully performed hotspot analysis using hardware event-based sampling.

I created a sample application with ITT API enabled, deployed it to a remote FreeBSD machine, and added the INTEL_LIBITTNOTIFY64 environment variable in the startup script. When I use the "Launch Application" feature with this startup script, everything works fine, and I can see the user task events in the results. However, when I use the "Profile System" feature to instrument and manually run the same script, I can't see these user events at all.

Is this behavior by design, or is there something wrong with my environment?
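One way to rule out an environment problem is to check, from inside the process itself, that the collector library named by INTEL_LIBITTNOTIFY64 is readable there. The following is a minimal sketch in C, not part of the original sample: the names check_collector_path, report_collector_env and the collector_check enum are hypothetical, and the check only tests file access, not whether the library would actually load.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical result codes used only by this sketch. */
enum collector_check { COLLECTOR_UNSET, COLLECTOR_UNREADABLE, COLLECTOR_OK };

/* Classify `path`, which is expected to be the value of INTEL_LIBITTNOTIFY64
 * (or NULL when the variable is not set in this process). */
enum collector_check check_collector_path(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return COLLECTOR_UNSET;
    if (access(path, R_OK) != 0)      /* missing file or no read permission */
        return COLLECTOR_UNREADABLE;
    return COLLECTOR_OK;
}

/* Human-readable name for a result code. */
const char *collector_check_name(enum collector_check rc)
{
    switch (rc) {
    case COLLECTOR_UNSET:      return "unset";
    case COLLECTOR_UNREADABLE: return "unreadable";
    case COLLECTOR_OK:         return "ok";
    }
    assert(!"unknown collector_check value");
    return "unknown";
}

/* Print what the current process sees and return the result code. */
enum collector_check report_collector_env(void)
{
    const char *path = getenv("INTEL_LIBITTNOTIFY64");
    enum collector_check rc = check_collector_path(path);
    fprintf(stderr, "INTEL_LIBITTNOTIFY64=%s (%s)\n",
            path ? path : "(null)", collector_check_name(rc));
    return rc;
}
```

Calling report_collector_env() at the top of main() in both the launched and the manually started run would show whether the two processes see the same variable and the same file.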

1 Solution
yuzhang3_intel
Moderator
5,248 Views

Please ignore #4. I hope the ITT support on FreeBSD is helpful to you.

39 Replies
yuzhang3_intel
Moderator
4,132 Views

Please post your VTune command line, thanks

yidu
Beginner
4,116 Views

Hi @yuzhang3_intel, I run my script manually over an SSH connection while sampling is in progress. This is my VTune command line:

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots -knob sampling-mode=hw -knob sampling-interval=6 -finalization-mode=full --duration unlimited

yuzhang3_intel
Moderator
4,111 Views

Ok. 

Can you tell me which ITT API you used? Event API, right?

yidu
Beginner
4,103 Views

Something just like the sample code provided with VTune:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>

#include <ittnotify.h>

// Note: initializing globals with function calls requires compiling this file as C++.
__itt_domain* domain = __itt_domain_create("Example.Domain.Global");
__itt_string_handle* handle_main = __itt_string_handle_create("main");
__itt_string_handle* handle_createthread = __itt_string_handle_create("CreateThread");

// Forward declaration of a thread function.
void* workerthread(void*);
bool g_done = false;

int main(int argc, char* argv[])
{
    // Report whether the ITT collector variable is visible to this process.
    if (sizeof(void*) > 4) {
        char* lib_name_64 = getenv("INTEL_LIBITTNOTIFY64");
        if (lib_name_64)
            printf("got env INTEL_LIBITTNOTIFY64: %s \n", lib_name_64);
        else
            printf("did not find env INTEL_LIBITTNOTIFY64 \n");
    } else {
        char* lib_name_32 = getenv("INTEL_LIBITTNOTIFY32");
        if (lib_name_32)
            printf("got env INTEL_LIBITTNOTIFY32: %s \n", lib_name_32);
        else
            printf("did not find env INTEL_LIBITTNOTIFY32 \n");
    }

    // Create a task associated with the "main" routine.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_main);
    printf("Main task begins\n");

    int av = __itt_collection_state();
    printf("collector state %d\n", av);

    // Now we'll create 4 worker threads
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
        printf("Creating worker thread %d\n", i);
        // Pass the index through intptr_t to avoid an int-to-pointer size mismatch.
        int ret = pthread_create(&threads[i], NULL, workerthread, (void*)(intptr_t)i);
        if (ret != 0)
        {
            printf("Failed to create worker thread %d\n", i);
            exit(1);
        }
        __itt_task_end(domain);
    }

    // Wait a while, then tell the workers to stop and wait for them to finish.
    sleep(30);
    g_done = true;
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    // Mark the end of the main task
    __itt_task_end(domain);
    printf("Main task ends\n");
    return 0;
}

__itt_string_handle* handle_work = __itt_string_handle_create("work");

void* workerthread(void* data)
{
    // Build a label for this thread to use in the log output
    char threadname[32];
    snprintf(threadname, sizeof(threadname), "Worker Thread %p", data);
    printf("%s begins\n", threadname);

    // Each worker thread does some number of "work" tasks
    while (!g_done)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
        printf("%s is working\n", threadname);
        usleep(150000);
        __itt_task_end(domain);
    }

    printf("%s ends\n", threadname);
    return NULL;
}

yuzhang3_intel
Moderator
4,052 Views

So your concern is that the Task data shown below does not appear in the VTune result when profiling the system, right?

yuzhang3_intel_0-1721285732848.png

 

yidu
Beginner
4,036 Views

Yes, it is. In my environment, I only get this result when I use the "Launch Application" option.

yuzhang3_intel
Moderator
4,033 Views

Ok, let me check internally and then give you an update.

yuzhang3_intel
Moderator
3,975 Views

I have confirmed that ITT in VTune is not supported in Profile System mode, but it should be supported in launch and attach modes.

yidu
Beginner
3,969 Views

OK, it seems this is by design.

My application cannot be launched directly through an SSH command because it is based on GTK and depends on other resources on my system.

Is there any other solution, perhaps an experimental one in the next version, that could help me with this?

Thanks.

yuzhang3_intel
Moderator
3,943 Views

Got it.

You can profile a single process on the target device using VTune remote access mode, even though you can't launch your application directly over SSH. I am not sure whether profiling a single process is useful for your application's performance analysis.

yidu
Beginner
3,935 Views

I just tried what you suggested: I ran my application first and then attached to its process ID to profile the single process. I still could not find the task events in the result. Below is my VTune command line. Please check what is going on.

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots-0 --target-pid 96100

yuzhang3_intel
Moderator
3,907 Views

It may be that by the time VTune attaches to the running process, the __itt_xxxx() functions have already been called, so VTune cannot capture the profiled tasks because collection started later. You need to guarantee that the __itt_xxxx() functions are called after VTune attaches to the process.
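For an application that cannot wait on standard input (a GUI program, for example), one way to hold back the first ITT call until the profiler has attached is to wait for a flag file. This is only a sketch under stated assumptions: wait_for_flag_file and the path /tmp/itt_go mentioned below are hypothetical, and nothing here is VTune API.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() in strict ISO C modes */
#include <assert.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* Poll until `flag_path` exists or roughly `timeout_ms` milliseconds pass.
 * Returns true if the file was seen, false on timeout. */
bool wait_for_flag_file(const char *flag_path, long timeout_ms)
{
    assert(flag_path != NULL);

    const long poll_ms = 50;          /* polling interval */
    long waited_ms = 0;

    for (;;) {
        if (access(flag_path, F_OK) == 0)
            return true;              /* operator signalled "go" */
        if (waited_ms >= timeout_ms)
            return false;             /* give up and continue without waiting */

        struct timespec ts = { 0, poll_ms * 1000000L };
        nanosleep(&ts, NULL);
        waited_ms += poll_ms;
    }
}
```

The application would call wait_for_flag_file("/tmp/itt_go", 60000) before its first __itt_domain_create() call; after starting the VTune attach, creating that file on the target (for example with touch) lets the program continue.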

yidu
Beginner
3,871 Views

In my scenario, this code probably runs outside the collection time:

__itt_domain* domain = __itt_domain_create("Example.Domain.Global");
__itt_string_handle* handle_main = __itt_string_handle_create("main");
__itt_string_handle* handle_createthread = __itt_string_handle_create("CreateThread");

__itt_string_handle* handle_work = __itt_string_handle_create("work");

But this code must have a chance to run within the collection time:

__itt_task_begin(domain, __itt_null, __itt_null, handle_work);
printf("%s is working\n", threadname);
usleep(150000);
__itt_task_end(domain);

Sure, I will figure out how to make sure all of the code runs within the collection time, and I will report back with the results.

Thanks @yuzhang3_intel

 

yidu
Beginner
3,751 Views

I made some changes to my code: it now pauses on getchar() if any additional argument is passed.

The same code behaves differently when launched and when attached by process ID. The result has task events in Launch Application mode but none in attach mode.

Please help me with this issue.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>

#include <ittnotify.h>

__itt_domain* domain = NULL;
__itt_string_handle* handle_main = NULL;
__itt_string_handle* handle_createthread = NULL;

// Forward declaration of a thread function.
void* workerthread(void*);
bool g_done = false;

// Create the ITT domain and string handles on demand instead of at load time.
void init_itt(void) {
    domain = __itt_domain_create("Example.Domain.Global");
    handle_main = __itt_string_handle_create("main");
    handle_createthread = __itt_string_handle_create("CreateThread");
}

int main(int argc, char* argv[])
{
    // With any extra argument, pause so the profiler can attach first.
    if (argc > 1) {
        printf("Press Enter to continue\n");
        getchar();
    }

    init_itt();

    // Report whether the ITT collector variable is visible to this process.
    if (sizeof(void*) > 4) {
        char* lib_name_64 = getenv("INTEL_LIBITTNOTIFY64");
        if (lib_name_64)
            printf("got env INTEL_LIBITTNOTIFY64: %s \n", lib_name_64);
        else
            printf("did not find env INTEL_LIBITTNOTIFY64 \n");
    } else {
        char* lib_name_32 = getenv("INTEL_LIBITTNOTIFY32");
        if (lib_name_32)
            printf("got env INTEL_LIBITTNOTIFY32: %s \n", lib_name_32);
        else
            printf("did not find env INTEL_LIBITTNOTIFY32 \n");
    }

    // Create a task associated with the "main" routine.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_main);
    printf("Main task begins\n");

    int av = __itt_collection_state();
    printf("collector state %d\n", av);

    // Now we'll create 4 worker threads
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
        printf("Creating worker thread %d\n", i);
        // Pass the index through intptr_t to avoid an int-to-pointer size mismatch.
        int ret = pthread_create(&threads[i], NULL, workerthread, (void*)(intptr_t)i);
        if (ret != 0)
        {
            printf("Failed to create worker thread %d\n", i);
            exit(1);
        }
        __itt_task_end(domain);
    }

    // Wait a while, then tell the workers to stop and wait for them to finish.
    sleep(10);
    g_done = true;
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    // Mark the end of the main task
    __itt_task_end(domain);
    printf("Main task ends\n");
    return 0;
}

// Note: this handle is still created at load time, which requires compiling as C++.
__itt_string_handle* handle_work = __itt_string_handle_create("work");

void* workerthread(void* data)
{
    // Build a label for this thread to use in the log output
    char threadname[32];
    snprintf(threadname, sizeof(threadname), "Worker Thread %p", data);
    printf("%s begins\n", threadname);

    // Each worker thread does some number of "work" tasks
    while (!g_done)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
        printf("%s is working\n", threadname);
        usleep(150000);
        __itt_task_end(domain);
    }

    printf("%s ends\n", threadname);
    return NULL;
}

 

yuzhang3_intel
Moderator
3,740 Views

I tried your sample code, and profiling the process does capture ITT information.

yuzhang3_intel_0-1721620619367.png

 

Application console window:

$ ./main 11

Do not press Enter until the VTune profiling command below is running.

Profiling console window:

$ vtune -collect hotspots -knob sampling-mode=hw --target-process main

 

 

yidu
Beginner
3,722 Views

Hi @yuzhang3_intel,

It seems that the issue isn't dependent on the application code, correct?

Are there any other ways to debug the problem I'm encountering, such as VTune debug logs?

I really hope the attach mode will work to resolve the instrumentation problem. I've had similar issues that depend on it.

For context, my remote system is FreeBSD. I'm able to perform deep debugging at the system level, but I need some guidance on how ITT works during instrumentation.

Thanks a lot.
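As background for debugging, statically linked instrumentation stubs of this kind typically bind lazily: each API entry point starts as an init stub that, on first use, looks up the collector library from an environment variable and either rebinds to the real function or becomes a no-op. The sketch below is a simplified model of that pattern only. It is not the ittnotify source, all names in it (demo_task_begin, DEMO_COLLECTOR_LIB and so on) are hypothetical, and it says nothing about how VTune's attach mode rebinds anything on FreeBSD.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes setenv()/unsetenv() for experiments */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this models the pattern only. */
typedef void (*task_begin_fn)(const char *name);

static int g_recorded = 0;        /* events seen by the stand-in collector */

/* Stand-in for a function that would be resolved from the collector library. */
static void collector_task_begin(const char *name)
{
    (void)name;
    g_recorded++;
}

static void task_begin_init(const char *name);

/* The entry point starts out pointing at the init stub. */
static task_begin_fn task_begin_ptr = task_begin_init;

/* First call: decide once whether a collector is available, then rebind. */
static void task_begin_init(const char *name)
{
    const char *lib = getenv("DEMO_COLLECTOR_LIB");   /* hypothetical variable */
    task_begin_ptr = (lib != NULL && lib[0] != '\0') ? collector_task_begin : NULL;
    if (task_begin_ptr != NULL)
        task_begin_ptr(name);     /* forward the call that triggered init */
}

/* What the application calls; a NULL pointer means "no collector, do nothing". */
void demo_task_begin(const char *name)
{
    assert(name != NULL);
    if (task_begin_ptr != NULL)
        task_begin_ptr(name);
}

int demo_recorded(void) { return g_recorded; }

/* Restore the unbound state (used only to exercise both paths in one process). */
void demo_reset(void)
{
    task_begin_ptr = task_begin_init;
    g_recorded = 0;
}
```

In this model the decision is made on the first call, so setting the variable afterwards changes nothing. If the real library behaves similarly, the environment seen by the process at its first ITT call is worth checking.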

yuzhang3_intel
Moderator
3,707 Views

I am not sure why ITT does not work in attach mode in your environment. The same sample code works fine on my side.

 

Could you attach your VTune data?

yidu
Beginner
3,701 Views

Sure, @yuzhang3_intel,

This is the packaged result folder from the VTune collection.

I hope we can find something in it.

Thanks.

yuzhang3_intel
Moderator
3,692 Views

According to the VTune data, ITT information was not collected.

 

Could you clarify how you profile the process on the target?

yidu
Beginner
3,688 Views

Sure,

  1. Start the ittmain process with the startup script over a remote SSH connection.
  2. Note the process ID of ittmain.
  3. Double-check the value of the INTEL_LIBITTNOTIFY64 environment variable in the output.
  4. Enter the process ID and start the VTune profile from the Linux side.
  5. Press any key to let ittmain continue.
  6. VTune collects the logs automatically after a few seconds (should be 10 seconds).
  7. Check the result.

This is the VTune command line:

/home/yidu/intel/oneapi/vtune/2024.2/bin64/vtune -target-system ssh:root@10.151.122.141 -target-install-dir=/home/yidu/tmp/ -target-tmp-dir=/home/yidu/tmp -collect hotspots-0 --target-pid 66690

 

 
