VTune 2022 instrumentation doesn't provide results despite being linked with pthreads and ld libs

alexanderv
Beginner

Hi,

I am trying to profile my application with VTune 2022.

I use the attach option in both the GUI and the CLI.

I have instrumented my code and built it successfully.

My code consists of several libraries, both dynamic and static, and I link the necessary VTune libraries into every library of interest.

Every analysis report I get is missing the instrumentation data.

What could be the reasons for this?

Thanks

VaradJ_Intel
Moderator

Hi,


Thank you for posting in Intel Communities.


Please can you answer the following questions:


1. What is the exact version of VTune you are using?


2. Please can you give your OS details? 


3. Are you using the ITT APIs ?


4. Please can you share with which analysis you are trying to profile your application?


5. Please can you share the exact steps you followed and a sample reproducer (a sample application similar to the one you are trying to profile)?


6. What kind of application are you trying to profile?

 

Thank you.


alexanderv
Beginner

Hi,

1. What is the exact version of VTune you are using?

Intel VTune Profiler 2022.2.0

Product build: 623516

2. Please can you give your OS details? 

Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic

3. Are you using the ITT APIs ?

Yes

4. Please can you share with which analysis you are trying to profile your application?

Hotspots

5. Please can you share the exact steps you followed and a sample reproducer(sample application which is similar to the application you are trying to profile)?

I don't have a sample reproducer.

I have created a global domain:

#include "ittnotify.h"

__itt_domain* domain = __itt_domain_create("XSW-test_server.Domain.Global");

__itt_string_handle* handle_loop_wrapping = __itt_string_handle_create("Loop_wrapping");
__itt_string_handle* handle_api_wrapping = __itt_string_handle_create("api_wraping");

Then I put the following lines into my functions:

__itt_frame_begin_v3(domain, NULL);
/* Outer task: wraps the whole loop. */
__itt_task_begin(domain, __itt_null, __itt_null, handle_loop_wrapping);
start_t = clock();
for (i = 0; i < n_cycles; i++) {
    xsw_nexthop.ip_addr.addr.ipv4 = htonl(ntohl(xsw_nexthop.ip_addr.addr.ipv4) + ((i != 0) ? inc_val : 0));
    if (prof_start_idx == i) {
        is_internal_profil = 1;
    }
    /* Inner task: wraps a single API call. */
    __itt_task_begin(domain, __itt_null, __itt_null, handle_api_wrapping);
    status = xsw_nexthop_create(&xsw_nexthop, &nexthop_oid);
    __itt_task_end(domain);
    if (status != XSW_STATUS_SUCCESS) break;
    nex_oid_array[i] = nexthop_oid;
}
__itt_task_end(domain);
__itt_frame_end_v3(domain, NULL);
__itt_detach();

Of course, I have added the necessary libraries to the link step and checked that they are present in the executable.

 

 

6. What kind of application you are trying to profile?

It is a Thrift-based application that implements the HAL layer of a NOS such as SONiC and configures a proprietary ASIC.

alexanderv
Beginner

I don't see my reply and yours as well!!!!

alexanderv
Beginner

@VaradJ_Intel this is my code example. It does not reproduce the problem, but it at least shows how I use the VTune APIs.

__itt_domain* domain = __itt_domain_create("XSW-test_server.Domain.Global");

__itt_string_handle* handle_loop_wrapping = __itt_string_handle_create("Loop_wrapping");
__itt_string_handle* handle_api_wrapping = __itt_string_handle_create("api_wraping");

void xsw_thrift_nexthop_create_loop(xsw_thrift_nexthop_loop_response_t &_return,
                                    const xsw_thrift_nexthop_loop_t &thrift_nexthop_loop) {
    xsw_nexthop_t xsw_nexthop = {};
    xsw_object_id_t nexthop_oid = 0;
    xsw_status_t status = XSW_STATUS_SUCCESS;
    clock_t start_t, exec_t;
    uint32_t i;
    uint32_t n_cycles;
    uint32_t inc_val;
    uint32_t prof_start_idx;
    xsw_object_id_t nex_oid_array[XSW_IPV4_HOST_TABLE_SIZE - 1];

    assign(xsw_nexthop, thrift_nexthop_loop);
    n_cycles = thrift_nexthop_loop.num_of_cycles;
    inc_val = thrift_nexthop_loop.inc_value;
    prof_start_idx = thrift_nexthop_loop.prof_start_idx;
#ifdef XPROFILE
    is_internal_profil = 0;
#endif //XPROFILE
    XSW_LOG_DEBUG("xsw_thrift_nexthop_create_loop");
    xsal_error("<--XSW----LOOP-  xsw_nexthop_create_loop %d inc_val= %d,ip_addr=0x%x",n_cycles,inc_val,xsw_nexthop.ip_addr.addr.ipv4);

    __itt_resume();
    __itt_frame_begin_v3(domain, NULL);
    __itt_task_begin(domain, __itt_null, __itt_null, handle_loop_wrapping);
    start_t = clock();
    for (i = 0; i < n_cycles; i++) {
        xsw_nexthop.ip_addr.addr.ipv4 = htonl(ntohl(xsw_nexthop.ip_addr.addr.ipv4) + ((i != 0) ? inc_val : 0));
        if (prof_start_idx == i) {
#ifdef XPROFILE
            is_internal_profil = 1;
#endif //XPROFILE
        }
        __itt_task_begin(domain, __itt_null, __itt_null, handle_api_wrapping);
        status = xsw_nexthop_create(&xsw_nexthop, &nexthop_oid);
        __itt_task_end(domain);
        if (status != XSW_STATUS_SUCCESS) break;
        nex_oid_array[i] = nexthop_oid;
    }
    __itt_task_end(domain);
    __itt_frame_end_v3(domain, NULL);
    __itt_detach();
    exec_t = clock() - start_t;

VaradJ_Intel
Moderator

Hi,


Thank you for sharing the details.


Could you please share all the report files with us as a zip file?


Meanwhile, you can read about the ITT APIs at the link below:


https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis.html#:~:text=The%20ITT%20API%20is%20a%20set%20of%20pure,can%20pause%20the%20analysis%20to%20reduce%20the%20overhead.


Thank You.


alexanderv
Beginner

Hi @VaradJ_Intel 

I have created a sample application that reproduces the problem in attach mode but works in launch mode.

The zip file is attached.

 

VaradJ_Intel
Moderator

Hi,


We tried reproducing the issue with the files you sent, but we are able to see the instrumentation data in the report.


Could you please zip your result folder and share it with us for further investigation?


Note: The result folder has a name like 'rXXXhs', where 'XXX' is a number.


Thank you.


alexanderv
Beginner

@VaradJ_Intel 

Hi

21: launch mode, contains the __itt_ profiling data

22: attach mode, does not contain it

VaradJ_Intel
Moderator

Hi @alexanderv ,

 

Thank You for providing the result folders.

 

We were able to reproduce the issue. We are working on it and we will get back to you with an update soon.

 

Thank you.

 

alexanderv
Beginner

Hi @VaradJ_Intel 

Is there any progress on this issue?

I really need this resolved.

 

Thanks,

Alexander

Jeffrey_R_Intel1
Employee

Hello Alexander,

When you attach VTune Profiler (CLI or GUI) to a running process, the ITT API data is collected only if the following environment variables are set in the application's environment before it starts:

INTEL_LIBITTNOTIFY32=<install-dir>/lib32/runtime/libittnotify_collector.so

INTEL_LIBITTNOTIFY64=<install-dir>/lib64/runtime/libittnotify_collector.so

 

For more information, see https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis/basic-usage-and-configuration/attaching-itt-apis-to-a-launched-application.html
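
One way to rule this out is to have the application itself report what it sees at startup. The following is only a minimal sketch and is not part of VTune or the ITT API: collector_path_ok and itt_collector_visible are hypothetical helper names, and the check only confirms that the variable is set in this process and points at a path the process can read.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` is non-empty and readable by this
 * process, 0 otherwise. */
int collector_path_ok(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return 0;
    return access(path, R_OK) == 0;
}

/* Hypothetical helper: looks the variable up in this process's own
 * environment (not the shell VTune was started from) and reports what it
 * finds on stderr. */
int itt_collector_visible(const char *var_name)
{
    const char *path = getenv(var_name);
    int ok = collector_path_ok(path);

    fprintf(stderr, "%s=%s (%s)\n", var_name,
            path ? path : "<unset>",
            ok ? "readable" : "missing or unreadable");
    return ok;
}
```

Calling itt_collector_visible("INTEL_LIBITTNOTIFY64") at the top of main() prints the value the application was actually started with, which may differ from what another shell or another user has exported.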



alexanderv
Beginner

Hi @Jeffrey_R_Intel1 @VaradJ_Intel 

These paths were definitely set!

I have double-checked everything related to the installation procedure.

alexanderv_0-1655450501502.png

That is definitely NOT the reason.

I am still hoping for a real solution.

Alexander

Jeffrey_R_Intel1
Employee

Hello Alexander,

Just to clarify, those environment variables were set in the context where the application (not the vtune command line) was started, correct?

Thank you.


alexanderv
Beginner

Hi @Jeffrey_R_Intel1 @VaradJ_Intel 

The variables were set on the remote machine where the application in question was running, and the set command was invoked in the application folder.

VTune is installed on that machine too, but for the analysis I used the VTune GUI on my Windows laptop, attaching remotely to this machine.

Thanks,

Alexander

Jeffrey_R_Intel1
Employee

Hello Alexander,


Thank you for providing the reproducer. Please try these changes in your test case.


In the example you provided, the child thread calls __itt_task_begin() only once, so there is no guarantee that the call has not already happened by the time VTune attaches to the process. If I modify your example to move the ITT calls inside the loop and insert another sleep(1) so that they are not called in rapid succession, I can see the user tasks.


Likewise, hworld.cpp calls __itt_task_begin() immediately when it is created, without a matching __itt_task_end(). If I add the __itt_task_end() call before entering the loop, I expect to see those user tasks. However, to see the main thread's user tasks, I had to either change the ITT domain creation in hworld.cpp so that the child thread creates a domain with a unique name, or move the "dm = thdm" statement to after the thread is created.


Please let me know if you can reproduce these experiments to see the user tasks and whether these changes may also be relevant to your real workload.


Thank you.


alexanderv
Beginner

Hi @Jeffrey_R_Intel1 @VaradJ_Intel 

I tried what you suggested in several different ways, but in the end it didn't work.

I made sure that VTune was attached before any __itt functions were invoked.

Please see the latest variant of my code:

 

void* threadFunction(void* args)
{
    thdm = __itt_domain_create("DomainThreadFunction");
    handle_th = __itt_string_handle_create("ThFuncLoop");
    uint32_t idx=40;
    while (idx--) sleep(1);
    /* A single task wraps the whole loop in this variant. */
    __itt_task_begin(thdm, __itt_null, __itt_null, handle_th);
    idx=10;
    while(idx--)
    {
        sleep(1);
        printf("I am threadFunction. %d \n", idx);
    }
    __itt_task_end(thdm);
    // __itt_detach();
    return NULL; /* a void* function must return a value */
}

int main()
{
    pthread_t id;
    int ret;
    uint32_t i = 50;
    while(i--) sleep(1);
    printf("\nMain:Create string handles");

    dm = __itt_domain_create("DomainMain");
    handle_main = __itt_string_handle_create("XSW-main");
    handle_loop = __itt_string_handle_create("XSW-loop");

    __itt_task_begin(dm, __itt_null, __itt_null, handle_main);
    sleep(1);
    __itt_task_end(dm);
    /*creating thread*/
    ret=pthread_create(&id,NULL,&threadFunction,NULL);
    if(ret==0){
        printf("\nbThread created successfully.\n");
    }
    else{
        printf("\nThread not created.\n");
        return 0; /*return from main*/
    }
    uint32_t idx=100;
    // dm = thdm;
    while(idx--)
    // while(1)
    {
        __itt_task_begin(dm, __itt_null, __itt_null, handle_loop);
        sleep(1);
        printf("I am main function %d.\n",idx);
        __itt_task_end(dm);
    }

    // __itt_task_begin(dm, __itt_null, __itt_null, handle_main);
    printf("Hello World!\n");
    sleep(1);
    // __itt_task_end(dm);
    return 0;
}
 
 
Jeffrey_R_Intel1
Employee

Hello Alexander,


What is your "uname -a" output?


With the following code:


void* threadFunction(void* args)
{
    thdm = __itt_domain_create("DomainThreadFunction");
    handle_th = __itt_string_handle_create("ThFuncLoop");
    uint32_t idx=10;
    while (idx--) sleep(1);
    idx=10;
    while(idx--)
    {
        __itt_task_begin(thdm, __itt_null, __itt_null, handle_th);
        sleep(1);
        printf("I am threadFunction. %d \n", idx);
        __itt_task_end(thdm);
    }
    // __itt_detach();
    return (void *) &threadFunction;
}

int main()
{
    pthread_t id;
    int ret;
    uint32_t i = 10;
    while(i--) sleep(1);
    printf("\nMain:Create string handles");
    dm = __itt_domain_create("DomainMain");
    handle_main = __itt_string_handle_create("XSW-main");
    handle_loop = __itt_string_handle_create("XSW-loop");

    __itt_task_begin(dm, __itt_null, __itt_null, handle_main);
    sleep(1);
    __itt_task_end(dm);

    /*creating thread*/
    ret=pthread_create(&id,NULL,&threadFunction,NULL);
    if(ret==0){
        printf("\nbThread created successfully.\n");
    }
    else{
        printf("\nThread not created.\n");
        return 0; /*return from main*/
    }

    uint32_t idx=20;
    while(idx--)
    {
        __itt_task_begin(dm, __itt_null, __itt_null, handle_loop);
        sleep(1);
        printf("I am main function %d.\n",idx);
        __itt_task_end(dm);
    }

    __itt_task_begin(dm, __itt_null, __itt_null, handle_main);
    printf("Hello World!\n");
    sleep(1);
    __itt_task_end(dm);

    return 0;
}


If I run the application without first running the following commands:

export INTEL_LIBITTNOTIFY32=/opt/intel/oneapi/vtune/2022.2.0/lib32/runtime/libittnotify_collector.so

export INTEL_LIBITTNOTIFY64=/opt/intel/oneapi/vtune/2022.2.0/lib64/runtime/libittnotify_collector.so

I do not get ITT user tasks in the VTune output.

If I set the environment variables before starting the application, VTune reports:


Top Tasks
Task Type   Task Time  Task Count  Average Task Time
----------  ---------  ----------  -----------------
XSW-loop    20.002s    20          1.000s
ThFuncLoop  10.001s    10          1.000s
XSW-main    2.000s     2           1.000s



alexanderv
Beginner

Hi @Jeffrey_R_Intel1 

You are right!

It was a user issue.

I had set these variables under a different user.

Thank you!
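
For anyone who hits the same thing: on Linux, the environment a process was started with can be read from /proc/<pid>/environ, so the target can be checked before attaching. Below is a minimal sketch, assuming Linux procfs and permission to read the target's file (normally the same user or root). env_block_lookup and read_proc_environ are hypothetical helper names, not VTune or ITT functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find NAME in a block of NUL-separated "NAME=value"
 * entries (the layout of /proc/<pid>/environ). Returns a pointer to the
 * NUL-terminated value inside `block`, or NULL if NAME is not present. */
const char *env_block_lookup(const char *block, size_t len, const char *name)
{
    size_t name_len = strlen(name);
    size_t pos = 0;

    while (pos < len) {
        const char *entry = block + pos;
        const char *end = memchr(entry, '\0', len - pos);
        if (end == NULL)
            break; /* truncated final entry: not safe to return */
        size_t entry_len = (size_t)(end - entry);
        if (entry_len > name_len && entry[name_len] == '=' &&
            memcmp(entry, name, name_len) == 0)
            return entry + name_len + 1;
        pos += entry_len + 1;
    }
    return NULL;
}

/* Hypothetical helper: read /proc/<pid>/environ into buf. Returns the number
 * of bytes read, or 0 on failure. Anything beyond `cap` bytes is dropped. */
size_t read_proc_environ(long pid, char *buf, size_t cap)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/environ", pid);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t n = fread(buf, 1, cap, f);
    fclose(f);
    return n;
}
```

Read the block with read_proc_environ(pid, buf, sizeof buf) and pass the result to env_block_lookup(buf, n, "INTEL_LIBITTNOTIFY64"). A NULL result means the application was started without the variable, whatever the current shell shows.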

BTW, my company would be interested in hiring somebody to help us analyze our system, or at least in an effective course on system and application analysis tailored to our specific needs. Does Intel offer any such support?

Thanks,

Alexander 

Jeffrey_R_Intel1
Employee

Hello Alexander,


Please first use our online resources, then contact your Intel sales representative for further hands-on assistance.


Intel(R) VTune(TM) Profiler Users Guide: https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top.html


Intel(R) VTune(TM) Profiler Cookbook (contains performance analysis methodologies, configuration, and tuning "recipes"): https://www.intel.com/content/www/us/en/develop/documentation/vtune-cookbook/top.html


Easier Profiling using Intel(R) VTune(TM) Profiler Server (56m:39s): https://techdecoded.intel.io/essentials/easier-profiling-of-cloud-cluster-and-embedded-systems-using-a-profiling-server/


VTune video tutorial (topics: Introduction, CPU Architecture, Analysis Types, and Useful Tips; each links to a separate wiki page with a video): https://hpc-wiki.info/hpc/Intel_VTune_Tutorial


Thank you.

