Application Silently Stops

Caesar
Beginner
1,055 Views

Hello there!

I am using VTune + libittnotify to identify and profile some task regions in a few programs. However, I am seeing weird behavior in some of them. When I enable "Analyze user tasks" in any analysis, the execution of the program stops "somewhere in the middle", yet the analysis session continues and finishes as if nothing had gone wrong (no warning or error message appears anywhere). When I disable "Analyze user tasks", the program executes until completion and everything works just fine.

How can I debug the execution of the target program to see what is happening? The 'program' only crashes when 'Analyze user tasks' is enabled in VTune.

9 Replies
Peter_W_Intel
Employee

This is an interesting report!

1. Can you see the user-task marks in the report after data collection?

2. The program stopped somewhere in the middle when "Analyze user tasks" was enabled; did you see the expected result? Could the execution path have affected the calls to __itt_task_begin() / __itt_task_end()? Have you seen any warning or error reported during or after data collection?

3. Can you please run a simple test case to verify whether the problem can be reproduced? See the example in this article.

4. Which version of VTune(TM) Amplifier XE do you use?

5. Is this an application-specific issue? Could you provide a test case so that I can investigate on my side? Providing a binary is OK; you can send it to me in a private message. If sending a binary is impossible, please set:

a. export AMPLXE_DEBUG=1

b. export AMPLXE_LOG_LEVEL=TRACE

c. export AMPLXE_LOG_DIR=<your-log-dir>

Then run VTune again, once with and once without "Analyze user tasks".

Please send me the two VTune results and the two logs. Thank you.

Caesar
Beginner

Hello Peter.

I will send you the binaries in PVT. Here are some additional details:

- When the collection stops there is no information about any task, not even the domain. Only summary information appears.
- VTune version is: Update 3 (build 403110) Copyright (C) 2009-2015 Intel Corporation. All rights reserved.
- I am executing on Linux 14.10

The mechanics of what I am doing are a little unusual. I am working on a mini OMP runtime library. When I compile the programs (using clang-3.5) I link them with the Intel OMP RTL; however, when I need to test my library I load it dynamically using LD_PRELOAD.

I have identified a piece of code that is somehow involved in the problem, but at least to me it does not make sense. There is a function like the one below that I use to do barrier synchronization. When I wrap the code of this function with "tasks", the problem happens. Interestingly enough, the problem only happens when the calls to the ITT API are inserted in exactly these positions:

kmp_int32 __kmpc_omp_taskwait(ident* loc, kmp_int32 gtid) {
    __itt_task_begin(__itt_domain_name, __itt_null, __itt_null, __itt_Task_Name);

    ATOMIC_AND(..);
    ATOMIC_OR(..);
    while (...);
    ATOMIC_AND(...);
    while (...);

    __itt_task_end(__itt_domain_name);
    return 0;
}

 

Peter_W_Intel
Employee

I will check with the developers about using ITT APIs in __kmpc_omp_taskwait(), but is it possible for you to use the ITT APIs outside of omp taskwait? In my thoughts, task wait should return in short time, and __itt_task_begin() will call VTune library and it may spend more time. ITT APIs are not good to be used in run-time.

Also I hope to get your binary.

Caesar
Beginner

I am not sure if I understood you, but:

- I am using __itt_task_begin() all over the project, and the analysis crashes only when I add it to omp_taskwait. Also, if I add only task_begin to omp_taskwait, the program still crashes.

- Unfortunately, I don't have access to the source of the main application, only to the sources of the library, so I can't wrap the call to omp_taskwait with task_begin/end.

"In my thoughts, task wait should return in short time, and __itt_task_begin() will call VTune library and it may spend more time. ITT APIs are not good to be used in run-time."

What do you mean here? Are the calls to __itt_* asynchronous?

Peter_W_Intel
Employee

If you use __itt_task_begin() all over the project, it should be asynchronous, otherwise you may create different task domains. 

It seems the omp taskwait function is called from several different places? Make sure that __itt_task_begin() is not re-entered in your code.
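
To illustrate the re-entrancy point, here is a minimal sketch, assuming C11 `_Thread_local` is available, of a per-thread guard that lets only the outermost call on a thread emit a begin/end pair. `trace_begin()` and `trace_end()` are hypothetical stand-ins for __itt_task_begin() / __itt_task_end() so the snippet builds without ittnotify.h; `guarded_begin()`, `guarded_end()` and the two accessor functions are illustrative names, not part of the ITT API.

```c
#include <assert.h>

/* Hypothetical stand-ins for __itt_task_begin()/__itt_task_end().
   They only count calls so the sketch builds without ittnotify.h. */
static _Thread_local int begin_calls = 0;
static _Thread_local int end_calls = 0;
static void trace_begin(void) { begin_calls++; }
static void trace_end(void)   { end_calls++; }

/* Nesting depth of the instrumented region on the current thread. */
static _Thread_local int depth = 0;

/* Emit a begin only for the outermost entry on this thread. */
void guarded_begin(void) {
    if (depth++ == 0) {
        trace_begin();
    }
}

/* Emit an end only when the outermost entry leaves; ignore stray ends. */
void guarded_end(void) {
    if (depth == 0) {
        return;
    }
    if (--depth == 0) {
        trace_end();
    }
}

/* Accessors for the current thread's counters (illustration only). */
int emitted_begins(void) { return begin_calls; }
int emitted_ends(void)   { return end_calls; }
```

With such a guard, a nested or recursive entry on the same thread does not produce a second begin, and an unmatched end is dropped instead of being forwarded. Whether this changes the behavior reported in this thread is not something the sketch can show.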

Peter_W_Intel
Employee

I have received your binaries, but the binary crashed even without using the VTune APIs; I sent you the details in a private message.

If the binary cannot run on my side, can you please do point 5 from my post of 06/09/2015 10:30? The logs and VTune results would also help in investigating the issue. Thank you.

Caesar
Beginner

Answered in PVT.

Peter_W_Intel
Employee

Unfortunately, the problem still persists in the 2016 version.

# amplxe-cl -version
Intel(R) VTune(TM) Amplifier XE 2016 (build 424694) Command Line Tool

# amplxe-cl -c advanced-hotspots -knob enable-user-tasks=true ./test.sh  ;  test.sh has contents: LD_PRELOAD=./library.so ./jacobi_taskdep

./test.sh: line 1: 31182 Segmentation fault      (core dumped) LD_PRELOAD=./library.so ./jacobi_taskdep

But running the command line directly, without the script, works.

# LD_PRELOAD=./library.so amplxe-cl -c advanced-hotspots -knob enable-user-tasks=true ./jacobi_taskdep ; worked well

Robert_L_Intel1
Employee

Hi Peter, Caesar - I thought it might be helpful to point out that the ITTNotify APIs are designed to be used in a global/static context. If your OMP function is serial, as Peter suggests, then you should be OK in that regard. However, there is no guarantee of ITTNotify order. In other words, the ITTNotify tasks are not counted or deterministic: they are either on or off.

So, when you use a threading/parallel API such as OMP, where multiple threads may be calling the same function with ITTNotify inside, you can get multiple "ons" before you get an "off", or vice versa: one "on", then multiple "offs". This lack of determinism can cause hangs and other strange behavior, such as unexpected/unintended "data windows."
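
The on/off scenario above can be sketched as follows. This is a generic pattern under stated assumptions, not something the ITT API provides or requires: a process-wide count of threads inside the instrumented function, protected by a small spinlock, so that only the first thread in emits an "on" and only the last thread out emits an "off". `notify_on()` and `notify_off()` are hypothetical stand-ins for the real notification calls, and `region_enter()` / `region_leave()` are illustrative names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the "on"/"off" notifications.
   They only count events so the sketch builds without ittnotify.h. */
static int on_events = 0;
static int off_events = 0;
static void notify_on(void)  { on_events++; }
static void notify_off(void) { off_events++; }

/* Spinlock protecting the counter and the notifications themselves. */
static atomic_flag region_lock = ATOMIC_FLAG_INIT;
/* Number of threads currently inside the instrumented region. */
static int active_threads = 0;

static void lock_region(void) {
    while (atomic_flag_test_and_set(&region_lock)) {
        /* busy-wait until the flag is cleared */
    }
}

static void unlock_region(void) {
    atomic_flag_clear(&region_lock);
}

/* First thread in emits the single "on". */
void region_enter(void) {
    lock_region();
    if (active_threads++ == 0) {
        notify_on();
    }
    unlock_region();
}

/* Last thread out emits the single "off"; stray leaves are ignored. */
void region_leave(void) {
    lock_region();
    if (active_threads > 0 && --active_threads == 0) {
        notify_off();
    }
    unlock_region();
}
```

The notifications are issued while the lock is held so that an "off" from one thread cannot overtake an "on" from another. The cost is serialization on a hot path, which may not be acceptable inside a runtime function such as taskwait.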

 

 

 
