Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

Advisor annotations in a shared object written in NASM

Jones__Brian
New Contributor I
10,290 Views

I ran Advisor 2021.1.1 with a shared object written in NASM, assembled with the NASM assembler and linked with ld. The shared object is called by a wrapper written in C.

I defined the annotation macros:

%define ANNOTATE_SITE_BEGIN(L_Functions)
%define ANNOTATE_ITERATION_TASK(for_taskA)
%define ANNOTATE_SITE_END(L_Functions)

Then I placed the "site begin" string just above the main loop label and the "iteration task" string just below it:

ANNOTATE_SITE_BEGIN(L_Functions)
label_401:

ANNOTATE_ITERATION_TASK(for_taskA)
movsd xmm0,[r13+r12]
. . .

Finally, I placed the "site end" string at the end of the main loop:

add r12,8
cmp r12,[rbp-16]
jl label_401

ANNOTATE_SITE_END(L_Functions)

The program assembles and links correctly, but Advisor doesn't recognize the annotations. For example, the suitability report gives this message:

advisor: Error: Error 0x40000024 (No data) -- No data is collected. Possible reasons:
- Workload is too small. No samples are collected.
- The application environment is not specified correctly.
- The executable file has been stripped so cannot be profiled with algorithm analysis types.
See the Troubleshooting help topic for more details.

I do get some information in the Advisor reports, but without annotations the information is not very useful.

Can Advisor work with shared objects written in assembly language (NASM in my case)?

 

19 Replies
Ruslan_M_Intel
Employee
10,275 Views

As you probably noticed, the workload size is too small. How long does the app take to execute without profiling? Advisor gathers samples while the workload executes (the default sampling interval is 10 ms) and then analyzes the collected data. If your app runs for less than a few hundred milliseconds, you won't get a relevant profile.

PS: On the other hand, you could set the sampling interval to 1 ms and try again.
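A rough way to see why a short run yields no usable profile is to estimate how many samples fit into the run time. The sketch below assumes exactly one sample per interval and ignores everything else Advisor does; `expected_samples` is a hypothetical helper, not part of any Intel tool.

```c
#include <assert.h>

/* Hypothetical helper: rough upper bound on the number of samples a
   sampling profiler can take, assuming exactly one sample per interval.
   Precondition: interval_ms must be positive. */
long expected_samples(long runtime_ms, long interval_ms)
{
    assert(interval_ms > 0);
    if (runtime_ms < 0)
        return 0;               /* a negative run time makes no sense */
    return runtime_ms / interval_ms;
}
```

Under this assumption, a 200 ms run gives about 20 samples at the default 10 ms interval and about 200 samples at 1 ms, while a 5 ms run gives none at 10 ms.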

Jones__Brian
New Contributor I
10,260 Views

Thanks for your reply.

I ran the reports again, specifying "--interval=1" on the command line.

I ran two separate programs, each with a different workload. Each of these programs has a version written in C and compiled with the Intel C Compiler. The C versions include the same annotations and the same workloads. The C versions recognize the annotations, whereas the NASM versions do not.

For example, the Dependencies report from today's NASM test says:

Dependencies:
advisor: Warning: No site annotations were executed, so no Dependencies issues (problems and messages) can be found.
advisor: Loading result... 25 % done

The MAP report from today's NASM test says:

MAP:
advisor: Warning: No SITE annotations were encountered, so no stride/alignment data can be reported.

Both of the programs produced the same messages.

Either Advisor will not work with assembly language programs, or my annotations are not included correctly in the source. In my original question above, I show how I placed the annotations. I can send the entire source file if that would help.

But my first question is, will Advisor work with shared objects written in assembly language? If the answer is yes then it must be my placement of the annotation strings in the assembly source.

Ruslan_M_Intel
Employee
10,244 Views

That's quite interesting. If it's acceptable to you, I'd like to take a look at your source code. I'll probably need to do some experiments too.

Jones__Brian
New Contributor I
10,233 Views

Attached is the .asm file with annotations for Intel Advisor. It assembles to a shared object, and I call it from a C wrapper.  Note that the file has a .asm.txt extension.  Intel's upload system will not accept an .asm file.  You will need to remove the .txt extension. 

To work with this file, you will also need seven .asm include files, six C object files and one .exe file for the C wrapper. The additional files total only 65K, but Intel's upload limit is seven files. If you would like to have them, please let me know how I can send them to you in view of the upload limit.

Thank you very much for looking at this code.

Jones__Brian
New Contributor I
10,231 Views

The code did not attach to my previous message, so it is attached to this message. 

Jones__Brian
New Contributor I
10,195 Views

I hope this question hasn't been overlooked. Is there anything else I can supply to help in answering this?

Thanks.

 

Ruslan_M_Intel
Employee
10,191 Views

No, it hasn't, but I haven't managed to find enough time to check this. I'm sorry. I'll try to speed it up.

Ruslan_M_Intel
Employee
10,173 Views

I'm afraid you can't use the annotation macros from the ASM code you shared. The thing is, annotation macros are not supposed to be simply "called". They are mapped to specific calls during the C preprocessing step, and only then are those calls executed while profiling is running. I think you still have two options to achieve your goal. The first (preferable) is to move the annotations from the ASM code to the C wrapper. The second is to call all the needed routines from the ASM code explicitly (ninja style). Fortunately, the "advisor-annotate.h" file contains all the info you need if you choose the second option.
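For the first option, a minimal sketch of annotations placed in the C wrapper might look like the following. It assumes the wrapper can drive the loop itself and call the assembly routine once per chunk, which this thread does not confirm. `asm_kernel_chunk`, `run_annotated`, and the `USE_ADVISOR` switch are hypothetical names, and the kernel is a plain C stand-in for the NASM routine. Without `-DUSE_ADVISOR` the macros expand to nothing, so the sketch builds without the SDK.

```c
#include <stddef.h>

/* Assumption: build with -DUSE_ADVISOR and the Advisor SDK include path
   to get the real macros; otherwise they expand to nothing. */
#ifdef USE_ADVISOR
#include "advisor-annotate.h"
#else
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_ITERATION_TASK(name)
#define ANNOTATE_SITE_END(...)
#endif

/* Hypothetical stand-in for one pass of the NASM routine over a chunk.
   In the real program this would be an external symbol from the .so. */
double asm_kernel_chunk(const double *data, size_t count)
{
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += data[i];
    return sum;
}

/* The annotated site lives in C, so the C preprocessor expands the
   macros; each loop iteration hands one chunk to the kernel. */
double run_annotated(const double *data, size_t count, size_t chunk)
{
    double total = 0.0;
    if (chunk == 0)
        return total;           /* avoid an endless loop */

    ANNOTATE_SITE_BEGIN(L_Functions);
    for (size_t start = 0; start < count; start += chunk) {
        ANNOTATE_ITERATION_TASK(for_taskA);
        size_t n = (count - start < chunk) ? count - start : chunk;
        total += asm_kernel_chunk(data + start, n);
    }
    ANNOTATE_SITE_END(L_Functions);
    return total;
}
```

If the loop has to stay inside the assembly routine, this layout can only mark the call as a whole, not the individual iterations.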

Jones__Brian
New Contributor I
10,167 Views

Thanks for your reply. I will try moving the annotations to the C wrapper. It seems to me that, to be useful in the .asm code, I would have to pass pointers to functions in advisor-annotate.h from the C wrapper to the .asm code. I have to look at advisor-annotate.h first to understand how to do that.

 

Jones__Brian
New Contributor I
10,153 Views

I will explain what I did and what happened, to see if you have any ideas.

I created a C program to encapsulate the ANNOTATE_SITE_BEGIN, ANNOTATE_ITERATION_TASK, and ANNOTATE_SITE_END macros (see Complex_Calc_LinkIn.c, attached). I compile that to an object file with the Intel oneAPI C compiler.

Next I link the object file into my NASM program (using ld for linking). In the NASM program, I put "extern site_begin, iter_task, site_end" at the top of the program. Finally, I insert calls in the NASM program at the appropriate points. For example:

call [rel site_begin wrt ..got] ; Intel Advisor
label_401:
call [rel iter_task wrt ..got] ; Intel Advisor
movupd xmm0,[r13+r12]
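The attached Complex_Calc_LinkIn.c is not reproduced in this thread, so the following is only a guess at its shape: three C functions with the names used in the `extern` line, each wrapping one annotation macro. The `USE_ADVISOR` switch and the `stub_*` counters are hypothetical additions that let the shims be exercised without the SDK. Whether Advisor attributes a site usefully when the macros expand inside such shims is not established here.

```c
/* Assumption: build with -DUSE_ADVISOR and the Advisor SDK include path
   to get the real macros. */
#ifdef USE_ADVISOR
#include "advisor-annotate.h"
#else
/* Fallback stubs (illustrative only): count invocations instead of
   notifying Advisor, so the shims can be tested without the SDK. */
int stub_site_begin_calls = 0;
int stub_iter_task_calls = 0;
int stub_site_end_calls = 0;
#define ANNOTATE_SITE_BEGIN(name)     (stub_site_begin_calls++)
#define ANNOTATE_ITERATION_TASK(name) (stub_iter_task_calls++)
#define ANNOTATE_SITE_END(...)        (stub_site_end_calls++)
#endif

/* Plain C entry points that NASM code can declare with "extern" and
   call; the macro expansion happens here, in C. */
void site_begin(void)
{
    ANNOTATE_SITE_BEGIN(L_Functions);
}

void iter_task(void)
{
    ANNOTATE_ITERATION_TASK(for_taskA);
}

void site_end(void)
{
    ANNOTATE_SITE_END(L_Functions);
}
```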

The program runs to completion with no errors. However, when running with Advisor, I get:

No data is collected. Possible reasons:
- Workload is too small. No samples are collected.
- The application environment is not specified correctly.
- The executable file has been stripped so cannot be profiled with algorithm analysis types.

from the command string:

advixe-cl --collect=suitability --project-dir=/usr/test/Complex_Calc/013121 --interval=1 ./Call_Create_Threads_in_C-Complex_Calc.exe

And there is no file for suitability in the /usr/test folder.

I know the workload is not too small because I successfully ran the same workload with a similar program in C, and I set interval=1.

It would be great to use Advisor from an assembly language (NASM) program. I hope with this information you have some ideas.

Attached is Complex_Calc_LinkIn.c, Complex_Calc_YZ.asm.txt and Intel Advisor 013121.txt. The .asm file has a .txt extension as well because Intel will not allow me to upload a .asm file. This is run on Ubuntu 18.04.

Thanks for any ideas to try.

Ruslan_M_Intel
Employee
10,114 Views

I suggest calling the ITT API initialization function as well. In a C program you do that implicitly, but it doesn't work for ASM.

AthiraM_Intel
Moderator
10,066 Views

Hi Brian,


Could you please give us an update? Is your issue resolved?


Thanks.


Jones__Brian
New Contributor I
9,995 Views

Thanks for your reply. I'll explain what I did with the ITT API.

According to the Advisor User Guide (p. 300), "The ittnotify header file contains definitions of ITT API routines and important macros that provide the correct logic of API invocation from an application."

I read the entire ittnotify header file and downloaded the github project at https://github.com/intel/ittapi.

As before, I linked in a C program but this time I added:

#include "/opt/intel/oneapi/advisor/2021.1.1/sdk/include/ittnotify.h"

I also added an include for libittnotify.h, but got a compiler message (from the Intel C Compiler) that libittnotify has been deprecated and should not be linked.

In your last reply you suggested calling ITT API initialization function. It was not clear from ittnotify.h what I would call for initialization. I added the ittnotify.h file but once again the output was the same as before:

*****
advisor: Error: Error 0x40000024 (No data) -- No data is collected. Possible reasons:
- Workload is too small. No samples are collected.
- The application environment is not specified correctly.
- The executable file has been stripped so cannot be profiled with algorithm analysis types.
See the Troubleshooting help topic for more details.
Also consider checking the collection log for additional information.
*****

There are also some initialization items in ittnotify_static.c and ittnotify_static.h from the Github project, but again it's not clear to me which ones I would call to initialize the ITT API.

So my question is, what macros or functions do I call to initialize the ITT API? Is it a macro from the ittnotify.h file, or a function in one of the files in the Github source?

I attached my latest C program that I link into the nasm program. If I know which macro or function then I can put it into my C file and call it the same way as the site-begin and site-end macros as described in my previous post.

Thanks for your continued interest in this question.

 

AthiraM_Intel
Moderator
9,952 Views

Hi,


We are trying to reproduce the issue, will get back to you soon.


Thanks.


AthiraM_Intel
Moderator
9,863 Views

Hi,

Sorry for the delay.

We tried one ITT sample and were able to run it without any issue. We are attaching the sample file and also sharing the steps we followed.

Step 1:

icpc -g nqueens_serial.cpp -I/opt/intel/oneapi/advisor/latest/sdk/include /opt/intel/oneapi/advisor/latest/sdk/lib64/libittnotify.a -lpthread -o <output_file_name>

Step 2:

advixe-cl --collect=survey --project-dir=./myAdvisorProj --start-paused -- ./<output_file_name>

Step 2 takes about 3 minutes to complete.

Please find attached the sample code and screenshots of the Advisor result.

Could you please try the same and let us know the result?

Thanks.

Jones__Brian
New Contributor I
9,818 Views

Thank you for your reply. We are making progress.

My original question on 011821 (above) was about using Advisor annotations in a shared object written in assembly language. On 021021 Ruslan_M suggested "calling ITT API initialization function as well." In my reply on 021721 I asked "what macros or functions do I call to initialize the ITT API?" They would have to be inserted into my NASM program at appropriate points.

You replied on 030421 with an example written in C++. My question is about a shared object written in assembly (NASM) called by a C wrapper. In your C++ code you started your Advisor run paused, then called "__itt_resume();" so I thought that was the answer to my question "what macros or functions do I call to initialize the ITT API?" I could start Advisor paused, then call "resume" (from ittnotify.h) in the NASM program.

To use the annotation macros and ITT API calls in my NASM program, I put them in a C program that I link into my NASM program like this:

sudo nasm -f elf64 -g -F dwarf ListComp_01.asm

sudo ld -shared ListComp_01.o /opt/P01_SH/_Library/Create_Threads_in_C-ListComp_01.o /opt/P01_SH/_Library/Timer_for_NASM.o /opt/P01_SH/_Library/POSIX_Shared_Memory.o /opt/P01_SH/_Library/PThread_Mutex.o /opt/P01_SH/_Library/ListComp_01_LinkIn_ITT.o -ldl -lrt -lpthread -o ListComp_01.so

Then I can call the relevant macros / functions at the appropriate point in my NASM program.

I have attached the latest version of my C program (ListComp_01_LinkIn_ITT.c) and my NASM program. I added two new functions: "void itt_resume()" and "void itt_pause()." The resume function is called from NASM this way: call [rel itt_resume wrt ..got]. Under the label "call_starts" in the NASM program, I call itt_resume (assuming Advisor has been started in paused mode, as you show in your C++ example) and the macro ANNOTATE_SITE_BEGIN:

%include "/opt/P01_SH/_Include_Utilities/Registers_Push_NoAVX.asm"
call [rel itt_resume wrt ..got] ; Intel Advisor
call [rel site_begin wrt ..got] ; Intel Advisor
%include "/opt/P01_SH/_Include_Utilities/Registers_Pop_NoAVX.asm"

label_401:

%include "/opt/P01_SH/_Include_Utilities/Registers_Push_NoAVX.asm"
call [rel iter_task wrt ..got] ; Intel Advisor
%include "/opt/P01_SH/_Include_Utilities/Registers_Pop_NoAVX.asm"
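The two new C functions described above could be sketched as follows. The attached ListComp_01_LinkIn_ITT.c is not shown in the thread, so this is an assumption about its shape: `itt_resume` and `itt_pause` simply forward to `__itt_resume()` and `__itt_pause()` from ittnotify.h. The `USE_ITT` switch, the `ITT_RESUME`/`ITT_PAUSE` macros, and the `stub_collecting` flag are hypothetical and exist only so the sketch builds without the SDK.

```c
/* Assumption: build with -DUSE_ITT, the Advisor SDK include path, and
   link against libittnotify.a to get the real ITT calls. */
#ifdef USE_ITT
#include "ittnotify.h"
#define ITT_RESUME() __itt_resume()
#define ITT_PAUSE()  __itt_pause()
#else
/* Fallback stubs (illustrative only): a flag that starts at 0 to model
   a collection launched with --start-paused. */
int stub_collecting = 0;
#define ITT_RESUME() ((void)(stub_collecting = 1))
#define ITT_PAUSE()  ((void)(stub_collecting = 0))
#endif

/* Entry points the NASM code can declare with "extern" and call. */
void itt_resume(void)
{
    ITT_RESUME();   /* ask the collector to start gathering data */
}

void itt_pause(void)
{
    ITT_PAUSE();    /* ask the collector to stop gathering data */
}
```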

Following your C++ example, I compile the C link-in program this way:

icx -O0 -fpic -gdwarf-4 -shared -I/opt/intel/oneapi/advisor/latest/sdk/include /opt/intel/oneapi/advisor/latest/sdk/lib64/libittnotify.a -lpthread -o ListComp_01_LinkIn_ITT.o ListComp_01_LinkIn_ITT.c

When I run the program (without Advisor) I get this error:

dlopen error: /opt/P01_SH/_Library/ListComp_01_LinkIn_ITT.o: undefined symbol: __itt_resume_ptr__3_0

At this point it looks like I'm getting closer to knowing what to call in NASM to invoke the ITT_API and the Advisor annotation macros.

My questions now are: (1) am I correct that itt_resume after starting Advisor paused is the right way to invoke the ITT_API within my NASM program and (2) if so, how do I resolve the error I show above?

Thank you very much for your continued assistance with this question.

 

Vladimir_T_Intel
Moderator
8,576 Views

Hi,

Sorry for abandoning this topic. It was de-prioritized because using annotations from inside asm programs is an unsupported model.

I understand it may no longer be relevant, but I see a couple of unanswered questions in your last message. From my point of view, calling itt_resume after starting Advisor paused is a correct way to do the analysis. As for the undefined symbol __itt_resume_ptr__3_0, it looks like a problem with static linking of the ITT library into the project. I'm not sure what exactly went wrong; that would require analyzing the Makefile and reproducing the problem. But as I mentioned, the development team has many higher-priority items at the moment.

AthiraM_Intel
Moderator
9,774 Views

Hi,


We are looking into this internally and will get back to you soon.


Thanks.


Vladimir_T_Intel
Moderator
8,443 Views

Hi,

This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.
