[Vtune Noob] First steps with Vtune Amplifier

Hi,

We have a small cluster with the Intel Cluster Toolkit. In our research lab we develop a Fluid-Structure Interaction program, which actually consists of three programs:
- a fluid solver written in Fortran;
- a solid solver written in C++;
- a coupling program written in C++.
The 3 programs communicate with MPI.

Our goal is to optimize these programs, particularly the Fortran program. I searched and found VTune Amplifier, and I have installed it for evaluation. I have some questions:

- Which compilation options should the program be compiled with? "-g -O2"? Are there compilation options that are known to cause trouble in VTune?

- To start our program we use mpirun, for example:
mpirun -np 1 "coupling program" : -np 1 "solid solver" : -np 5 "fluid solver"
What is the best way to start VTune with this type of mpirun line?

- So far I have tested VTune only on our "fluid solver". I created a script containing
mpirun -np 5 "fluid solver"
and ran VTune on it. I got results, but there seems to be a problem with one of the subroutines: the time is attributed to a comment line, or to an empty line (see figure in attachment). I can't read assembly code, so I don't know whether this is a bug in the Fortran source view only or in the assembly view too. What can I do?

Thx a lot,
Best regards,
Guillaume
8 Replies
Employee

Sometimes the compiler option "-O2" can cause performance data to be attributed to a comment line; please try disabling the optimization options.

I usually use the command below to collect performance data:
amplxe-cl -collect hotspots -r r0002hs -- mpiexec.hydra -bootstrap fork -np 4 ./pi.gcc

Please see my article for more detailed steps.

It seems that you ran a system-wide data collection (launching the application manually with mpirun first). That should be OK!
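For the three-program mpirun line from the question, one option is to put amplxe-cl in front of only the program you want to profile, inside its own segment of the MPMD command. The sketch below is an assumption on my side and has not been verified on this cluster: ./coupling, ./solid, ./fluid and the result directory r_fluid_hs are placeholder names, and the script only prints the command so you can inspect it before launching.

```shell
#!/bin/sh
# Sketch: build an MPMD mpirun line in which only the fluid solver ranks
# run under VTune. All program names and the result directory are placeholders.
build_mpmd_cmd() {
    result_dir=$1
    # Each ':'-separated segment describes one program of the MPMD job.
    printf '%s\n' "mpirun -np 1 ./coupling : -np 1 ./solid : -np 5 amplxe-cl -collect hotspots -r $result_dir -- ./fluid"
}

# Print the command for inspection; run it with: eval "$(build_mpmd_cmd r_fluid_hs)"
build_mpmd_cmd r_fluid_hs
```

Profiling only the fluid solver keeps the collection overhead away from the coupling program and the solid solver, which matters when the Fortran code is the main optimization target.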

Regards, Peter

Hi Peter,

Thx for your answer. To solve the problem with the comment lines shown in the figure, I had to generate my source file with all the includes expanded:
gcc -E -P -DINTEL -DU77 -DGNU -Isrc/include src/test.F > test.F_withinclude

Then I copied test.F_withinclude over my original test.F. Now VTune starts and shows very interesting results, which make sense :D
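If several source files are affected, the same preprocessing step could be generated for each of them. This is only a sketch: the src and expanded directory names are hypothetical, and the script prints the gcc commands instead of running them, so the originals are never overwritten by accident.

```shell
#!/bin/sh
# Sketch: print one preprocessing command per .F file in a source directory.
# The directory layout (src/, src/include, expanded/) is an assumed example.
expand_cmds() {
    src_dir=$1
    out_dir=$2
    for f in "$src_dir"/*.F; do
        # Skip the literal pattern when no .F file matches.
        [ -e "$f" ] || continue
        echo "gcc -E -P -DINTEL -DU77 -DGNU -I$src_dir/include $f > $out_dir/$(basename "$f")"
    done
}

# Prints nothing if src/ does not exist or contains no .F files.
expand_cmds src expanded
```

Writing the expanded files to a separate directory keeps the original sources intact; the expanded copies are only needed for the build that VTune analyzes.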

Regards!
Guillaume
Employee

So the executed code (where the CPU time was spent) is located in an include file. Sounds great!
Beginner

Peter

My VTune traces for a VS C++ application have stopped displaying stack information and the call tree.

It prints:

[No call stack information]

Please help!

Employee

@ Stephen T

What analysis type did you use? Hotspots analysis collects stack info automatically if the application was built with debug info.

If you use advanced-hotspots, please add the option "-knob collection-detail=stack-sampling", which collects call stack information.
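As a rough illustration, a full command line with that knob might look like the one printed below. Test.exe is the module name mentioned in this thread; the result directory r_ah_stacks is a made-up placeholder, and the script only prints the command rather than starting a collection.

```shell
#!/bin/sh
# Sketch: assemble an advanced-hotspots command line with stack sampling enabled.
# The result directory name is a placeholder chosen for this example.
build_stack_cmd() {
    app=$1
    result_dir=$2
    printf '%s\n' "amplxe-cl -collect advanced-hotspots -knob collection-detail=stack-sampling -r $result_dir -- $app"
}

# Print the command for inspection before running it by hand.
build_stack_cmd Test.exe r_ah_stacks
```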

BTW, VTune(TM) Amplifier XE 2015 Update 1 is available now.

Employee

@ Stephen T

I saw you had another thread, 534715, with the hotspots result r021hs attached.

In that report, there was more idle time (1.353s) than serial time (0.651s), and the top hot functions were in ntdll.dll, MSVCR100.dll, and kernel32.dll. The hot function "std::getline<...>" in your module Test.exe took only 6.002ms, with [No call stack information]. I suspect that your function was called from ntdll.dll, which has no symbol info, and that this is why no stack info is displayed.

To verify this, you can build a test case that spends more CPU time, and add a caller in your own code that calls your hot function(s), so that the caller is not in ntdll.dll. Hope it helps.

Employee

Hi Stephen!

Please make sure you're building with "/Zi" (generate debug info) and that the .pdb files are located next to your executables/libraries.

Black Belt

>>> But it seems that there is a problem for one of the subroutine: the displayed time is written for a comment line ?!? or for a line with nothing (see figure in attachment)...I can't read assembly code.>>>

It seems that <Block 19> is a loop that copies values from the R15 register to different GP registers. I cannot see a call instruction, and I cannot see a do-loop construct in the source code in the left pane. Simply put, the left pane does not correspond to the right pane.
