Community
cancel
Showing results for 
Search instead for 
Did you mean: 
Divino_C_
New Contributor I
28 Views

Source Code Inspection, Strange CPU Time Reporting

Jump to solution

I'm using VTune Update 15 to profile the benchmark 401.Bzip2 from SPEC CPU 2006 benchmark suite, however I'm seeing a strange result when inspecting the source code to check which instructions are more time consuming. I really don't know how the result I'm seeing can be correct.

The problem is:

1) I use Advanced Hotspot Analysis to profile the code;

2) I go to the "Top-down Tree" tab and expand the main function; Which I see that is running for ~86s and is calling to other functions: spec_compress and spec_uncompress; So I double-click on the main function to check its source code;

3) Inspecting the source code of the main function I note that VTune claims that the line "spec_fd[1].limit=compressed_size*MB;" is executing for ~86s... the total execution time of the benchmark; I also note that in the lines corresponding to the invocations of the functions spec_compress/uncompress the Cpu time is shown as zero.

I've attached some screenshots. I would greatly appreciate any help on this!

 

0 Kudos
1 Solution
David_A_Intel1
Employee
28 Views

Hi Divino:

Did you build with optimizations turned on?  Did you view the assembly code for the function in question (i.e., while in the source view)?  Optimization changes code layout (inlining, etc.) and I suggest viewing the assembly with the collected data (see Assembly button in upper left of source view).

View solution in original post

8 Replies
David_A_Intel1
Employee
29 Views

Hi Divino:

Did you build with optimizations turned on?  Did you view the assembly code for the function in question (i.e., while in the source view)?  Optimization changes code layout (inlining, etc.) and I suggest viewing the assembly with the collected data (see Assembly button in upper left of source view).

View solution in original post

Divino_C_
New Contributor I
28 Views

Hello Mr. Anderson, how are you?

Yes, I profiled an optimized code.. It seems that the optimization was exactly the source of the problem. Thanks for reminding me of this issue.

So, if I want VTune to profile the code, my options are: 1) to profile the non-optimized code; 2) analyse the results using the assembly view?

 

TimP
Black Belt
28 Views

Did you consider setting -inline-debug-info or options like -no-ip or -fno-inline-functions?

Divino_C_
New Contributor I
28 Views

Actually I'm using GCC. What does the flag "-no-ip" disable?

Caesar
Beginner
28 Views

Hi,

I'm sorry to come back to this question again, but today I saw a some strange results again.

This time I'm using intel compiler (version 13.1.3) with these flags: "-O3 -debug inline-debug-info -no-ip -fno-inline-functions" and VTune is reporting that a call instruction in the source is being responsible for ~140s of the program execution (tot. execution time is ~350s)... this seems a little odd. Can you guys tell me what I'm doing wrong?

I've attached a screenshot of the result window.

TimP
Black Belt
28 Views

gcc doesn't share the debug-info options of icc, but -fno-inline-functions (icc linux) was taken directly from gcc.  -Qip- is the ICL windows equivalent to disable in-lining.

Caesar
Beginner
28 Views

Thanks for clarifying that. I've changed my experiments to use ICC.

Bernard
Black Belt
28 Views

Usually when building a benchmark do it without the optimization.The simplest example is looped N times call to library sin() function with the argument being constant value.In such a case optimizing compliler can evaluate at compile time the library call and calculate the value ahead of runtime thus eliminating the need to even execute that loop N times.

Reply