Dharma
Beginner
45 Views

Help with understanding the VTune results.

Hello,
I ran a lightweight hotspot analysis on my code and got the result attached as a CSV file. Can you please give me some pointers on what I can do now to improve the speed of the program? A major portion of the time is spent in zgemm3m for matrix multiplications and in matrix inversion using zgesv (or getrf and getri). I am not able to understand the timing information obtained.

My computer has dual quad-core (E5240) processors at 2.493 GHz.
3 Replies
Peter_W_Intel
Employee

Thanks for your lightweight-hotspots results. Usually you can identify performance issues from the CPI (cycles per instruction) value of the top hot functions; the smaller the better. However, functions that use SSE3/SSE4/AVX instructions will have a large CPI value, which is reasonable (single instruction, multiple data).

So you may investigate the source lines that cause a high CPI value (few instructions retired, many CPU cycles spent). MKL functions are already well tuned for performance; you only need to make sure you are using them in the right usage mode.
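As a rough illustration of the CPI metric described above, here is a minimal Python sketch. It assumes you have exported per-function counts of unhalted CPU cycles and retired instructions from the report; the function names (`func_a`, `func_b`) and all counts below are invented placeholders, not values from the attached CSV.

```python
def cpi(unhalted_cycles: int, instructions_retired: int) -> float:
    """Cycles per instruction: CPU cycles spent divided by instructions retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return unhalted_cycles / instructions_retired


def rank_by_cpi(samples: dict) -> list:
    """Return (function, CPI) pairs sorted from highest CPI to lowest."""
    ranked = [(name, cpi(cycles, instrs)) for name, (cycles, instrs) in samples.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts per function: (unhalted cycles, instructions retired).
samples = {
    "func_a": (8_000_000_000, 10_000_000_000),  # many instructions per cycle spent
    "func_b": (3_000_000_000, 1_000_000_000),   # few instructions, many cycles
}

for name, value in rank_by_cpi(samples):
    print(f"{name}: CPI = {value:.2f}")
```

A function near the top of this ranking is only a candidate for investigation; as noted above, heavily vectorized code can show a high CPI and still be efficient.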

You may use Concurrency Analysis to learn about the parallelism of your program, the work balance across threads, core utilization, etc.

You may use LocksAndWaits Analysis to find wait time, which may cause stalls between threads.

Regards, Peter
Peter_W_Intel
Employee

"..I am not able to understand the timing information obtained." - The time shown in the report is calculated using this formula:
"CPU Unhalted Cycles" Event / CPU Frequency

The overhead of profiling was not taken into account, I guess.
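The formula above can be sketched in a few lines of Python. This is only an illustration: the 2.493 GHz frequency comes from the original question, the function name `cpu_time_seconds` is hypothetical, and the cycle count is a made-up example rather than a value from the report.

```python
# CPU frequency quoted in the original question (2.493 GHz), in Hz.
CPU_FREQUENCY_HZ = 2.493e9


def cpu_time_seconds(unhalted_cycles: float, frequency_hz: float = CPU_FREQUENCY_HZ) -> float:
    """Convert a "CPU Unhalted Cycles" event count into seconds of CPU time."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return unhalted_cycles / frequency_hz


# Made-up example: this many unhalted cycles at 2.493 GHz is 2 seconds of CPU time.
example_cycles = 4_986_000_000
print(f"{cpu_time_seconds(example_cycles):.3f} s")
```

If the cycle counts are aggregated across threads or cores, the result is CPU time and may exceed the elapsed wall-clock time of the run.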

Dharma
Beginner

Thanks Peter,
I think I need to take another look at the algorithm I am using. As you suggested, I will run the other analyses and see if there are any issues that may be bottlenecks.

Thanks,
Reddy