
vectorization intensity

Jianbin_F_
Beginner
786 Views

Hi, I am measuring the vectorization intensity (VI = VPU_ELEMENTS_ACTIVE / VPU_INSTRUCTIONS_EXECUTED) of a kernel similar to matrix addition. On a single core of a Xeon Phi 5110P, with double-precision data elements, I get a VI of around 8.7. But a VI larger than 8 should be impossible, since a 512-bit vector register holds only 8 doubles, right? Does anybody have an explanation?

Jianbin

3 Replies
Sumedh_N_Intel
Employee

VPU_ELEMENTS_ACTIVE simply counts the active elements of the vector operations that execute. It does not differentiate between operations based on the data type of their operands. I believe that even though your code predominantly uses double-precision data elements, some vector computations could still be operating on operands that are not double precision, and a 512-bit vector holds up to 16 elements of a 32-bit type. This could result in a vectorization intensity greater than 8.
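
To make the arithmetic concrete, here is a minimal sketch of how a small share of 16-element instructions can pull the ratio above 8. The instruction mix (90 double-precision and 10 32-bit vector instructions) is purely an assumption for illustration, not a measurement, and the helper names are hypothetical.

```python
# Illustrative only: the counts below are made up, not measured counter values.
VECTOR_BITS = 512  # vector register width on Xeon Phi (Knights Corner)


def lanes(element_bits):
    """Number of elements of the given width that fit in one vector register."""
    return VECTOR_BITS // element_bits


def vectorization_intensity(instruction_mix):
    """instruction_mix: list of (instruction_count, active_elements_per_instruction).

    Mirrors VI = VPU_ELEMENTS_ACTIVE / VPU_INSTRUCTIONS_EXECUTED.
    """
    elements_active = sum(n * e for n, e in instruction_mix)
    instructions_executed = sum(n for n, _ in instruction_mix)
    return elements_active / instructions_executed


# Assumed mix: 90 full-width double-precision instructions (8 elements each)
# and 10 full-width 32-bit instructions (16 elements each).
mix = [(90, lanes(64)), (10, lanes(32))]
vi = vectorization_intensity(mix)
print(vi)  # (90*8 + 10*16) / 100 = 8.8
```

In this hypothetical mix, only one instruction in ten needs to work on 32-bit elements for the ratio to land near the 8.7 reported above.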

Martin_K_6
Beginner

Just to chime in with a similar observation: for an inner loop composed entirely of fp32 vectors with 4 active lanes, VPU_ELEMENTS_ACTIVE reported above 8 in my first MIC VTune session today. So I am still confused about the semantics behind this counter. Perhaps a small code primer with related VPU_ELEMENTS_ACTIVE comments on the side could clear up most questions on this subject?

TimP
Honored Contributor III

I suspect that even "inactive" lanes are counted, possibly including all the lanes even though several are masked off. As pointed out earlier, operations other than plain double-precision ones can be involved: double-precision divide and sqrt, for example, expand into code that requires an initial step using a 16-wide approximation. A high value is still a good sign that certain kinds of serial instructions don't dominate. I'm not ready to be depressed when I don't see a satisfactory number, or overly impressed by a high one.
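
As a rough sketch of the two counting hypotheses, the snippet below computes the ratio for a loop like the one Martin describes (fp32 vectors with 4 of 16 lanes active) both ways. Which behavior the hardware counter actually implements is not established in this thread; the function name, the `count_masked_lanes` flag, and the instruction count are assumptions for illustration only.

```python
def vectorization_intensity(instructions, count_masked_lanes):
    """instructions: list of (count, active_lanes, total_lanes).

    count_masked_lanes=False models a counter that tallies only unmasked lanes;
    count_masked_lanes=True models the suspicion that every lane is tallied.
    """
    elements = sum(
        n * (total if count_masked_lanes else active)
        for n, active, total in instructions
    )
    executed = sum(n for n, _, _ in instructions)
    return elements / executed


# Hypothetical loop: 100 fp32 vector instructions, 4 of 16 lanes active.
loop = [(100, 4, 16)]
print(vectorization_intensity(loop, count_masked_lanes=False))  # 4.0
print(vectorization_intensity(loop, count_masked_lanes=True))   # 16.0
```

Under these assumptions, a reading above 8 could not come from active-only counting of that loop by itself; either other instructions are in the profile or lanes are being counted differently.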
