About vectorization performance on Coprocessors

Matara_Ma_Sukoy1 · 10-04-2013

Hi all,

I just want to confirm something and ask a quick question. To benefit from vectorization to the full extent, data should be laid out contiguously in memory. I am working on sparse matrix-vector multiplication, in which I do something like this:

[cpp]
for (int i = 0; i < nnz; ++i)
    y[i] = val[i] * x[colInd[i]];
[/cpp]

Here I access the val array sequentially, but this is not necessarily true for the x vector.

If I am not accessing both arrays sequentially, I am giving away additional speedup that I would otherwise get, right?
Will using a notation like "x[colInd[i]]" affect vectorization (and result in a performance loss) even if I were accessing the x entries in a successive way?

Your thoughts are always welcome.

Regards

Matara Ma

Sumedh_N_Intel · 10-04-2013

Hi,

Yes, in general you get better performance if your data is aligned and the memory accesses are sequential. Hence, you may see performance gains if you access both arrays sequentially. Also, indirect memory references are costly and lead to inefficient code. You can find more about this in the following article: http://software.intel.com/en-us/articles/fortran-array-data-and-arguments-and-vectorization
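To make the difference concrete, here is a minimal sketch of the two access patterns side by side. The function names scale_gather and scale_contiguous are made up for illustration, and the y[i] indexing is an assumption about the loop in the question. In the first loop x is read through colInd, so the compiler generally cannot prove at compile time that the accesses are unit-stride, even if colInd happens to hold 0, 1, 2, ... at run time; it will typically fall back to gather instructions or scalar loads. In the second loop every array is read with unit stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Indirect (gather) access: x is read through colInd, so consecutive
// iterations may touch non-adjacent elements of x.
void scale_gather(const std::vector<double>& val,
                  const std::vector<int>& colInd,
                  const std::vector<double>& x,
                  std::vector<double>& y)
{
    const std::size_t nnz = val.size();
    assert(colInd.size() >= nnz && y.size() >= nnz);
    for (std::size_t i = 0; i < nnz; ++i)
        y[i] = val[i] * x[colInd[i]];
}

// Unit-stride access: val, x and y are all read or written sequentially,
// which is the pattern vectorizers handle best.
void scale_contiguous(const std::vector<double>& val,
                      const std::vector<double>& x,
                      std::vector<double>& y)
{
    const std::size_t nnz = val.size();
    assert(x.size() >= nnz && y.size() >= nnz);
    for (std::size_t i = 0; i < nnz; ++i)
        y[i] = val[i] * x[i];
}
```

Comparing the vectorization report for the two loops should show how the compiler treats each one; the actual speed difference depends on the sparsity pattern and the hardware, so it is worth measuring rather than assuming.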

There are a number of other Best-Known Methods (BKMs) for vectorization that can be found at http://software.intel.com/en-us/articles/vectorization-essential. Other compiler BKMs can be found at http://software.intel.com/en-us/articles/programming-and-compiling-for-intel-many-integrated-core-architecture.

I hope this helps.