Copy from one vector to another via an index map using MKL? (Intel® oneAPI Math Kernel Library forum)

zer0nes — Fri, 29 Mar 2013 21:36:17 GMT

Hi, can I utilize MKL (or any other library) for the following operations?

[cpp]

for (int i = 0; i < N; i++)
    y[i] = x[map[i]];

[/cpp]

[cpp]

for (int i = 0; i < N; i++)
    x[map[i]] += y[i];

[/cpp]

The second operation looks impossible to parallelize using SSE or OpenMP. Is that right?

Thanks.

Zhang_Z_Intel — Fri, 29 Mar 2013 21:59:49 GMT

There are two BLAS level-1 functions that are exactly for these purposes: GTHR and ROTI. Please look here and here. Both are vectorized and parallelized in MKL.

zer0nes — Fri, 29 Mar 2013 22:37:21 GMT

Thanks!

Zhang Z (Intel) wrote:

There are two BLAS level-1 functions that are exactly for these purposes: GTHR and ROTI. Please look here and here. Both are vectorized and parallelized in MKL.

Unfortunately, roti doesn't work in my case because it requires that the values in indx must be distinct. In my case, the first operation is an expansion while the second one is a contraction. My indx has many duplicated values.

Thanks.

zer0nes — Sat, 30 Mar 2013 05:24:00 GMT

Thanks.

Unfortunately, ROTI doesn't apply in my case because it requires that indx has unique values. My indx has many duplicated values. Basically, the first operation is an expansion so that BLAS's gemm can be called. The second operation is the reduction.

If the second operation is not vectorizable, will I be able to use MKL if I change it to the following?

[csharp]

for (int i = 0; i < M; i++) {
    int[] indices = map[i]; // map is int[][]
    for (int j = 0; j < indices.Length; j++)
        x[i] += y[indices[j]];
}

[/csharp]
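For readers following along, here is a minimal C++ sketch of how the flat map with duplicates could be inverted into the per-destination index lists that the loop above relies on. The names `InverseMap`, `invert_map`, and `gather_add` are hypothetical and not part of MKL, and the compressed start/index layout is an assumption standing in for the `int[][]` map.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: the inverted index map in compressed form.
// The sources for destination i are idx[start[i]] .. idx[start[i + 1] - 1].
struct InverseMap {
    std::vector<std::size_t> start; // size M + 1
    std::vector<std::size_t> idx;   // size N
};

// Invert map (length N, values assumed to lie in [0, M)) with a counting pass.
InverseMap invert_map(const std::vector<int>& map, std::size_t M) {
    InverseMap inv;
    inv.start.assign(M + 1, 0);
    inv.idx.resize(map.size());
    for (int m : map) inv.start[m + 1]++;  // count sources per destination
    for (std::size_t i = 0; i < M; ++i)    // prefix sum gives start offsets
        inv.start[i + 1] += inv.start[i];
    std::vector<std::size_t> next(inv.start.begin(), inv.start.end() - 1);
    for (std::size_t j = 0; j < map.size(); ++j)
        inv.idx[next[map[j]]++] = j;       // place source j in its bucket
    return inv;
}

// Gather form of x[map[j]] += y[j]: each x[i] is written by one iteration only.
void gather_add(std::vector<double>& x, const std::vector<double>& y,
                const InverseMap& inv) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        double sum = 0.0;
        for (std::size_t k = inv.start[i]; k < inv.start[i + 1]; ++k)
            sum += y[inv.idx[k]];
        x[i] += sum;
    }
}
```

Because every `x[i]` is updated by exactly one outer iteration, the outer loop carries no write conflicts even when the original map has many duplicates. The inversion only needs to be redone when the map changes.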

TimP — Sat, 30 Mar 2013 16:53:06 GMT

As long as you have repeated values in map[], vectorization or parallelization introduces indeterminacy on which of the repeated values takes final effect. If you don't care which of those takes effect, promoting parallelization by the MKL function or by assertions such as #pragma ivdep in your code could be acceptable. The resulting race conditions could restrict the performance gain if there are enough of them.
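To make the indeterminacy concrete, here is a small C++ sketch that emulates a SIMD gather/add/scatter without conflict detection and compares it with the sequential loop. The function names and the lane width `W` are illustrative assumptions. This is not how MKL or any particular compiler implements the operation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference: sequential scatter-add, x[map[i]] += y[i].
void scatter_add_scalar(std::vector<double>& x, const std::vector<int>& map,
                        const std::vector<double>& y) {
    for (std::size_t i = 0; i < map.size(); ++i) x[map[i]] += y[i];
}

// Emulates a W-wide gather / add / scatter with no conflict detection:
// all loads of a chunk happen before any of its stores.
void scatter_add_simd_emulated(std::vector<double>& x, const std::vector<int>& map,
                               const std::vector<double>& y, std::size_t W) {
    std::vector<double> lane(W);
    for (std::size_t base = 0; base < map.size(); base += W) {
        const std::size_t n = std::min(W, map.size() - base);
        for (std::size_t k = 0; k < n; ++k)   // gather and add
            lane[k] = x[map[base + k]] + y[base + k];
        for (std::size_t k = 0; k < n; ++k)   // scatter: last store to a slot wins
            x[map[base + k]] = lane[k];
    }
}
```

With `map = {0, 0, 1, 0}` and `y = {1, 2, 3, 4}` on a zeroed `x`, the sequential loop accumulates 7 into `x[0]`, while the emulated 4-wide version leaves only the last lane's contribution, 4. For an accumulating update, then, duplicates cause lost contributions rather than just an arbitrary choice of winner.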

zer0nes — Sun, 31 Mar 2013 06:26:00 GMT

Can I parallelize with MKL for the modified update algorithm?

[csharp]

for (int i = 0; i < M; i++) {
    int[] indices = map[i]; // map is int[][]
    for (int j = 0; j < indices.Length; j++)
        x[i] += y[indices[j]];
}

[/csharp]

TimP (Intel) wrote:

As long as you have repeated values in map[], vectorization or parallelization introduces indeterminacy on which of the repeated values takes final effect. If you don't care which of those takes effect, promoting parallelization by the MKL function or by assertions such as #pragma ivdep in your code could be acceptable. The resulting race conditions could restrict the performance gain if there are enough of them.

Zhang_Z_Intel — Mon, 01 Apr 2013 21:34:27 GMT

There isn't an MKL function for this when the index vector contains duplicate values. But you can use the Intel compiler to vectorize and parallelize your own implementation. For example, you can parallelize the outer loop with an OpenMP parallel for, and vectorize the inner loop with #pragma ivdep and other vectorization pragmas. You can check whether your code was successfully vectorized by using the "-vec-report" option of the Intel compiler.

Vectorization is a big topic by itself, and there are many things you can do to make your code vectorize better. This page (http://software.intel.com/en-us/intel-vectorization-tools) is the ultimate guide for all you need to know about vectorization with Intel compilers. If you are in a hurry, you can start with this article: http://software.intel.com/en-us/articles/a-guide-to-auto-vectorization-with-intel-c-compilers/
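As a rough illustration of that suggestion, here is a minimal C++ sketch of the gather-form update with the outer loop parallelized by OpenMP. It uses the portable `#pragma omp simd` with a reduction clause in place of the Intel-specific `#pragma ivdep`, which is a substitution and not what the post prescribes. The function name `gather_add_parallel` is hypothetical, and the nested vector stands in for the `int[][]` map from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// x[i] += sum over j of y[indices[i][j]].
// Each x[i] is written by exactly one outer iteration, so the outer loop
// has no write conflicts and can be split across threads.
void gather_add_parallel(std::vector<double>& x, const std::vector<double>& y,
                         const std::vector<std::vector<int>>& indices) {
    const long M = static_cast<long>(x.size());
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < M; ++i) {
        const std::vector<int>& row = indices[i];
        const long len = static_cast<long>(row.size());
        double sum = 0.0;
        // Ask the compiler to vectorize the inner reduction over gathered loads.
        #pragma omp simd reduction(+:sum)
        for (long j = 0; j < len; ++j)
            sum += y[row[j]];
        x[i] += sum;
    }
}
```

Compile with OpenMP enabled (for example `-fopenmp` with GCC or Clang, `-qopenmp` with the Intel compiler). Without it the pragmas are ignored and the code runs serially. A vectorized reduction may add the terms in a different order than the scalar loop, so floating-point results can differ in the last bits. Whether the gathered loads actually pay off depends on the hardware and on how long the rows are.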