Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Parallelization of dyadic product

Fabio_G_
Beginner
950 Views

Hi,

I have two vectors x and y (they can refer to the same vector) and I need to compute the products x[i]*y[j] for i,j=1..n.

What is the best way to perform this operation in parallel? I've tried

cilk_for(h=0;h<n*n;h++)r[h]=x[h/n]*y[h%n];

but I guess this is only a naive attempt. Indeed, vec-report says it is inefficient.

Thanks.

Fabio

 

3 Replies
TimP
Honored Contributor III

If you wish to use Cilk(tm) Plus notation, this might be more topical on the Cilk forum.

I expect you may need to write something like

cilk_for(int j=0; j<n; ++j) r[j*n:n] = x[j] * y[0:n];

You are right in looking for a combined parallel vector code generation (assuming large enough n to use multiple cores).

If using a recent compiler, you should get a more explicit warning than "seems inefficient" when you try a shared induction variable in cilk_for, even if you compile this as C rather than C++.

OpenMP would likely be more efficient (if less attractive), particularly on a NUMA platform.  I mention it only because you said "best way." 
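For reference, the OpenMP variant mentioned above might look roughly like the following. This is only a sketch under assumptions not stated in the thread: the element type is taken to be double, the result is stored row-major so that r[i*n + j] = x[i]*y[j], and the function name dyadic_product is made up for illustration. The outer loop is split across threads and the inner loop is marked for vectorization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name and signature are not from the thread).
// Returns the n*n dyadic product in row-major order:
//   r[i*n + j] = x[i] * y[j]
// Assumes x and y have the same length n; they may be the same vector.
std::vector<double> dyadic_product(const std::vector<double>& x,
                                   const std::vector<double>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    std::vector<double> r(static_cast<std::size_t>(n) * static_cast<std::size_t>(n));

    const double* xp = x.data();
    const double* yp = y.data();
    double* rp = r.data();

    // Each thread handles whole rows, so no two threads write the same element.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const double xi = xp[i];   // hoisted: constant along the row
        double* row = rp + i * n;  // start of row i in the output
        // Unit-stride inner loop, a good candidate for SIMD.
        #pragma omp simd
        for (long j = 0; j < n; ++j) {
            row[j] = xi * yp[j];
        }
    }
    return r;
}
```

OpenMP has to be enabled at compile time (for example -fopenmp with GCC or Clang, -qopenmp with recent Intel compilers). Without the flag the pragmas are ignored and the loops run serially with the same result. Unlike the single flattened loop over h, this form needs no division or modulo per element and gives the vectorizer a simple contiguous inner loop.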

Fabio_G_
Beginner

Hi Tim,

First of all, thanks for the answer. You're probably right about the Cilk forum. As for the compiler, I have icpc v13, which I know is a bit old, but it does its job. Should I use a more recent compiler? Is there a "free" version available? I was able to download the version I have from the Intel website a couple of years ago, and I am used to installing it.

As for the code you proposed, I think it is fine for what I need. I am just an end user running extensive numerical simulations, so the important thing is that it works. I know little about OpenMP, so...

Thank you!
