Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Element-by-Element matrix multiplication

Gaston_N_
Beginner
753 Views

Hi all,

I have a simple task at hand. I want to compute an element-by-element multiplication of two matrices A and B.

If I use C = A*B, it delivers the result I want, but it is very slow.

Do you know any faster way to make this computation?

For instance, if A and B were vectors, vdMul would be an efficient way to proceed. I'm looking for something similar, but using matrices.

Thanks!

Gaston

0 Kudos
3 Replies
Zhen_Z_Intel
Employee
753 Views

Dear customer,

For element-by-element multiplication, I am afraid there is no dedicated function for matrices, only for vectors. If you would like to improve performance, you could try a multi-threaded computation, applying vdMul row by row:

#pragma omp parallel for
for (int i = 0; i < row; i++) {
    vdMul(col, &a[i * col], &b[i * col], &y[i * col]);
}

The more physical cores your CPU has, the higher the performance you will get. Please also share your measured performance along with your hardware information, and we will see whether the performance is acceptable. Thank you.

Best regards,
Fiona

0 Kudos
SergeyKostrov
Valued Contributor II
753 Views
>>...I have a simple task at hand. I want to compute an element-by-element multiplication of two matrices A and B.

If your matrices are stored as vectors, that is, as 1-D data sets, like double dMxA[ size ], double dMxB[ size ], etc., then vdMul can be used directly. A 1-D representation of a matrix is very efficient ( contiguous memory blocks ).
0 Kudos
Mikhail_K_
Beginner
753 Views

Sergey Kostrov wrote:

>>...I have a simple task at hand. I want to compute an element-by-element multiplication of two matrices A and B.

If your matrices are stored as vectors, that is, as 1-D data sets, like double dMxA[ size ], dMxB[ size ], etc., then vdMul needs to be used.

1-D representation of a matrix is very efficient ( contiguous memory blocks ).

In this case you can even compute the Hadamard product with the BLAS functions tbmv or sbmv. Both allow the input matrix to be stored in banded form. With the number of super-diagonals set to 0, you only need to supply the diagonal, which in this case is one of the vectors.

0 Kudos
Reply