Intel® oneAPI Math Kernel Library

How to do an element-wise multiplication between two 3D-arrays?

Lin__JH
Beginner

Hi, everyone.

I am trying to do some element-wise multiplication between two 3D arrays/matrices and then calculate the sum of all the elements returned. Something like this:

res = 0
do k = kstart, kend
  do j = jstart, jend
    do i = istart, iend
      res = res + A(i, j, k) * B(i, j, k)
    end do
  end do
end do

However, such multiplication is not applied to all the elements in A or B but the elements that were involved between the iteration variables. To make the calculation faster, I tried the ddot function in MKL like this:

n = (kend - kstart + 1) * (jend - jstart + 1) * (iend - istart + 1)
res = ddot(n, A(istart:iend, jstart:jend, kstart:kend), 1, B(istart:iend, jstart:jend, kstart:kend), 1)

But this way I do not get the same result as I do with the three do-loops.

Can anyone tell me where the problem is?

Thanks :)

 

2 Replies
mecej4
Honored Contributor III

Lin, JH wrote:
 However, such multiplication is not applied to all the elements in A or B but the elements that were involved between the iteration variables.

Please explain what you mean by the sentence above. What iterations? What does "involved between" mean?

The code that you showed above appears intended to multiply and sum ALL elements within the limits istart:iend, etc. Please reconcile this code with what you wrote about applying the summation to a selected subset.
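To make the point concrete, here is a minimal sketch (not the original poster's actual program) that computes the sub-block sum of products two ways: with the explicit triple loop from the question, and with array sections passed to the standard SUM intrinsic. The module name, the function names, and the bound names i1..k2 are hypothetical, and the sketch assumes double precision arrays declared with a lower bound of 1 in every dimension.

```fortran
module block_dot_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind
contains

  ! Explicit triple loop over the sub-block, as in the question.
  ! Assumption: a and b have lower bound 1 in every dimension.
  function block_dot_loops(a, b, i1, i2, j1, j2, k1, k2) result(res)
    real(dp), intent(in) :: a(:, :, :), b(:, :, :)
    integer, intent(in) :: i1, i2, j1, j2, k1, k2
    real(dp) :: res
    integer :: i, j, k
    res = 0.0_dp
    do k = k1, k2
      do j = j1, j2
        do i = i1, i2
          res = res + a(i, j, k) * b(i, j, k)
        end do
      end do
    end do
  end function block_dot_loops

  ! The same quantity written with array sections and the SUM intrinsic.
  ! The order in which SUM adds the terms is left to the compiler.
  function block_dot_sum(a, b, i1, i2, j1, j2, k1, k2) result(res)
    real(dp), intent(in) :: a(:, :, :), b(:, :, :)
    integer, intent(in) :: i1, i2, j1, j2, k1, k2
    real(dp) :: res
    res = sum(a(i1:i2, j1:j2, k1:k2) * b(i1:i2, j1:j2, k1:k2))
  end function block_dot_sum

end module block_dot_sketch
```

Both functions visit exactly the elements inside the given index ranges, so any difference between them can only come from the order of the additions. If ddot is used instead, note that it is a function returning double precision and has to be declared as such in the calling code; under implicit typing an undeclared ddot would be treated as default real.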

Hinds__David
Beginner

It is almost certain that ddot() adds up the terms of the dot product in a different order than your loops do. You will therefore get a different answer, because rounding errors accumulate differently in limited-precision floating-point arithmetic. How different depends on the relative magnitudes of the values being added. Your own code may also give different answers depending on the compiler and optimization settings you use, or on the generation of processor you run it on.
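The effect of summation order can be reproduced without MKL. The following is a small sketch, with a hypothetical module name and a hypothetical function ordered_dot, that accumulates the same products in whatever order an index permutation prescribes. It says nothing about the order ddot actually uses internally; it only illustrates that the order matters.

```fortran
module ordered_dot_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind
contains

  ! Dot product of x and y whose terms are accumulated in the order
  ! given by perm (hypothetical helper, for illustration only).
  function ordered_dot(x, y, perm) result(res)
    real(dp), intent(in) :: x(:), y(:)
    integer, intent(in) :: perm(:)    ! indices into x and y
    real(dp) :: res
    integer :: n
    res = 0.0_dp
    do n = 1, size(perm)
      res = res + x(perm(n)) * y(perm(n))
    end do
  end function ordered_dot

end module ordered_dot_sketch
```

As a usage example, take x = [1.0d8, 1.0d0, 1.0d0, -1.0d8] and y = [1.0d8, 1.0d0, 1.0d0, 1.0d8], so the products are 1.0d16, 1, 1 and -1.0d16 and the exact sum is 2. With IEEE double precision and no reassociation by the compiler, the order [1, 2, 3, 4] loses both small terms and returns 0, while the order [2, 3, 1, 4] returns 2. Aggressive floating-point optimization flags may change either result, which is the same effect described above.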
