I am trying to do some element-wise multiplication between two 3D arrays/matrices and then calculate the sum of all the elements returned. Something like this:
res = 0
do k = kstart, kend
    do j = jstart, jend
        do i = istart, iend
            res = res + A(i, j, k) * B(i, j, k)
        end do
    end do
end do
However, such multiplication is not applied to all the elements in A or B but the elements that were involved between the iteration variables. To make the calculation faster, I tried the ddot function in MKL like this:
n = (kend - kstart + 1) * (jend - jstart + 1) * (iend - istart + 1)
res = ddot(n, A(istart:iend, jstart:jend, kstart:kend), 1, B(istart:iend, jstart:jend, kstart:kend), 1)
But in this way I couldn't get the same result as I did by using three do-loops.
Can anyone tell me where the problem is?
Lin, JH wrote:
However, such multiplication is not applied to all the elements in A or B but the elements that were involved between the iteration variables.
Please explain what you mean by the sentence above. What iterations? What does "involved between" mean?
Both code fragments that you showed above appear intended to multiply and sum ALL elements within the limits istart:iend, etc. Please reconcile this code with what you wrote about applying the summation to a selected subset.
It is almost certain that ddot() is going to add up the elements of the dot product in a different order than your loops. So you'll get a slightly different answer, because rounding errors accumulate differently in limited-precision floating-point arithmetic. How different will depend on the relative magnitudes of the values being added. Your code may also give different answers depending on what compiler and optimization settings you use, or on what generation of processor you run it on.
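To illustrate the point about summation order without needing MKL, here is a minimal sketch. It is not the original poster's code: the array size, the loop bounds, and the fill values are all made up for the example. It sums the same products over the same sub-block three ways (the loops as posted, the same loops run backwards, and the Fortran sum intrinsic), and then checks whether the two loop results agree to within a standard worst-case rounding bound rather than bit for bit.

```fortran
program summation_order
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none

  ! Hypothetical array size and bounds, chosen only for illustration
  integer, parameter :: n = 40
  integer, parameter :: istart = 3, iend = 35
  integer, parameter :: jstart = 2, jend = 30
  integer, parameter :: kstart = 5, kend = 38

  real(dp) :: A(n, n, n), B(n, n, n)
  real(dp) :: res_fwd, res_rev, res_sum, magnitude, bound
  integer :: i, j, k, nterms

  ! Made-up data whose magnitudes span several orders, so that
  ! rounding differences between summation orders are likely to show up
  do k = 1, n
    do j = 1, n
      do i = 1, n
        A(i, j, k) = sin(real(i + 2*j + 3*k, dp)) * 10.0_dp**mod(i + j + k, 7)
        B(i, j, k) = cos(real(3*i + j + 2*k, dp))
      end do
    end do
  end do

  ! 1) The triple loop as posted
  res_fwd = 0.0_dp
  do k = kstart, kend
    do j = jstart, jend
      do i = istart, iend
        res_fwd = res_fwd + A(i, j, k) * B(i, j, k)
      end do
    end do
  end do

  ! 2) The same terms, accumulated in the opposite order
  res_rev = 0.0_dp
  do k = kend, kstart, -1
    do j = jend, jstart, -1
      do i = iend, istart, -1
        res_rev = res_rev + A(i, j, k) * B(i, j, k)
      end do
    end do
  end do

  ! 3) The sum intrinsic on the array sections; the compiler chooses the order
  res_sum = sum(A(istart:iend, jstart:jend, kstart:kend) * &
                B(istart:iend, jstart:jend, kstart:kend))

  ! Worst-case rounding bound: (number of terms) * epsilon * sum of |terms|
  nterms = (iend - istart + 1) * (jend - jstart + 1) * (kend - kstart + 1)
  magnitude = sum(abs(A(istart:iend, jstart:jend, kstart:kend) * &
                      B(istart:iend, jstart:jend, kstart:kend)))
  bound = real(nterms, dp) * epsilon(1.0_dp) * magnitude

  print '(a, es24.16)', 'forward loops : ', res_fwd
  print '(a, es24.16)', 'reversed loops: ', res_rev
  print '(a, es24.16)', 'sum intrinsic : ', res_sum
  print '(a, es24.16)', 'fwd - rev     : ', res_fwd - res_rev
  print '(a, es24.16)', 'fwd - sum     : ', res_fwd - res_sum
  print '(a, l1)', 'loops agree within rounding bound: ', &
        abs(res_fwd - res_rev) <= bound
end program summation_order
```

If the difference between the loop result and the ddot() result is of the same order as the differences printed here, it is most likely just rounding. A much larger discrepancy would point to something else, such as an undeclared ddot (which would default to single-precision REAL under implicit typing) or a mismatch in n.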