topic This is because round-off in Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/inconsistency-between-mkl-and-matmul/m-p/1060420#M21671
<P>This is because round-off errors in floating-point computation accumulate differently in different implementations. It is not uncommon to see different results from the same algorithm when using different math libraries. Even with the same library, running on different hardware, or on the same hardware with a different number of threads, can lead to different results: the algorithm may be parallelized in different ways, so the order in which the operations complete may differ. </P>
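<P>As a rough illustration of the point about operation order, here is a minimal Python sketch (not MKL code). The helper names <code>dot_forward</code> and <code>dot_chunked</code> are hypothetical, and the chunked version only imitates how a threaded reduction might combine partial sums; it is an assumption, not a description of how MKL or matmul actually schedule the work. The same dot product is accumulated in different orders and each result is compared against a correctly rounded sum from <code>math.fsum</code>.</P>

```python
import math
import random


def dot_forward(x, y):
    """Accumulate the dot product from the first element to the last."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


def dot_chunked(x, y, chunks):
    """Accumulate one partial sum per chunk, then combine the partial sums.

    This only imitates a threaded reduction; real libraries may split the
    work differently.
    """
    n = len(x)
    size = (n + chunks - 1) // chunks
    partials = [dot_forward(x[s:s + size], y[s:s + size]) for s in range(0, n, size)]
    return sum(partials)


# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

rng = random.Random(1)
n = 3000
x = [rng.random() for _ in range(n)]
y = [rng.random() for _ in range(n)]

# Correctly rounded sum of the (already rounded) products, used as reference.
reference = math.fsum(a * b for a, b in zip(x, y))

results = {
    "forward": dot_forward(x, y),
    "4 chunks": dot_chunked(x, y, 4),
    "16 chunks": dot_chunked(x, y, 16),
}

if __name__ == "__main__":
    print("associative?", left == right, repr(left), repr(right))
    for name, value in results.items():
        print(f"{name:10s} {value!r}  rel. error {abs(value - reference) / reference:.2e}")
```

<P>Any differences between the orderings should show up only in the last few digits, so the relative error is the meaningful quantity to compare, not the absolute one.</P>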
<P> </P>Tue, 27 Jan 2015 22:43:38 GMTZhang_Z_Intel2015-01-27T22:43:38Zinconsistency between mkl and matmul
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/inconsistency-between-mkl-and-matmul/m-p/1060419#M21670
<P>We have compared the results for matrix-vector recurrence relations of the type v_i = A*v_(i-1), as calculated by matmul and mkl routines.</P>
<P>After a few iterations the results seem to diverge exponentially in most cases, but the outcome is machine dependent.</P>
<P>We have made comparisons for dgemv, dgemm, zgemm, and dzgemv. The problem occurs more frequently for the transposed multiplication.</P>
<P>The tests were made on the following code.</P>
<PRE class="brush:fortran;">program test_mkl
  implicit none
  real*8, allocatable :: v(:,:), vv(:,:)
  real*8, parameter :: zero = 0.d0, &
                       one  = 1.d0, &
                       two  = 2.d0
  real*8, allocatable :: A(:,:)
  integer :: n, order, i
  integer, dimension(4) :: seed = (/0,0,0,1/)

  n = 3000
  order = 10
  allocate( v(n,order) , vv(n,order) )
  allocate( A(n,n) )

  ! generate uniform random numbers in the interval (0,1)
  call dlarnv( 1, seed, n*n, A )
  call dlarnv( 1, seed, n*order, vv )
  v = vv

  ! transposed product: matmul result in vv(:,i), MKL result in v(:,i)
  write(*,*) "diff increases exponentially for conjugate multiplication"
  do i = 2, order
    vv(:,i) = matmul( vv(:,i-1), A )
    call dgemv( 'T', n, n, one, A, n, vv(:,i-1), 1, zero, v(:,i), 1 )
    ! other routines tried (z_two and z_zero are not declared in this listing)
    ! call dzgemv( 'T', n, n, z_two, A, n, vv(:,i-1), 1, z_zero, v(:), 1 )
    ! call dzgemm( 'T', 'N', n, 1, n, z_two, A, n, vv(:,i-1), n, z_zero, v(:), n )
    ! call zgemm( 'T', 'N', n, 1, n, z_two, A, n, v(:,i-1), n, z_zero, v(:,i), n )
    ! call dgemm( 'T', 'N', n, 1, n, two, A, n, v(:,i-1), n, zero, v(:,i), n )
    write(*,'(a,i2)') "i = ", i
    write(*,*) "diff = ", maxval(abs( v(:,i)-vv(:,i) ))
  end do

  write(*,*) " "

  ! direct product: matmul result in vv(:,i), MKL result in v(:,i)
  write(*,*) "diff for direct multiplication"
  do i = 2, order
    vv(:,i) = matmul( A , vv(:,i-1) )
    call dgemv( 'N', n, n, one, A, n, vv(:,i-1), 1, zero, v(:,i), 1 )
    ! other routines tried (z_two and z_zero are not declared in this listing)
    ! call dzgemv( 'N', n, n, z_two, A, n, vv(:,i-1), 1, z_zero, v(:), 1 )
    ! call dzgemm( 'N', 'N', n, 1, n, z_two, A, n, vv(:,i-1), n, z_zero, v(:), n )
    ! call zgemm( 'N', 'N', n, 1, n, z_two, A, n, v(:,i-1), n, z_zero, v(:,i), n )
    ! call dgemm( 'N', 'N', n, 1, n, two, A, n, v(:,i-1), n, zero, v(:,i), n )
    write(*,'(a,i2)') "i = ", i
    write(*,*) "diff = ", maxval(abs( v(:,i)-vv(:,i) ))
  end do
end program</PRE>
<P> </P>
<P>Below is typical output on a Xeon E5-2687W. What may be causing this type of problem?</P>
<P> diff increases exponentially for conjugate multiplication<BR />
i = 2<BR />
diff = 2.955857780762017E-012<BR />
i = 3<BR />
diff = 4.190951585769653E-009<BR />
i = 4<BR />
diff = 5.722045898437500E-006<BR />
i = 5<BR />
diff = 8.789062500000000E-003<BR />
i = 6<BR />
diff = 12.5000000000000<BR />
i = 7<BR />
diff = 18432.0000000000<BR />
i = 8<BR />
diff = 25165824.0000000<BR />
i = 9<BR />
diff = 42949672960.0000<BR />
i = 10<BR />
diff = 63771674411008.0<BR />
<BR />
diff for direct multiplication<BR />
i = 2<BR />
diff = 0.000000000000000E+000<BR />
i = 3<BR />
diff = 0.000000000000000E+000<BR />
i = 4<BR />
diff = 0.000000000000000E+000<BR />
i = 5<BR />
diff = 0.000000000000000E+000<BR />
i = 6<BR />
diff = 0.000000000000000E+000<BR />
i = 7<BR />
diff = 0.000000000000000E+000<BR />
i = 8<BR />
diff = 0.000000000000000E+000<BR />
i = 9<BR />
diff = 0.000000000000000E+000</P>
<P> </P>
<P> </P>Tue, 27 Jan 2015 21:11:31 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/inconsistency-between-mkl-and-matmul/m-p/1060419#M21670luis_gc_rego2015-01-27T21:11:31ZEstimating or evaluating the
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/inconsistency-between-mkl-and-matmul/m-p/1060421#M21672
<P>Estimating or evaluating the dominant eigenvalue of your matrix A may provide some insight into why the differences between the vector iterates from the alternative computations diverge so rapidly. If you decide to undertake this calculation, however, I should discourage you from working with a random matrix, as such a matrix would have no similarity to the matrix in your real application.</P>Tue, 27 Jan 2015 23:57:16 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/inconsistency-between-mkl-and-matmul/m-p/1060421#M21672mecej42015-01-27T23:57:16Z
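<P>The eigenvalue suggestion can be tried on a scaled-down stand-in for the posted test program. The sketch below is plain Python, not MKL, and its assumptions are mine: a 200 x 200 random matrix instead of 3000 x 3000 to keep the run time short, hypothetical helpers <code>dot_forward</code> and <code>dot_reversed</code> standing in for two implementations that sum in different orders, and a random matrix chosen only because the test program uses one (it says nothing about a real application matrix). The growth of the unnormalized iterates serves as a crude estimate of the dominant eigenvalue, and the absolute and relative differences between the two summation orders are recorded at each step.</P>

```python
import random


def dot_forward(row, v):
    """Dot product accumulated from the first element to the last."""
    total = 0.0
    for a, b in zip(row, v):
        total += a * b
    return total


def dot_reversed(row, v):
    """Same dot product, accumulated from the last element to the first."""
    total = 0.0
    for a, b in zip(reversed(row), reversed(v)):
        total += a * b
    return total


def compare_iterates(n=200, steps=8, seed=1):
    """Iterate v <- A v without normalization and compare two summation orders."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n)] for _ in range(n)]
    v = [rng.random() for _ in range(n)]
    history = []
    for step in range(1, steps + 1):
        # Both products start from the same input vector, as in the posted loop.
        w_fwd = [dot_forward(row, v) for row in A]
        w_rev = [dot_reversed(row, v) for row in A]
        scale = max(abs(x) for x in w_fwd)
        abs_diff = max(abs(p - q) for p, q in zip(w_fwd, w_rev))
        # Ratio of successive max-norms: a crude dominant-eigenvalue estimate.
        growth = scale / max(abs(x) for x in v)
        history.append((step, growth, abs_diff, abs_diff / scale))
        v = w_fwd
    return history


history = compare_iterates()

if __name__ == "__main__":
    print("step   growth     abs diff     rel diff")
    for step, growth, abs_diff, rel_diff in history:
        print(f"{step:4d} {growth:8.2f} {abs_diff:12.3e} {rel_diff:12.3e}")
```

<P>With entries uniform on (0,1), the growth factor should settle near n/2, so the entries of the iterates grow by about that factor per step and the absolute difference grows with them, while the relative difference stays near machine precision. The posted output looks consistent with this picture: successive diffs there grow by a factor of roughly 1400 to 1700, close to n/2 = 1500 for n = 3000. Comparing <code>maxval(abs(v(:,i)-vv(:,i))) / maxval(abs(vv(:,i)))</code> instead of the absolute difference would likely show no divergence at all.</P>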