inconsistency between mkl and matmul

luis_gc_rego · ‎01-27-2015

We have compared the results for matrix-vector recurrence relations of the type v_i = A*v_(i-1), as calculated by matmul and mkl routines.

After some iterations the calculations seem to diverge exponentially in most of the cases, but the outcome is machine dependent.

We have make comparisons for dgemv, dgemm, zgemm, and dzgemv. The problem occurs more frequently for the transpose multiplication case.

The tests were made on the following code.

program test_mkl

 implicit none

 real*8, allocatable :: v(:,:), vv(:,:)
 real*8, parameter   :: zero = 0.d0, &
                        one  = 1.d0, &
                        two  = 2.d0
 real*8,     allocatable :: A(:,:)
 integer                 :: n, order, i
 integer, dimension(4)   :: seed = (/0,0,0,1/)

 n = 3000
 order = 10

 allocate( v(n,order) , vv(n,order) )
 allocate( A(n,n) )

 ! generate random numbers in the interval [0,,1]
 call dlarnv( 1, seed, n*n, A )
 call dlarnv( 1, seed, n*order, vv )
 v = vv

 write(*,*) "diff increases exponentially for conjugate multiplication"
 do i = 2, order

     vv(:,i) = matmul( vv(:,i-1), A )

     call dgemv( 'T', n, n, one, A, n, vv(:,i-1), 1, zero, v(:,i), 1 )
 !    call dzgemv( 'T', n, n, z_two, A, n, vv(:,i-1), 1, z_zero, v(:), 1 )
 !    call dzgemm( 'T', 'N', n, 1, n, z_two, A, n, vv(:,i-1), n, z_zero, v(:), n )
 !    call zgemm( 'T', 'N', n, 1, n, z_two, A, n, v(:,i-1), n, z_zero, v(:,i), n )
 !    call dgemm( 'T', 'N', n, 1, n, two, A, n, v(:,i-1), n, zero, v(:,i), n )

     write(*,'(a,i2)'), "i = ", i
     write(*,*) "diff = ", maxval(abs( v(:,i)-vv(:,i) ))
 end do

 write(*,*) " "
 write(*,*) "diff for direct multiplication"
 do i = 2, order

     vv(:,i) = matmul( A , vv(:,i-1) )

     call dgemv( 'N', n, n, one, A, n, vv(:,i-1), 1, zero, v(:,i), 1 )
 !    call dzgemv( 'N', n, n, z_two, A, n, vv(:,i-1), 1, z_zero, v(:), 1 )
 !    call dzgemm( 'N', 'N', n, 1, n, z_two, A, n, vv(:,i-1), n, z_zero, v(:), n )
 !    call zgemm( 'N', 'N', n, 1, n, z_two, A, n, v(:,i-1), n, z_zero, v(:,i), n )
 !    call dgemm( 'N', 'N', n, 1, n, two, A, n, v(:,i-1), n, zero, v(:,i), n )

     write(*,'(a,i2)'), "i = ", i
     write(*,*) "diff = ", maxval(abs( v(:,i)-vv(:,i) ))
 end do

 end program

Below, there is a typical output for a XEON E5-2687W. What may be causing this type of problem?

diff increases exponentially for conjugate multiplication
i = 2
diff = 2.955857780762017E-012
i = 3
diff = 4.190951585769653E-009
i = 4
diff = 5.722045898437500E-006
i = 5
diff = 8.789062500000000E-003
i = 6
diff = 12.5000000000000
i = 7
diff = 18432.0000000000
i = 8
diff = 25165824.0000000
i = 9
diff = 42949672960.0000
i = 10
diff = 63771674411008.0

diff for direct multiplication
i = 2
diff = 0.000000000000000E+000
i = 3
diff = 0.000000000000000E+000
i = 4
diff = 0.000000000000000E+000
i = 5
diff = 0.000000000000000E+000
i = 6
diff = 0.000000000000000E+000
i = 7
diff = 0.000000000000000E+000
i = 8
diff = 0.000000000000000E+000
i = 9
diff = 0.000000000000000E+000

Zhang_Z_Intel · ‎01-27-2015

This is because round-off errors in floating point computation accumulate differently in different implementations. It's not uncommon to see different results of the same algorithm when using different math libraries. Even with the same library, running on different hardware, or the same hardware but different number of threads, can lead to different results because the algorithm may be parallelized in different ways, hence the completion order of the operation sequences may be different.

mecej4 · ‎01-27-2015

Estimating or evaluating the dominant eigenvalue of your matrix A may provide some insight into why the difference between the vector iterates from alternative computations diverge so rapidly. If you decide to undertake this calculation, however, I should discourage you from working with a random matrix, as such a matrix would have no similarity to the matrix in your real application.