Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

DSS symmetric solve

Dave_K_
Beginner
1,710 Views

We have been using the following code compiled with Intell Fortran on Linux clusters linked with MKL 10.2.2.025 for a number of years.  We are trying to update to Intel 15 compilers and MKL 11.2 update 2.  The symmetric solve produces wrong answers using the new compiler and libraries.  The code has not changed.  When we include the symmetric terms and solve it as a symmetric_structure matrix we get the right answers. 

Is any one else having problems with symmetric solves with the current compiler and library?

Load the data into indx, vall, and y_rt ...

      lcount = 0
      do row=1,n_solver
 
         pntr(row) = lcount +1 
          
         col = row 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = ad(row) 

         col = row + 1 
!         if( (col <= n*m).and.(au1(row) /= 0.)) then
         if( (col <= n*m) ) then
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au1(row) 
         endif 

         col = row - 1 + m 
!         if( (col <= n*m).and.(au2(row) /= 0.)) then 
         if( (col <= n*m) ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au2(row) 
         endif 

         col = row + m 
!         if( (col <= n*m).and.(au3(row) /= 0.)) then 
         if( (col <= n*m) ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au3(row) 
         endif 

         col = row + 1 + m 
!         if( (col <= n*m).and.(au4(row) /= 0.)) then 
         if( (col <= n*m) ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au4(row) 
         endif 

      end do

      ! Take care of last pntr
      pntr(n_solver+1) = lcount + 1 
      non_zeros=lcount

error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
                                       n_solver, indx, non_zeros )

error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )

error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )

nRhs=1 !number of right hand sides
error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )

 

x_rt is correct under Intel 11/MKL 10.2.2 but wrong with intel 15/MKL 11.4 update 2

 

 

 

0 Kudos
15 Replies
Dave_K_
Beginner
1,710 Views

Update:  It also returns a correct solution for Intel 13.1.117.  So the problem is occurring somewhere between the Intel 13 and Intel 15 releases.

I also forgot to include the compilation and link options.  We do link in FFTW and which does have some MKL interfaces in it which may overlap the MKL library.

mpiifort -fpe0 -D_UNIX -r8 -O0 -I /theory/kellerd/shu_test/Draco_i2_5_iSNB_XBT_FPEOS_Blizzard/finclude -fpp -convert big_endian -I/opt/LLE/local/fftw-3.3.1/include -I/opt/LLE/local/fftw-3.3.1/api -I/opt/LLE/local/hdf5-1.8.2/include -I/opt/LLE/local/hdf5-1.8.2/lib -c 

mpiifort -o draco_hp15 hp15/*.o -fpe0 -D_UNIX  -Wl,--start-group -L/opt/LLE/local/fftw-3.3.1/lib -lfftw3  -liomp5 -lpthread -lz -lmpiif -lmpi -lmkl_core -lmkl_sequential -lmkl_intel_thread -lmkl_intel_lp64 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5_fortran -Wl,--end-group

0 Kudos
mecej4
Honored Contributor III
1,710 Views

The DSS example code provided with MKL works correctly (checked with Matlab) with IFort 15.0.2 and MKL 11.2. I suggest that you try it on your system, and if it runs correctly, provide a reproducer based on your code that enables the problem to be seen.

0 Kudos
Gennady_F_Intel
Moderator
1,710 Views

yes, please check the problem with the latest version of MKL 11.2 update 2 and then in the case if you will see the problem is still exists, give us the example of this code which we can compile and debug this problem. You can take the eval version of MKL for this evaluation.

 

0 Kudos
Dave_K_
Beginner
1,710 Views

I developed a small test that demonstrates the error.  It creates the correct solution vector under Intel 13 and the incorrect solution under Intel 15 with MKL 11.2 update 2.  The input file is much smaller than our number array size but still is  623862720 bytes and attached.

>cat run
#!/bin/sh
# can be more optimized on newer architectures
mpiifort -fpe0 -r8 -O3 -xSSE4.2 -fpp -I${MKL_DIR}/include -L -L$MKL_LIB -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread MKL_DSS_radiation_DEBUGGING.f90
ldd a.out
./a.out

>cat out.13

        linux-vdso.so.1 =>  (0x00007fff5accc000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
        libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b60b33f4000)
        libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b60b3b05000)
        libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002b60b4a99000)
        libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002b60b5ca8000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
        libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002b60b5faa000)
        libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002b60b65d0000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
 start solving the system
 TEST MKL DSS; Execution time=  0.2031250    
   50156093319.1081       -89969204211.7639        89514538131.7439     <<<<<<<< correct numbers

>cat out.15
        linux-vdso.so.1 =>  (0x00007fffbf1ff000)
        libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b2d13f89000)
        libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b2d1489c000)
        libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b2d15cbd000)
        libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b2d1781c000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
        libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b2d17b6d000)
        libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b2d17df7000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
 start solving the system
 TEST MKL DSS; Execution time=  0.1875000    
   73375.2577664697       -359933.504809541        280913.663845399    <<<<<< incorrect numbers

>cat MKL_DSS_radiation_DEBUGGING.f90

INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file."
PROGRAM MKL_DSS_radiation_DEBUGGING
use mkl_dss
use ifport
      implicit none
!-----------------------------------------------
!   L o c a l   V a r i a b l e s
!-----------------------------------------------
      integer, parameter :: DP = KIND(1.0D0)
      integer, parameter        :: MAX_NZ_ROW = 9
      logical, save:: initialized=.FALSE.
      integer :: m_solver,n_solver
      integer :: n, m, row, col, lcount, non_zeros, error, perm(1),nRhs
      type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle.

      integer, allocatable, dimension(:), save:: indx, pntr
      real(KIND=DP),   allocatable, dimension(:), save:: vall,x_rt,y_rt
      real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1
      !real(8) :: t0, t1, MPI_WTIME
      real(4) :: t0, t1, ta(2)
      character*(10) :: suffix
      integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads
      integer :: ierr
      real(8) :: x0
      integer(4) :: ic0,ic1,crate
!-----------------------------------------------

!      call MPI_INIT(ierr)
   !DSS implementation
   !  val   -  values
   !  pntr  -  rowIndex
   !  indx  -  colums
   !  ncols =  n_solver
   !  nrows =  ncols

      m_solver=60   !400  !400      !120 (for test)   !=je-js+1
      n_solver=2280 !200  !2000     !498 (for test)   !=ie-is+1
      n_solver=n_solver*m_solver
 
      m = m_solver
      n = n_solver/m_solver

      if(.NOT. initialized)then
         if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1))
         if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver))
         if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver))
      endif
      indx(:)=0
      pntr(:)=0
      vall(:)=0.0d0
      x_rt(:)=0.0d0
      y_rt(:)=0.0d0

open(555,FILE="input_ad_au_y_rt")

      lcount = 0
      do row=1,n_solver
 
         pntr(row) = lcount +1 
          
read(555,*) ad,au1,au2,au3,au4,y_rt(row)
! write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row)

         col = row 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = ad 

         col = row + 1 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au1
         endif 

         col = row - 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au2
         endif 

         col = row + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au3 
         endif 

         col = row + 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au4 
         endif

 


      end do
!
!     Take care of last pntr
!
      pntr(n_solver+1) = lcount + 1 
      non_zeros=lcount
      

      if (.NOT. initialized) then

         ! Initialize the solver.
         error = DSS_CREATE( handle, MKL_DSS_DEFAULTS )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Define the non-zero structure of the matrix.
         error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
         & n_solver, indx, non_zeros )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Reorder the matrix.
         error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )
         if (error /= MKL_DSS_SUCCESS) goto 999

         initialized = .TRUE.
      endif
      
print*,"start solving the system"
call SYSTEM_CLOCK(ic0,crate)
t0 = dble(ic0)/dble(crate)
      !
      ! Factor the matrix.
      !
      error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )
      if (error /= MKL_DSS_SUCCESS) goto 999

      !
      ! Solve for a vector x_rt, given an input vector y_rt as b
      !
      nRhs=1 !number of right hand sides
      error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )
      if (error /= MKL_DSS_SUCCESS) goto 999

call SYSTEM_CLOCK(ic1,crate)
t1 = dble(ic1)/dble(crate)
print*,"TEST MKL DSS; Execution time=",t1-t0
print*,sum(x_rt),minval(x_rt),maxval(x_rt)

open(566,FILE="x_rt_based_on_input_"//TRIM(suffix))
do row=1,n_solver
write(566,*) x_rt(row)
enddo
stop

      999 WRITE(*,*) "Solver returned error code ", error
END PROGRAM MKL_DSS_radiation_DEBUGGING

 

0 Kudos
Dave_K_
Beginner
1,712 Views

Noticed my typo after I posted the example.  My note said "the input file is smaller than out number array size".  I meant to say it is smaller than our typical array sizes.  The attached input DOES demonstrate the problem

0 Kudos
mecej4
Honored Contributor III
1,712 Views

DSS lets you compute some statistics, among them the determinant. For your "small" matrix, I obtained this:

Determinant =  2.407 X 10**( -1057550)

Applying Pardiso to the same matrix, I obtained:

non-zeros in A:           683819
number of nonzeros in L  5914460
number of nonzeros in U        1

Something is seriously wrong if there is only one nonzero element in the U factor (I need to check if I have made an error in obtaining these results). In these circumstances, your summary digest of the solution (min, max and sum of elements of X) is not useful, probably not reproducible across platforms, and may even change with slightly different compiler options. There may be similar problems with your "large" matrices. Before accepting a solution, you should probably obtain the residuals after solving the equations. For the "small" matrix, I obtained residuals as large as 0.001.

Note that the data file containing the "small" matrix is large (> 600 MB) in formatted text form. By placing the data into an unformatted file and cutting out the portions that are not used in the calculation, you could reduce it to about 10 MB.

0 Kudos
Dave_K_
Beginner
1,712 Views

Thanks for looking at this for us.

I admit to really not knowing much about solvers but the issue is that the solution to an input set returns a solution that my physcis folks say is 'right' on Intel 13 is 'wrong' on Intel 15.  It would seem that if the sum, min and max of the solution differ significantly from an Intel 13 run to an Intel 15 run something significant has changed and that our 'right' answers have significantly changed with Intel 15.  It is not clear to me from your note if you think there is something 'seriously wrong' with the input or coding or from the result?

I modified the source to include additional statistics:

print*,sum(x_rt),minval(x_rt),maxval(x_rt)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Determinant=",statistic(2)," X 10 ",statistic(1)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"pos =",statistic(1)
print*,"neg =",statistic(2)
print*,"zer =",statistic(3)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Solvemem=",statistic(1)

Intel 13:

    50204591221.3627       -89969204206.9475        89514538126.9583     
 Determinant=   1.75565552825196       X 10   -1336812.00000000     
 pos =   67894.0000000000     
 neg =   68906.0000000000     
 zer =  0.000000000000000E+000
 Solvemem=   69469.0000000000     

 

Intel 15

   73375.2577673518       -359933.504809540        280913.663845398     
 Determinant=  -2.32535455034809       X 10   -1057541.00000000     
 pos =   65547.0000000000     
 neg =   71253.0000000000     
 zer =  0.000000000000000E+000
 Solvemem=   69867.0000000000  

0 Kudos
mecej4
Honored Contributor III
1,712 Views

You can further the testing if you can put together a smaller example, with as few unknowns as possible, but with the same structure as the matrices that you have shown, and approved by the "physics folks" as correct with IFort 13/MKL 10.2.2.  The problem with large size matrix problems is that you are restricted to fewer checks on correctness. On the other hand, when you whittle a problem down you may be left with an absurd remnant.

I shall be happy to share my verification code with you once I have done some checks.

Note this from #8:

Determinant=   1.76  X 10   -1336812

It is possible for a matrix to be well-conditioned and yet have a determinant nearly equal to 0. However, if the problem comes from a physics model and the variables are reasonably scaled this should not happen. Ask the physics folks if they are comfortable with this attoscopic value for the determinant. If this value is wrong because of a coding error, we can track that down and fix it. However, if the problem as formulated is nearly impossible to obtain an accurate solution for, the physics people may have a much tougher problem.

A more basic question: is it expected/reasonable that the matrix is singular? If not, there can be no zeroes on the principal diagonals of the L and U factors. Therefore, the minimum number of non-zeroes in L and U is equal to the number of equations.

0 Kudos
Dave_K_
Beginner
1,712 Views

The physics guys are not around today so I tried to do what I can to further help find this problem.  I cut the grid size down to a 14x30 grid so

m =8 n=112.  It still demostrates the problem.  The solution grid when plotted follows.  Note they show that the garray1 field on the left axis is "wrong" for 15, "right" for 13.  I also attached attached the input file (still in text format) for this smaller grid size. 451535.

 

 cat out.15_30
        linux-vdso.so.1 =>  (0x00007fff86871000)
        libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b41f3e18000)
        libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b41f472b000)
        libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b41f5b4c000)
        libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b41f76ab000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
        libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b41f79fc000)
        libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b41f7c86000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
 start solving the system
 TEST MKL DSS; Execution time=  7.8125000E-02
   295618.772828955       -312701.448947056        220058.681258707     
 Determinant=  -1.85272122511410       X 10   -98200.0000000000     
 pos =   6727.00000000000     
 neg =   5873.00000000000     
 zer =  0.000000000000000E+000
 Solvemem=   5647.00000000000  

  cat out.13_30
        linux-vdso.so.1 =>  (0x00007fff7e4b4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
        libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002ac989820000)
        libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002ac989f31000)
        libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002ac98aec5000)
        libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002ac98c0d4000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
        libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002ac98c3d6000)
        libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002ac98c9fc000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
 start solving the system
 TEST MKL DSS; Execution time=  1.5625000E-02
  -19306486435.1262       -5772347418.44442        3505444963.56421     
 Determinant=  -3.76569356065273       X 10   -117523.000000000     
 pos =   6477.00000000000     
 neg =   6123.00000000000     
 zer =  0.000000000000000E+000
 Solvemem=   5603.00000000000     

 

 

 

 

i15.pngi13.png

 

0 Kudos
mecej4
Honored Contributor III
1,712 Views

Do you have a writeup with the P.D.E.,boundary and initial conditions specified?

0 Kudos
Dave_K_
Beginner
1,712 Views

I just recognized today that I had sent in incorrect data input file.  I apologize for any wasted effort. The net is still the same however.  Intel 15 returns wrong answers.  I added a double check after the solve.  The solution returned when used to recompute the y_rt supplied has 9 errors:

Intel 13 output:

 start solving the system
 TEST MKL DSS; Execution time=  0.1109009    
  0.257042531294272       9.162025370382090E-014  4.008343651777284E-003
 Determinant=   2.12161563270491       X 10   -2990.00000000000     
 pos =   420.000000000000     
 neg =  0.000000000000000E+000
 zer =  0.000000000000000E+000
 Solvemem=   159.000000000000     

Intel 15 output:

 start solving the system
 TEST MKL DSS; Execution time=  7.7514648E-02
  0.251832820371750       9.162025183722943E-014  4.008343651777284E-003
 Determinant=   1.62949566754441       X 10   -2987.00000000000     
 pos =   420.000000000000     
 neg =  0.000000000000000E+000
 zer =  0.000000000000000E+000
 Solvemem=   262.000000000000     
 not equal         121  2.405238875359285E-019  2.065808031766965E-019                         <<<<<< double check should at least be close
 not equal         151  2.405238875359036E-019  2.068078684773001E-019
 not equal         181  2.405238875359422E-019  2.038278901649021E-019
 not equal         211  2.405238875359251E-019  1.844924519720349E-019
 not equal         241  2.405238875359413E-019 -6.431842696536678E-020
 not equal         331  3.299385792934408E-020 -7.117734061904493E-014
 not equal         332  9.889113991683395E-020 -6.104485185954297E-014
 not equal         361  3.303422135645092E-020 -1.172086994261061E-013
 not equal         391  8.509130524910173E-013  6.084941155961990E-013

A slightly modified source file and a new input file are provided.  They clearly demonstrate a problem and should be easily reproduced

INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file."
PROGRAM MKL_DSS_radiation_DEBUGGING
use mkl_dss
use ifport
      implicit none
!-----------------------------------------------
!   L o c a l   V a r i a b l e s
!-----------------------------------------------
      integer, parameter :: DP = 8 !KIND(1.0D0)
      integer, parameter        :: MAX_NZ_ROW = 9
      logical, save:: initialized=.FALSE.
      integer :: m_solver,n_solver
      integer :: n, m, row, col, lcount, non_zeros, error, nRhs
      type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle.

      integer, allocatable, dimension(:), save:: indx, pntr, perm
      real(KIND=DP),   allocatable, dimension(:), save:: vall,x_rt,y_rt,y_rt_check
      real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1
      !real(8) :: t0, t1, MPI_WTIME
      real(4) :: t0, t1, ta(2)
      character*(10) :: suffix
      integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads
      integer :: ierr,statreq(10)
      real(8) :: x0, statistic(10)
      integer(4) :: ic0,ic1,crate
  
!-----------------------------------------------

!      call MPI_INIT(ierr)
   !DSS implementation
   !  val   -  values
   !  pntr  -  rowIndex
   !  indx  -  colums
   !  ncols =  n_solver
   !  nrows =  ncols

      m= 30  !60   !400  !400      !120 (for test)   !=je-js+1
      n= 14  !2280 !200  !2000     !498 (for test)   !=ie-is+1

      n_solver = n * m

      if(.NOT. initialized)then
         if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1))
         if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver))
         if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver))
         if( .NOT. allocated(y_rt_check) ) allocate(y_rt_check(n_solver))
         if( .NOT. allocated(perm)) allocate(perm(n_solver))
      endif
      indx(:)=0
      pntr(:)=0
      vall(:)=0.0d0
      x_rt(:)=0.0d0
      y_rt(:)=0.0d0

open(555,FILE="input_30x14")

      lcount = 0
      do row=1,n_solver
 
         pntr(row) = lcount +1 
          
read(555,*) ad,au1,au2,au3,au4,y_rt(row)
!+ashv write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row)

         col = row 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = ad 

         col = row + 1 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au1
         endif 

         col = row - 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au2
         endif 

         col = row + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au3 
         endif 

         col = row + 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au4 
         endif 

      end do
!
!     Take care of last pntr
!
      pntr(n_solver+1) = lcount + 1 
      non_zeros=lcount
      

      if (.NOT. initialized) then

         ! Initialize the solver.
         error = DSS_CREATE( handle, MKL_DSS_DEFAULTS )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Define the non-zero structure of the matrix.
         error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
         & n_solver, indx, non_zeros )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Reorder the matrix.
       !  error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )
         error = DSS_REORDER( handle, MKL_DSS_AUTO_ORDER, perm )
         if (error /= MKL_DSS_SUCCESS) goto 999

         initialized = .TRUE.
      endif
      
print*,"start solving the system"
call SYSTEM_CLOCK(ic0,crate)
t0 = dble(ic0)/dble(crate)
      !
      ! Factor the matrix.
      !
      error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )
      if (error /= MKL_DSS_SUCCESS) goto 999

      !
      ! Solve for a vector x_rt, given an input vector y_rt as b
      !
      nRhs=1 !number of right hand sides
      error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )
      if (error /= MKL_DSS_SUCCESS) goto 999

call SYSTEM_CLOCK(ic1,crate)
t1 = dble(ic1)/dble(crate)
print*,"TEST MKL DSS; Execution time=",t1-t0
print*,sum(x_rt),minval(x_rt),maxval(x_rt)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Determinant=",statistic(2)," X 10 ",statistic(1)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"pos =",statistic(1)
print*,"neg =",statistic(2)
print*,"zer =",statistic(3)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Solvemem=",statistic(1)

      !then calculate the corresponding right-hand-side y_rt by multipying x_rt by the matrix storred in vall() array
      y_rt_check(:)=0.0d0
      do row=1,n_solver
          do lcount=pntr(row),pntr(row+1)-1
            col = indx(lcount)
            y_rt_check(row) = y_rt_check(row) + vall(lcount)*x_rt(col)
            if(col/=row) y_rt_check(col) = y_rt_check(col) + vall(lcount)*x_rt(row)
         enddo
      enddo




open(566,FILE="x_rt_based_on_input_"//TRIM(suffix))
do row=1,n_solver
write(566,*) x_rt(row)
if (abs(y_rt(row)-y_rt_check(row)) > abs(1.0e-6*y_rt(row))) print*,'not equal',row,y_rt(row), y_rt_check(row)
enddo
stop

      999 WRITE(*,*) "Solver returned error code ", error
END PROGRAM MKL_DSS_radiation_DEBUGGING

.

451632

 

 

 

0 Kudos
mecej4
Honored Contributor III
1,712 Views

I do not have the specific older version, but I think that you will reproduce the old results with the new compiler if, on line 133 of #5, you replace  MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS which, according to the current documentation, sets MKL_DLL_POSITIVE_DEFINITE.

Of course, you have to check whether this is a correct setting. You can also check your documentation for the old and new versions to see if the default value for the call to DSS_FACTOR changed with the version.

This may also indicate that, as you suspected, there is an error in the newer version of MKL. This is likely since the DSS calls are mapped into equivalent PARDISO calls, and an error in doing so may have been introduced recently. You may wish to consider rewriting the solver part of your code to make direct PARDISO calls instead of going through the DSS interface.

 

0 Kudos
Dave_K_
Beginner
1,712 Views

Looks like an error has been introduced.

Replacing MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS or MKL_DLL_POSITIVE_DEFINITE with Intel 15 results in:

 

start solving the system
 TEST MKL DSS; Execution time=  6.2500000E-02
  0.257042531294272       9.162025370385757E-014  4.008343651777285E-003
 Determinant=   2.12161563270509       X 10   -2990.00000000000     
MKL-DSS-UTIL-Fatal, Bad facility code
 pos =  -2990.00000000000     
 neg =   2.12161563270509     
 zer =  0.000000000000000E+000
 Solvemem=   262.000000000000     

 

0 Kudos
Dave_K_
Beginner
1,712 Views

Would you please share the equivalent PARDISO version of this code that works?

0 Kudos
mecej4
Honored Contributor III
1,712 Views

The zip file is attached. It contains the input file (from you, but included for consistency), a utility to read the input file and write an unformatted Fortran file in a format that I use when testing problems with MKL/Pardiso, and the program to read that unformated file and solve the problem with Pardiso. 

The utility is self-contained, and does not need to be linked with MKL. The solver program should be linked with MKL, and if you have not previously compiled the file mkl_pardiso.f90 that is provided with MKL, you should do so first to produce the module file that is needed by the solver program.

I have tested only on Windows 8.1 X64, but there should be no changes needed for Linux. Please let me know if you run into problems. The last line of the output should be 

       0.257042531294272       9.162025370381688E-014  4.008343651777284E-003

 

0 Kudos
Reply