Thanks for looking at this

Dave_K_ · ‎03-04-2015

We have been using the following code compiled with Intell Fortran on Linux clusters linked with MKL 10.2.2.025 for a number of years. We are trying to update to Intel 15 compilers and MKL 11.2 update 2. The symmetric solve produces wrong answers using the new compiler and libraries. The code has not changed. When we include the symmetric terms and solve it as a symmetric_structure matrix we get the right answers.

Is any one else having problems with symmetric solves with the current compiler and library?

Load the data into indx, vall, and y_rt ...

lcount = 0
do row=1,n_solver

pntr(row) = lcount +1

col = row
lcount = lcount+1
indx(lcount) = col
vall(lcount) = ad(row)

col = row + 1
! if( (col <= n*m).and.(au1(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au1(row)
endif

col = row - 1 + m
! if( (col <= n*m).and.(au2(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au2(row)
endif

col = row + m
! if( (col <= n*m).and.(au3(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au3(row)
endif

col = row + 1 + m
! if( (col <= n*m).and.(au4(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au4(row)
endif

end do

! Take care of last pntr
pntr(n_solver+1) = lcount + 1
non_zeros=lcount

error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
n_solver, indx, non_zeros )

error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )

error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )

nRhs=1 !number of right hand sides
error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )

x_rt is correct under Intel 11/MKL 10.2.2 but wrong with intel 15/MKL 11.4 update 2

Dave_K_ · ‎03-04-2015

Update: It also returns a correct solution for Intel 13.1.117. So the problem is occurring somewhere between the Intel 13 and Intel 15 releases.

I also forgot to include the compilation and link options. We do link in FFTW and which does have some MKL interfaces in it which may overlap the MKL library.

mpiifort -fpe0 -D_UNIX -r8 -O0 -I /theory/kellerd/shu_test/Draco_i2_5_iSNB_XBT_FPEOS_Blizzard/finclude -fpp -convert big_endian -I/opt/LLE/local/fftw-3.3.1/include -I/opt/LLE/local/fftw-3.3.1/api -I/opt/LLE/local/hdf5-1.8.2/include -I/opt/LLE/local/hdf5-1.8.2/lib -c

mpiifort -o draco_hp15 hp15/*.o -fpe0 -D_UNIX -Wl,--start-group -L/opt/LLE/local/fftw-3.3.1/lib -lfftw3 -liomp5 -lpthread -lz -lmpiif -lmpi -lmkl_core -lmkl_sequential -lmkl_intel_thread -lmkl_intel_lp64 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5_fortran -Wl,--end-group

mecej4 · ‎03-04-2015

The DSS example code provided with MKL works correctly (checked with Matlab) with IFort 15.0.2 and MKL 11.2. I suggest that you try it on your system, and if it runs correctly, provide a reproducer based on your code that enables the problem to be seen.

Gennady_F_Intel · ‎03-04-2015

yes, please check the problem with the latest version of MKL 11.2 update 2 and then in the case if you will see the problem is still exists, give us the example of this code which we can compile and debug this problem. You can take the eval version of MKL for this evaluation.

Dave_K_ · ‎03-05-2015

I developed a small test that demonstrates the error. It creates the correct solution vector under Intel 13 and the incorrect solution under Intel 15 with MKL 11.2 update 2. The input file is much smaller than our number array size but still is 623862720 bytes and attached.

>cat run
#!/bin/sh
# can be more optimized on newer architectures
mpiifort -fpe0 -r8 -O3 -xSSE4.2 -fpp -I${MKL_DIR}/include -L -L$MKL_LIB -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread MKL_DSS_radiation_DEBUGGING.f90
ldd a.out
./a.out

>cat out.13

linux-vdso.so.1 => (0x00007fff5accc000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b60b33f4000)
libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b60b3b05000)
libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002b60b4a99000)
libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002b60b5ca8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002b60b5faa000)
libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002b60b65d0000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 0.2031250
50156093319.1081 -89969204211.7639 89514538131.7439 <<<<<<<< correct numbers

>cat out.15
linux-vdso.so.1 => (0x00007fffbf1ff000)
libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b2d13f89000)
libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b2d1489c000)
libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b2d15cbd000)
libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b2d1781c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b2d17b6d000)
libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b2d17df7000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 0.1875000
73375.2577664697 -359933.504809541 280913.663845399 <<<<<< incorrect numbers

>cat MKL_DSS_radiation_DEBUGGING.f90

INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file."
PROGRAM MKL_DSS_radiation_DEBUGGING
use mkl_dss
use ifport
      implicit none
!-----------------------------------------------
!   L o c a l   V a r i a b l e s
!-----------------------------------------------
      integer, parameter :: DP = KIND(1.0D0)
      integer, parameter        :: MAX_NZ_ROW = 9
      logical, save:: initialized=.FALSE.
      integer :: m_solver,n_solver
      integer :: n, m, row, col, lcount, non_zeros, error, perm(1),nRhs
      type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle.

      integer, allocatable, dimension(:), save:: indx, pntr
      real(KIND=DP),   allocatable, dimension(:), save:: vall,x_rt,y_rt
      real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1
      !real(8) :: t0, t1, MPI_WTIME
      real(4) :: t0, t1, ta(2)
      character*(10) :: suffix
      integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads
      integer :: ierr
      real(8) :: x0
      integer(4) :: ic0,ic1,crate
!-----------------------------------------------

!      call MPI_INIT(ierr)
   !DSS implementation
   !  val   -  values
   !  pntr  -  rowIndex
   !  indx  -  colums
   !  ncols =  n_solver
   !  nrows =  ncols

      m_solver=60   !400  !400      !120 (for test)   !=je-js+1
      n_solver=2280 !200  !2000     !498 (for test)   !=ie-is+1
      n_solver=n_solver*m_solver
 
      m = m_solver
      n = n_solver/m_solver

      if(.NOT. initialized)then
         if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1))
         if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver))
         if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver))
      endif
      indx(:)=0
      pntr(:)=0
      vall(:)=0.0d0
      x_rt(:)=0.0d0
      y_rt(:)=0.0d0

open(555,FILE="input_ad_au_y_rt")

      lcount = 0
      do row=1,n_solver
 
         pntr(row) = lcount +1 
          
read(555,*) ad,au1,au2,au3,au4,y_rt(row)
! write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row)

         col = row 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = ad 

         col = row + 1 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au1
         endif 

         col = row - 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au2
         endif 

         col = row + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au3 
         endif 

         col = row + 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au4 
         endif

 


      end do
!
!     Take care of last pntr
!
      pntr(n_solver+1) = lcount + 1 
      non_zeros=lcount
      

      if (.NOT. initialized) then

         ! Initialize the solver.
         error = DSS_CREATE( handle, MKL_DSS_DEFAULTS )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Define the non-zero structure of the matrix.
         error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
         & n_solver, indx, non_zeros )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Reorder the matrix.
         error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )
         if (error /= MKL_DSS_SUCCESS) goto 999

         initialized = .TRUE.
      endif
      
print*,"start solving the system"
call SYSTEM_CLOCK(ic0,crate)
t0 = dble(ic0)/dble(crate)
      !
      ! Factor the matrix.
      !
      error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )
      if (error /= MKL_DSS_SUCCESS) goto 999

      !
      ! Solve for a vector x_rt, given an input vector y_rt as b
      !
      nRhs=1 !number of right hand sides
      error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )
      if (error /= MKL_DSS_SUCCESS) goto 999

call SYSTEM_CLOCK(ic1,crate)
t1 = dble(ic1)/dble(crate)
print*,"TEST MKL DSS; Execution time=",t1-t0
print*,sum(x_rt),minval(x_rt),maxval(x_rt)

open(566,FILE="x_rt_based_on_input_"//TRIM(suffix))
do row=1,n_solver
write(566,*) x_rt(row)
enddo
stop

      999 WRITE(*,*) "Solver returned error code ", error
END PROGRAM MKL_DSS_radiation_DEBUGGING

Dave_K_ · ‎03-05-2015

Noticed my typo after I posted the example. My note said "the input file is smaller than out number array size". I meant to say it is smaller than our typical array sizes. The attached input DOES demonstrate the problem

mecej4 · ‎03-06-2015

DSS lets you compute some statistics, among them the determinant. For your "small" matrix, I obtained this:

Determinant =  2.407 X 10**( -1057550)

Applying Pardiso to the same matrix, I obtained:

non-zeros in A:           683819
number of nonzeros in L  5914460
number of nonzeros in U        1

Something is seriously wrong if there is only one nonzero element in the U factor (I need to check if I have made an error in obtaining these results). In these circumstances, your summary digest of the solution (min, max and sum of elements of X) is not useful, probably not reproducible across platforms, and may even change with slightly different compiler options. There may be similar problems with your "large" matrices. Before accepting a solution, you should probably obtain the residuals after solving the equations. For the "small" matrix, I obtained residuals as large as 0.001.

Note that the data file containing the "small" matrix is large (> 600 MB) in formatted text form. By placing the data into an unformatted file and cutting out the portions that are not used in the calculation, you could reduce it to about 10 MB.

Dave_K_ · ‎03-06-2015

Thanks for looking at this for us.

I admit to really not knowing much about solvers but the issue is that the solution to an input set returns a solution that my physcis folks say is 'right' on Intel 13 is 'wrong' on Intel 15. It would seem that if the sum, min and max of the solution differ significantly from an Intel 13 run to an Intel 15 run something significant has changed and that our 'right' answers have significantly changed with Intel 15. It is not clear to me from your note if you think there is something 'seriously wrong' with the input or coding or from the result?

I modified the source to include additional statistics:

print*,sum(x_rt),minval(x_rt),maxval(x_rt)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Determinant=",statistic(2)," X 10 ",statistic(1)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"pos =",statistic(1)
print*,"neg =",statistic(2)
print*,"zer =",statistic(3)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Solvemem=",statistic(1)

Intel 13:

   50204591221.3627 -89969204206.9475 89514538126.9583
Determinant= 1.75565552825196 X 10 -1336812.00000000
pos = 67894.0000000000
neg = 68906.0000000000
zer = 0.000000000000000E+000
Solvemem= 69469.0000000000

Intel 15

73375.2577673518 -359933.504809540 280913.663845398
Determinant= -2.32535455034809 X 10 -1057541.00000000
pos = 65547.0000000000
neg = 71253.0000000000
zer = 0.000000000000000E+000
Solvemem= 69867.0000000000

mecej4 · ‎03-06-2015

You can further the testing if you can put together a smaller example, with as few unknowns as possible, but with the same structure as the matrices that you have shown, and approved by the "physics folks" as correct with IFort 13/MKL 10.2.2. The problem with large size matrix problems is that you are restricted to fewer checks on correctness. On the other hand, when you whittle a problem down you may be left with an absurd remnant.

I shall be happy to share my verification code with you once I have done some checks.

Note this from #8:

Determinant= 1.76 X 10 ^-1336812

It is possible for a matrix to be well-conditioned and yet have a determinant nearly equal to 0. However, if the problem comes from a physics model and the variables are reasonably scaled this should not happen. Ask the physics folks if they are comfortable with this attoscopic value for the determinant. If this value is wrong because of a coding error, we can track that down and fix it. However, if the problem as formulated is nearly impossible to obtain an accurate solution for, the physics people may have a much tougher problem.

A more basic question: is it expected/reasonable that the matrix is singular? If not, there can be no zeroes on the principal diagonals of the L and U factors. Therefore, the minimum number of non-zeroes in L and U is equal to the number of equations.

Dave_K_ · ‎03-06-2015

The physics guys are not around today so I tried to do what I can to further help find this problem. I cut the grid size down to a 14x30 grid so

m =8 n=112. It still demostrates the problem. The solution grid when plotted follows. Note they show that the garray1 field on the left axis is "wrong" for 15, "right" for 13. I also attached attached the input file (still in text format) for this smaller grid size. 451535.

cat out.15_30
linux-vdso.so.1 => (0x00007fff86871000)
libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b41f3e18000)
libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b41f472b000)
libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b41f5b4c000)
libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b41f76ab000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b41f79fc000)
libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b41f7c86000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 7.8125000E-02
295618.772828955 -312701.448947056 220058.681258707
Determinant= -1.85272122511410 X 10 -98200.0000000000
pos = 6727.00000000000
neg = 5873.00000000000
zer = 0.000000000000000E+000
Solvemem= 5647.00000000000

  cat out.13_30
linux-vdso.so.1 => (0x00007fff7e4b4000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002ac989820000)
libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002ac989f31000)
libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002ac98aec5000)
libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002ac98c0d4000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002ac98c3d6000)
libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002ac98c9fc000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 1.5625000E-02
-19306486435.1262 -5772347418.44442 3505444963.56421
Determinant= -3.76569356065273 X 10 -117523.000000000
pos = 6477.00000000000
neg = 6123.00000000000
zer = 0.000000000000000E+000
Solvemem= 5603.00000000000

mecej4 · ‎03-06-2015

Do you have a writeup with the P.D.E.,boundary and initial conditions specified?

Dave_K_ · ‎03-09-2015

I just recognized today that I had sent in incorrect data input file. I apologize for any wasted effort. The net is still the same however. Intel 15 returns wrong answers. I added a double check after the solve. The solution returned when used to recompute the y_rt supplied has 9 errors:

Intel 13 output:

start solving the system
TEST MKL DSS; Execution time= 0.1109009
0.257042531294272 9.162025370382090E-014 4.008343651777284E-003
Determinant= 2.12161563270491 X 10 -2990.00000000000
pos = 420.000000000000
neg = 0.000000000000000E+000
zer = 0.000000000000000E+000
Solvemem= 159.000000000000

Intel 15 output:

start solving the system
TEST MKL DSS; Execution time= 7.7514648E-02
0.251832820371750 9.162025183722943E-014 4.008343651777284E-003
Determinant= 1.62949566754441 X 10 -2987.00000000000
pos = 420.000000000000
neg = 0.000000000000000E+000
zer = 0.000000000000000E+000
Solvemem= 262.000000000000
not equal 121 2.405238875359285E-019 2.065808031766965E-019 <<<<<< double check should at least be close
not equal 151 2.405238875359036E-019 2.068078684773001E-019
not equal 181 2.405238875359422E-019 2.038278901649021E-019
not equal 211 2.405238875359251E-019 1.844924519720349E-019
not equal 241 2.405238875359413E-019 -6.431842696536678E-020
not equal 331 3.299385792934408E-020 -7.117734061904493E-014
not equal 332 9.889113991683395E-020 -6.104485185954297E-014
not equal 361 3.303422135645092E-020 -1.172086994261061E-013
not equal 391 8.509130524910173E-013 6.084941155961990E-013

A slightly modified source file and a new input file are provided. They clearly demonstrate a problem and should be easily reproduced

INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file."
PROGRAM MKL_DSS_radiation_DEBUGGING
use mkl_dss
use ifport
      implicit none
!-----------------------------------------------
!   L o c a l   V a r i a b l e s
!-----------------------------------------------
      integer, parameter :: DP = 8 !KIND(1.0D0)
      integer, parameter        :: MAX_NZ_ROW = 9
      logical, save:: initialized=.FALSE.
      integer :: m_solver,n_solver
      integer :: n, m, row, col, lcount, non_zeros, error, nRhs
      type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle.

      integer, allocatable, dimension(:), save:: indx, pntr, perm
      real(KIND=DP),   allocatable, dimension(:), save:: vall,x_rt,y_rt,y_rt_check
      real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1
      !real(8) :: t0, t1, MPI_WTIME
      real(4) :: t0, t1, ta(2)
      character*(10) :: suffix
      integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads
      integer :: ierr,statreq(10)
      real(8) :: x0, statistic(10)
      integer(4) :: ic0,ic1,crate
  
!-----------------------------------------------

!      call MPI_INIT(ierr)
   !DSS implementation
   !  val   -  values
   !  pntr  -  rowIndex
   !  indx  -  colums
   !  ncols =  n_solver
   !  nrows =  ncols

      m= 30  !60   !400  !400      !120 (for test)   !=je-js+1
      n= 14  !2280 !200  !2000     !498 (for test)   !=ie-is+1

      n_solver = n * m

      if(.NOT. initialized)then
         if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1))
         if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver))
         if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver))
         if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver))
         if( .NOT. allocated(y_rt_check) ) allocate(y_rt_check(n_solver))
         if( .NOT. allocated(perm)) allocate(perm(n_solver))
      endif
      indx(:)=0
      pntr(:)=0
      vall(:)=0.0d0
      x_rt(:)=0.0d0
      y_rt(:)=0.0d0

open(555,FILE="input_30x14")

      lcount = 0
      do row=1,n_solver
 
         pntr(row) = lcount +1 
          
read(555,*) ad,au1,au2,au3,au4,y_rt(row)
!+ashv write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row)

         col = row 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = ad 

         col = row + 1 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au1
         endif 

         col = row - 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au2
         endif 

         col = row + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au3 
         endif 

         col = row + 1 + m 
         if( col <= n*m ) then 
            lcount = lcount+1 
            indx(lcount) = col
            vall(lcount) = au4 
         endif 

      end do
!
!     Take care of last pntr
!
      pntr(n_solver+1) = lcount + 1 
      non_zeros=lcount
      

      if (.NOT. initialized) then

         ! Initialize the solver.
         error = DSS_CREATE( handle, MKL_DSS_DEFAULTS )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Define the non-zero structure of the matrix.
         error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
         & n_solver, indx, non_zeros )
         if (error /= MKL_DSS_SUCCESS) goto 999

         ! Reorder the matrix.
       !  error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )
         error = DSS_REORDER( handle, MKL_DSS_AUTO_ORDER, perm )
         if (error /= MKL_DSS_SUCCESS) goto 999

         initialized = .TRUE.
      endif
      
print*,"start solving the system"
call SYSTEM_CLOCK(ic0,crate)
t0 = dble(ic0)/dble(crate)
      !
      ! Factor the matrix.
      !
      error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )
      if (error /= MKL_DSS_SUCCESS) goto 999

      !
      ! Solve for a vector x_rt, given an input vector y_rt as b
      !
      nRhs=1 !number of right hand sides
      error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )
      if (error /= MKL_DSS_SUCCESS) goto 999

call SYSTEM_CLOCK(ic1,crate)
t1 = dble(ic1)/dble(crate)
print*,"TEST MKL DSS; Execution time=",t1-t0
print*,sum(x_rt),minval(x_rt),maxval(x_rt)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Determinant=",statistic(2)," X 10 ",statistic(1)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"pos =",statistic(1)
print*,"neg =",statistic(2)
print*,"zer =",statistic(3)

call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Solvemem=",statistic(1)

      !then calculate the corresponding right-hand-side y_rt by multipying x_rt by the matrix storred in vall() array
      y_rt_check(:)=0.0d0
      do row=1,n_solver
          do lcount=pntr(row),pntr(row+1)-1
            col = indx(lcount)
            y_rt_check(row) = y_rt_check(row) + vall(lcount)*x_rt(col)
            if(col/=row) y_rt_check(col) = y_rt_check(col) + vall(lcount)*x_rt(row)
         enddo
      enddo




open(566,FILE="x_rt_based_on_input_"//TRIM(suffix))
do row=1,n_solver
write(566,*) x_rt(row)
if (abs(y_rt(row)-y_rt_check(row)) > abs(1.0e-6*y_rt(row))) print*,'not equal',row,y_rt(row), y_rt_check(row)
enddo
stop

      999 WRITE(*,*) "Solver returned error code ", error
END PROGRAM MKL_DSS_radiation_DEBUGGING

.

451632

mecej4 · ‎03-09-2015

I do not have the specific older version, but I think that you will reproduce the old results with the new compiler if, on line 133 of #5, you replace MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS which, according to the current documentation, sets MKL_DLL_POSITIVE_DEFINITE.

Of course, you have to check whether this is a correct setting. You can also check your documentation for the old and new versions to see if the default value for the call to DSS_FACTOR changed with the version.

This may also indicate that, as you suspected, there is an error in the newer version of MKL. This is likely since the DSS calls are mapped into equivalent PARDISO calls, and an error in doing so may have been introduced recently. You may wish to consider rewriting the solver part of your code to make direct PARDISO calls instead of going through the DSS interface.

Dave_K_ · ‎03-10-2015

Looks like an error has been introduced.

Replacing MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS or MKL_DLL_POSITIVE_DEFINITE with Intel 15 results in:

start solving the system
TEST MKL DSS; Execution time= 6.2500000E-02
0.257042531294272 9.162025370385757E-014 4.008343651777285E-003
Determinant= 2.12161563270509 X 10 -2990.00000000000
MKL-DSS-UTIL-Fatal, Bad facility code
pos = -2990.00000000000
neg = 2.12161563270509
zer = 0.000000000000000E+000
Solvemem= 262.000000000000

Dave_K_ · ‎03-10-2015

Would you please share the equivalent PARDISO version of this code that works?

mecej4 · ‎03-10-2015

The zip file is attached. It contains the input file (from you, but included for consistency), a utility to read the input file and write an unformatted Fortran file in a format that I use when testing problems with MKL/Pardiso, and the program to read that unformated file and solve the problem with Pardiso.

The utility is self-contained, and does not need to be linked with MKL. The solver program should be linked with MKL, and if you have not previously compiled the file mkl_pardiso.f90 that is provided with MKL, you should do so first to produce the module file that is needed by the solver program.

I have tested only on Windows 8.1 X64, but there should be no changes needed for Linux. Please let me know if you run into problems. The last line of the output should be

0.257042531294272 9.162025370381688E-014 4.008343651777284E-003

DSS symmetric solve