- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
We have been using the following code compiled with Intell Fortran on Linux clusters linked with MKL 10.2.2.025 for a number of years. We are trying to update to Intel 15 compilers and MKL 11.2 update 2. The symmetric solve produces wrong answers using the new compiler and libraries. The code has not changed. When we include the symmetric terms and solve it as a symmetric_structure matrix we get the right answers.
Is any one else having problems with symmetric solves with the current compiler and library?
Load the data into indx, vall, and y_rt ...
lcount = 0
do row=1,n_solver
pntr(row) = lcount +1
col = row
lcount = lcount+1
indx(lcount) = col
vall(lcount) = ad(row)
col = row + 1
! if( (col <= n*m).and.(au1(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au1(row)
endif
col = row - 1 + m
! if( (col <= n*m).and.(au2(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au2(row)
endif
col = row + m
! if( (col <= n*m).and.(au3(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au3(row)
endif
col = row + 1 + m
! if( (col <= n*m).and.(au4(row) /= 0.)) then
if( (col <= n*m) ) then
lcount = lcount+1
indx(lcount) = col
vall(lcount) = au4(row)
endif
end do
! Take care of last pntr
pntr(n_solver+1) = lcount + 1
non_zeros=lcount
error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, &
n_solver, indx, non_zeros )
error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm )
error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall )
nRhs=1 !number of right hand sides
error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt )
x_rt is correct under Intel 11/MKL 10.2.2 but wrong with intel 15/MKL 11.4 update 2
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Update: It also returns a correct solution for Intel 13.1.117. So the problem is occurring somewhere between the Intel 13 and Intel 15 releases.
I also forgot to include the compilation and link options. We do link in FFTW and which does have some MKL interfaces in it which may overlap the MKL library.
mpiifort -fpe0 -D_UNIX -r8 -O0 -I /theory/kellerd/shu_test/Draco_i2_5_iSNB_XBT_FPEOS_Blizzard/finclude -fpp -convert big_endian -I/opt/LLE/local/fftw-3.3.1/include -I/opt/LLE/local/fftw-3.3.1/api -I/opt/LLE/local/hdf5-1.8.2/include -I/opt/LLE/local/hdf5-1.8.2/lib -c
mpiifort -o draco_hp15 hp15/*.o -fpe0 -D_UNIX -Wl,--start-group -L/opt/LLE/local/fftw-3.3.1/lib -lfftw3 -liomp5 -lpthread -lz -lmpiif -lmpi -lmkl_core -lmkl_sequential -lmkl_intel_thread -lmkl_intel_lp64 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5 -L/opt/LLE/local/hdf5-1.8.2/lib -lhdf5_fortran -Wl,--end-group
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The DSS example code provided with MKL works correctly (checked with Matlab) with IFort 15.0.2 and MKL 11.2. I suggest that you try it on your system, and if it runs correctly, provide a reproducer based on your code that enables the problem to be seen.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
yes, please check the problem with the latest version of MKL 11.2 update 2 and then in the case if you will see the problem is still exists, give us the example of this code which we can compile and debug this problem. You can take the eval version of MKL for this evaluation.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I developed a small test that demonstrates the error. It creates the correct solution vector under Intel 13 and the incorrect solution under Intel 15 with MKL 11.2 update 2. The input file is much smaller than our number array size but still is 623862720 bytes and attached.
>cat run
#!/bin/sh
# can be more optimized on newer architectures
mpiifort -fpe0 -r8 -O3 -xSSE4.2 -fpp -I${MKL_DIR}/include -L -L$MKL_LIB -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread MKL_DSS_radiation_DEBUGGING.f90
ldd a.out
./a.out
>cat out.13
linux-vdso.so.1 => (0x00007fff5accc000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b60b33f4000)
libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b60b3b05000)
libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002b60b4a99000)
libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002b60b5ca8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002b60b5faa000)
libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002b60b65d0000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 0.2031250
50156093319.1081 -89969204211.7639 89514538131.7439 <<<<<<<< correct numbers
>cat out.15
linux-vdso.so.1 => (0x00007fffbf1ff000)
libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b2d13f89000)
libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b2d1489c000)
libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b2d15cbd000)
libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b2d1781c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b2d17b6d000)
libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b2d17df7000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 0.1875000
73375.2577664697 -359933.504809541 280913.663845399 <<<<<< incorrect numbers
>cat MKL_DSS_radiation_DEBUGGING.f90
INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file." PROGRAM MKL_DSS_radiation_DEBUGGING use mkl_dss use ifport implicit none !----------------------------------------------- ! L o c a l V a r i a b l e s !----------------------------------------------- integer, parameter :: DP = KIND(1.0D0) integer, parameter :: MAX_NZ_ROW = 9 logical, save:: initialized=.FALSE. integer :: m_solver,n_solver integer :: n, m, row, col, lcount, non_zeros, error, perm(1),nRhs type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle. integer, allocatable, dimension(:), save:: indx, pntr real(KIND=DP), allocatable, dimension(:), save:: vall,x_rt,y_rt real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1 !real(8) :: t0, t1, MPI_WTIME real(4) :: t0, t1, ta(2) character*(10) :: suffix integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads integer :: ierr real(8) :: x0 integer(4) :: ic0,ic1,crate !----------------------------------------------- ! call MPI_INIT(ierr) !DSS implementation ! val - values ! pntr - rowIndex ! indx - colums ! ncols = n_solver ! nrows = ncols m_solver=60 !400 !400 !120 (for test) !=je-js+1 n_solver=2280 !200 !2000 !498 (for test) !=ie-is+1 n_solver=n_solver*m_solver m = m_solver n = n_solver/m_solver if(.NOT. initialized)then if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver)) if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1)) if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver)) if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver)) if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver)) endif indx(:)=0 pntr(:)=0 vall(:)=0.0d0 x_rt(:)=0.0d0 y_rt(:)=0.0d0 open(555,FILE="input_ad_au_y_rt") lcount = 0 do row=1,n_solver pntr(row) = lcount +1 read(555,*) ad,au1,au2,au3,au4,y_rt(row) ! write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row) col = row lcount = lcount+1 indx(lcount) = col vall(lcount) = ad col = row + 1 if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au1 endif col = row - 1 + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au2 endif col = row + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au3 endif col = row + 1 + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au4 endif end do ! ! Take care of last pntr ! pntr(n_solver+1) = lcount + 1 non_zeros=lcount if (.NOT. initialized) then ! Initialize the solver. error = DSS_CREATE( handle, MKL_DSS_DEFAULTS ) if (error /= MKL_DSS_SUCCESS) goto 999 ! Define the non-zero structure of the matrix. error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, & & n_solver, indx, non_zeros ) if (error /= MKL_DSS_SUCCESS) goto 999 ! Reorder the matrix. error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm ) if (error /= MKL_DSS_SUCCESS) goto 999 initialized = .TRUE. endif print*,"start solving the system" call SYSTEM_CLOCK(ic0,crate) t0 = dble(ic0)/dble(crate) ! ! Factor the matrix. ! error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall ) if (error /= MKL_DSS_SUCCESS) goto 999 ! ! Solve for a vector x_rt, given an input vector y_rt as b ! nRhs=1 !number of right hand sides error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt ) if (error /= MKL_DSS_SUCCESS) goto 999 call SYSTEM_CLOCK(ic1,crate) t1 = dble(ic1)/dble(crate) print*,"TEST MKL DSS; Execution time=",t1-t0 print*,sum(x_rt),minval(x_rt),maxval(x_rt) open(566,FILE="x_rt_based_on_input_"//TRIM(suffix)) do row=1,n_solver write(566,*) x_rt(row) enddo stop 999 WRITE(*,*) "Solver returned error code ", error END PROGRAM MKL_DSS_radiation_DEBUGGING
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Noticed my typo after I posted the example. My note said "the input file is smaller than out number array size". I meant to say it is smaller than our typical array sizes. The attached input DOES demonstrate the problem
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
DSS lets you compute some statistics, among them the determinant. For your "small" matrix, I obtained this:
Determinant = 2.407 X 10**( -1057550)
Applying Pardiso to the same matrix, I obtained:
non-zeros in A: 683819 number of nonzeros in L 5914460 number of nonzeros in U 1
Something is seriously wrong if there is only one nonzero element in the U factor (I need to check if I have made an error in obtaining these results). In these circumstances, your summary digest of the solution (min, max and sum of elements of X) is not useful, probably not reproducible across platforms, and may even change with slightly different compiler options. There may be similar problems with your "large" matrices. Before accepting a solution, you should probably obtain the residuals after solving the equations. For the "small" matrix, I obtained residuals as large as 0.001.
Note that the data file containing the "small" matrix is large (> 600 MB) in formatted text form. By placing the data into an unformatted file and cutting out the portions that are not used in the calculation, you could reduce it to about 10 MB.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for looking at this for us.
I admit to really not knowing much about solvers but the issue is that the solution to an input set returns a solution that my physcis folks say is 'right' on Intel 13 is 'wrong' on Intel 15. It would seem that if the sum, min and max of the solution differ significantly from an Intel 13 run to an Intel 15 run something significant has changed and that our 'right' answers have significantly changed with Intel 15. It is not clear to me from your note if you think there is something 'seriously wrong' with the input or coding or from the result?
I modified the source to include additional statistics:
print*,sum(x_rt),minval(x_rt),maxval(x_rt)
call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Determinant=",statistic(2)," X 10 ",statistic(1)
call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"pos =",statistic(1)
print*,"neg =",statistic(2)
print*,"zer =",statistic(3)
call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem")
error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic)
print*,"Solvemem=",statistic(1)
Intel 13:
50204591221.3627 -89969204206.9475 89514538126.9583
Determinant= 1.75565552825196 X 10 -1336812.00000000
pos = 67894.0000000000
neg = 68906.0000000000
zer = 0.000000000000000E+000
Solvemem= 69469.0000000000
Intel 15
73375.2577673518 -359933.504809540 280913.663845398
Determinant= -2.32535455034809 X 10 -1057541.00000000
pos = 65547.0000000000
neg = 71253.0000000000
zer = 0.000000000000000E+000
Solvemem= 69867.0000000000
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can further the testing if you can put together a smaller example, with as few unknowns as possible, but with the same structure as the matrices that you have shown, and approved by the "physics folks" as correct with IFort 13/MKL 10.2.2. The problem with large size matrix problems is that you are restricted to fewer checks on correctness. On the other hand, when you whittle a problem down you may be left with an absurd remnant.
I shall be happy to share my verification code with you once I have done some checks.
Note this from #8:
Determinant= 1.76 X 10 -1336812
It is possible for a matrix to be well-conditioned and yet have a determinant nearly equal to 0. However, if the problem comes from a physics model and the variables are reasonably scaled this should not happen. Ask the physics folks if they are comfortable with this attoscopic value for the determinant. If this value is wrong because of a coding error, we can track that down and fix it. However, if the problem as formulated is nearly impossible to obtain an accurate solution for, the physics people may have a much tougher problem.
A more basic question: is it expected/reasonable that the matrix is singular? If not, there can be no zeroes on the principal diagonals of the L and U factors. Therefore, the minimum number of non-zeroes in L and U is equal to the number of equations.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The physics guys are not around today so I tried to do what I can to further help find this problem. I cut the grid size down to a 14x30 grid so
m =8 n=112. It still demostrates the problem. The solution grid when plotted follows. Note they show that the garray1 field on the left axis is "wrong" for 15, "right" for 13. I also attached attached the input file (still in text format) for this smaller grid size. 451535.
cat out.15_30
linux-vdso.so.1 => (0x00007fff86871000)
libmkl_intel_lp64.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002b41f3e18000)
libmkl_intel_thread.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_intel_thread.so (0x00002b41f472b000)
libmkl_core.so => /opt/lic/intel15/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.so (0x00002b41f5b4c000)
libiomp5.so => /opt/lic/intel15/composer_xe_2015.2.164/compiler/lib/intel64/libiomp5.so (0x00002b41f76ab000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpifort.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpifort.so.12 (0x00002b41f79fc000)
libmpi.so.12 => /opt/lic/intel15/impi/5.0.3.048/intel64/lib/libmpi.so.12 (0x00002b41f7c86000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 7.8125000E-02
295618.772828955 -312701.448947056 220058.681258707
Determinant= -1.85272122511410 X 10 -98200.0000000000
pos = 6727.00000000000
neg = 5873.00000000000
zer = 0.000000000000000E+000
Solvemem= 5647.00000000000
cat out.13_30
linux-vdso.so.1 => (0x00007fff7e4b4000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003f08000000)
libmkl_intel_lp64.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so (0x00002ac989820000)
libmkl_intel_thread.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_thread.so (0x00002ac989f31000)
libmkl_core.so => /opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_core.so (0x00002ac98aec5000)
libiomp5.so => /opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so (0x00002ac98c0d4000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f08400000)
libmpi.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002ac98c3d6000)
libmpigf.so.4 => /opt/lic/intel13/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002ac98c9fc000)
librt.so.1 => /lib64/librt.so.1 (0x0000003f08800000)
libm.so.6 => /lib64/libm.so.6 (0x0000003f07800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003f07c00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f11800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003f07400000)
start solving the system
TEST MKL DSS; Execution time= 1.5625000E-02
-19306486435.1262 -5772347418.44442 3505444963.56421
Determinant= -3.76569356065273 X 10 -117523.000000000
pos = 6477.00000000000
neg = 6123.00000000000
zer = 0.000000000000000E+000
Solvemem= 5603.00000000000
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Do you have a writeup with the P.D.E.,boundary and initial conditions specified?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I just recognized today that I had sent in incorrect data input file. I apologize for any wasted effort. The net is still the same however. Intel 15 returns wrong answers. I added a double check after the solve. The solution returned when used to recompute the y_rt supplied has 9 errors:
Intel 13 output:
start solving the system
TEST MKL DSS; Execution time= 0.1109009
0.257042531294272 9.162025370382090E-014 4.008343651777284E-003
Determinant= 2.12161563270491 X 10 -2990.00000000000
pos = 420.000000000000
neg = 0.000000000000000E+000
zer = 0.000000000000000E+000
Solvemem= 159.000000000000
Intel 15 output:
start solving the system
TEST MKL DSS; Execution time= 7.7514648E-02
0.251832820371750 9.162025183722943E-014 4.008343651777284E-003
Determinant= 1.62949566754441 X 10 -2987.00000000000
pos = 420.000000000000
neg = 0.000000000000000E+000
zer = 0.000000000000000E+000
Solvemem= 262.000000000000
not equal 121 2.405238875359285E-019 2.065808031766965E-019 <<<<<< double check should at least be close
not equal 151 2.405238875359036E-019 2.068078684773001E-019
not equal 181 2.405238875359422E-019 2.038278901649021E-019
not equal 211 2.405238875359251E-019 1.844924519720349E-019
not equal 241 2.405238875359413E-019 -6.431842696536678E-020
not equal 331 3.299385792934408E-020 -7.117734061904493E-014
not equal 332 9.889113991683395E-020 -6.104485185954297E-014
not equal 361 3.303422135645092E-020 -1.172086994261061E-013
not equal 391 8.509130524910173E-013 6.084941155961990E-013
A slightly modified source file and a new input file are provided. They clearly demonstrate a problem and should be easily reproduced
INCLUDE 'mkl_dss.f90' ! Include the standard DSS "header file." PROGRAM MKL_DSS_radiation_DEBUGGING use mkl_dss use ifport implicit none !----------------------------------------------- ! L o c a l V a r i a b l e s !----------------------------------------------- integer, parameter :: DP = 8 !KIND(1.0D0) integer, parameter :: MAX_NZ_ROW = 9 logical, save:: initialized=.FALSE. integer :: m_solver,n_solver integer :: n, m, row, col, lcount, non_zeros, error, nRhs type(MKL_DSS_HANDLE),save :: handle ! Allocate storage for the solver handle. integer, allocatable, dimension(:), save:: indx, pntr, perm real(KIND=DP), allocatable, dimension(:), save:: vall,x_rt,y_rt,y_rt_check real(KIND=DP) :: al4,al3,al2,al1,ad,au4,au3,au2,au1 !real(8) :: t0, t1, MPI_WTIME real(4) :: t0, t1, ta(2) character*(10) :: suffix integer :: id, nthreads, omp_get_thread_num, omp_get_num_threads integer :: ierr,statreq(10) real(8) :: x0, statistic(10) integer(4) :: ic0,ic1,crate !----------------------------------------------- ! call MPI_INIT(ierr) !DSS implementation ! val - values ! pntr - rowIndex ! indx - colums ! ncols = n_solver ! nrows = ncols m= 30 !60 !400 !400 !120 (for test) !=je-js+1 n= 14 !2280 !200 !2000 !498 (for test) !=ie-is+1 n_solver = n * m if(.NOT. initialized)then if( .NOT. allocated(indx) ) allocate(indx(MAX_NZ_ROW*n_solver)) if( .NOT. allocated(pntr) ) allocate(pntr(n_solver+1)) if( .NOT. allocated(vall) ) allocate(vall(MAX_NZ_ROW*n_solver)) if( .NOT. allocated(x_rt) ) allocate(x_rt(n_solver)) if( .NOT. allocated(y_rt) ) allocate(y_rt(n_solver)) if( .NOT. allocated(y_rt_check) ) allocate(y_rt_check(n_solver)) if( .NOT. allocated(perm)) allocate(perm(n_solver)) endif indx(:)=0 pntr(:)=0 vall(:)=0.0d0 x_rt(:)=0.0d0 y_rt(:)=0.0d0 open(555,FILE="input_30x14") lcount = 0 do row=1,n_solver pntr(row) = lcount +1 read(555,*) ad,au1,au2,au3,au4,y_rt(row) !+ashv write(588,'(10(ES23.15,1X))') ad,au1,au2,au3,au4,y_rt(row) col = row lcount = lcount+1 indx(lcount) = col vall(lcount) = ad col = row + 1 if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au1 endif col = row - 1 + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au2 endif col = row + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au3 endif col = row + 1 + m if( col <= n*m ) then lcount = lcount+1 indx(lcount) = col vall(lcount) = au4 endif end do ! ! Take care of last pntr ! pntr(n_solver+1) = lcount + 1 non_zeros=lcount if (.NOT. initialized) then ! Initialize the solver. error = DSS_CREATE( handle, MKL_DSS_DEFAULTS ) if (error /= MKL_DSS_SUCCESS) goto 999 ! Define the non-zero structure of the matrix. error = DSS_DEFINE_STRUCTURE( handle, MKL_DSS_SYMMETRIC, pntr, n_solver, & & n_solver, indx, non_zeros ) if (error /= MKL_DSS_SUCCESS) goto 999 ! Reorder the matrix. ! error = DSS_REORDER( handle, MKL_DSS_DEFAULTS, perm ) error = DSS_REORDER( handle, MKL_DSS_AUTO_ORDER, perm ) if (error /= MKL_DSS_SUCCESS) goto 999 initialized = .TRUE. endif print*,"start solving the system" call SYSTEM_CLOCK(ic0,crate) t0 = dble(ic0)/dble(crate) ! ! Factor the matrix. ! error = DSS_FACTOR_REAL( handle, MKL_DSS_INDEFINITE, vall ) if (error /= MKL_DSS_SUCCESS) goto 999 ! ! Solve for a vector x_rt, given an input vector y_rt as b ! nRhs=1 !number of right hand sides error = DSS_SOLVE_REAL(handle, MKL_DSS_DEFAULTS, y_rt, nRhs, x_rt ) if (error /= MKL_DSS_SUCCESS) goto 999 call SYSTEM_CLOCK(ic1,crate) t1 = dble(ic1)/dble(crate) print*,"TEST MKL DSS; Execution time=",t1-t0 print*,sum(x_rt),minval(x_rt),maxval(x_rt) call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,11,"determinant") error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic) print*,"Determinant=",statistic(2)," X 10 ",statistic(1) call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,7,"inertia") error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic) print*,"pos =",statistic(1) print*,"neg =",statistic(2) print*,"zer =",statistic(3) call MKL_CVT_TO_NULL_TERMINATED_STR(statreq,8,"solvemem") error = DSS_STATISTICS(handle,MKL_DSS_DEFAULTS,statreq,statistic) print*,"Solvemem=",statistic(1) !then calculate the corresponding right-hand-side y_rt by multipying x_rt by the matrix storred in vall() array y_rt_check(:)=0.0d0 do row=1,n_solver do lcount=pntr(row),pntr(row+1)-1 col = indx(lcount) y_rt_check(row) = y_rt_check(row) + vall(lcount)*x_rt(col) if(col/=row) y_rt_check(col) = y_rt_check(col) + vall(lcount)*x_rt(row) enddo enddo open(566,FILE="x_rt_based_on_input_"//TRIM(suffix)) do row=1,n_solver write(566,*) x_rt(row) if (abs(y_rt(row)-y_rt_check(row)) > abs(1.0e-6*y_rt(row))) print*,'not equal',row,y_rt(row), y_rt_check(row) enddo stop 999 WRITE(*,*) "Solver returned error code ", error END PROGRAM MKL_DSS_radiation_DEBUGGING
.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I do not have the specific older version, but I think that you will reproduce the old results with the new compiler if, on line 133 of #5, you replace MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS which, according to the current documentation, sets MKL_DLL_POSITIVE_DEFINITE.
Of course, you have to check whether this is a correct setting. You can also check your documentation for the old and new versions to see if the default value for the call to DSS_FACTOR changed with the version.
This may also indicate that, as you suspected, there is an error in the newer version of MKL. This is likely since the DSS calls are mapped into equivalent PARDISO calls, and an error in doing so may have been introduced recently. You may wish to consider rewriting the solver part of your code to make direct PARDISO calls instead of going through the DSS interface.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Looks like an error has been introduced.
Replacing MKL_DSS_INDEFINITE by MKL_DSS_DEFAULTS or MKL_DLL_POSITIVE_DEFINITE with Intel 15 results in:
start solving the system
TEST MKL DSS; Execution time= 6.2500000E-02
0.257042531294272 9.162025370385757E-014 4.008343651777285E-003
Determinant= 2.12161563270509 X 10 -2990.00000000000
MKL-DSS-UTIL-Fatal, Bad facility code
pos = -2990.00000000000
neg = 2.12161563270509
zer = 0.000000000000000E+000
Solvemem= 262.000000000000
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Would you please share the equivalent PARDISO version of this code that works?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The zip file is attached. It contains the input file (from you, but included for consistency), a utility to read the input file and write an unformatted Fortran file in a format that I use when testing problems with MKL/Pardiso, and the program to read that unformated file and solve the problem with Pardiso.
The utility is self-contained, and does not need to be linked with MKL. The solver program should be linked with MKL, and if you have not previously compiled the file mkl_pardiso.f90 that is provided with MKL, you should do so first to produce the module file that is needed by the solver program.
I have tested only on Windows 8.1 X64, but there should be no changes needed for Linux. Please let me know if you run into problems. The last line of the output should be
0.257042531294272 9.162025370381688E-014 4.008343651777284E-003
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page