Solved: DSS Fortran Error

Eric_C_1 · ‎09-21-2015

Hi all,

Part of my Ph.D. research involves solving coupled reaction diffusion equations on arbitrary closed surfaces using Finite Element Method. In the past I had been using the ?GESV subroutine to solve the resulting systems of equations, but due to the large resolution of my meshes, this became impractical to use, especially since the matrix is sparse. I then discovered Intel's sparse solvers and have been attempting to implement the DSS routine.

This is where my problem lies. For the matrix equation Ax=b, my A matrix is symmetric indefinite. I have no difficulty solving a symmetric system of equations like the one below:

...

INTEGER, PARAMETER :: nNonZeros=9
INTEGER, PARAMETER :: nRows=5,nCols=5
DOUBLE PRECISION, DIMENSION(nNonZeros) :: rValues
DOUBLE PRECISION, DIMENSION(nRows) :: rRhsValues
DOUBLE PRECISION, DIMENSION(nRows) :: rSolValues
INTEGER, DIMENSION(nRows+1) :: rowIndex
INTEGER, DIMENSION(nNonZeros) :: columns
INTEGER :: error
INTEGER*8 :: fhandle

rValues=(/9.,1.5,6.,0.75,3.,0.5,12.,0.625,16./)
rRhsValues=(/1.,2.,3.,4.,5./)
rowIndex=(/1,6,7,8,9,10/)
columns=(/1,2,3,4,5,2,3,4,5/)

error=DSS_CREATE(fhandle,MKL_DSS_DEFAULTS)

error=DSS_DEFINE_STRUCTURE( fhandle, MKL_DSS_SYMMETRIC, rowIndex, nRows, nCols, columns, nNonZeros )

error=DSS_REORDER(fhandle, MKL_DSS_DEFAULTS,0)

error=DSS_FACTOR_REAL(fhandle,MKL_DSS_POSITIVE_DEFINITE, rValues)

error=DSS_SOLVE_REAL(fhandle, MKL_DSS_DEFAULTS, rRhsValues, 1, rSolValues)

WRITE(*,*) rSolValues

...

The solution correctly gives -326.333333333331 982.999999999994 163.416666666666
397.999999999998 61.4999999999996

However, if I change one value in the matrix, say convert the 12 on the diagonal to a -12, the matrix will now be symmetric indefinite. I also change the opt parameter in the DSS_FACTOR_REAL subroutine from MKL_DSS_POSITIVE_DEFINITE to MKL_DSS_INDEFINITE. The updated code is displayed below:

...

INTEGER, PARAMETER :: nNonZeros=9
INTEGER, PARAMETER :: nRows=5,nCols=5
DOUBLE PRECISION, DIMENSION(nNonZeros) :: rValues
DOUBLE PRECISION, DIMENSION(nRows) :: rRhsValues
DOUBLE PRECISION, DIMENSION(nRows) :: rSolValues
INTEGER, DIMENSION(nRows+1) :: rowIndex
INTEGER, DIMENSION(nNonZeros) :: columns
INTEGER :: error
INTEGER*8 :: fhandle

rValues=(/9.,1.5,6.,0.75,3.,0.5,-12.,0.625,16./)
rRhsValues=(/1.,2.,3.,4.,5./)
rowIndex=(/1,6,7,8,9,10/)
columns=(/1,2,3,4,5,2,3,4,5/)

error=DSS_CREATE(fhandle,MKL_DSS_DEFAULTS)

error=DSS_DEFINE_STRUCTURE( fhandle, MKL_DSS_SYMMETRIC, rowIndex, nRows, nCols, columns, nNonZeros )

error=DSS_REORDER(fhandle, MKL_DSS_DEFAULTS,0)

error=DSS_FACTOR_REAL(fhandle,MKL_DSS_INDEFINITE, rValues)

error=DSS_SOLVE_REAL(fhandle, MKL_DSS_DEFAULTS, rRhsValues, 1, rSolValues)

WRITE(*,*) rSolValues

...

The program throws an "MKL-DSS-DSS-Error, Zero pivot detected" error during the DSS_FACTOR_REAL subroutine.

Why does it say this? The matrix does not have any zero pivots since in reduced row echelon form it is simply the identity matrix.

Any help would be appreciated. What am I doing wrong?

mecej4 · ‎09-21-2015

In your code you have the wrong type for PERM (by default, it is of type REAL, whereas it should be integer). However, since the preceding argument has the value MKL_DSS_DEFAULTS, this error is not yet affecting the run.

Your code gave me the correct result on Windows. ~~I'll try the same code on a Linux system in the morning~~.

On Suse Linux x64, using IFort 12.1.7.367 Build 20120928 and MKL 10.3.12, I changed 'Subroutine' to 'Program' in your reac_diff.f, added include 'mkl_dss.f77', and obtained the correct results. The makefile was not needed since the code in the subroutine is completely independent of outside code other than MKL, Fortran runtime and OS.

On the same system with IFort 15.0.2.164 and MKL 11.2.2, the compiler noted the error regarding PERM, unlike the older compiler, which ignored it. The results were again correct. In both cases, I used the simple -mkl option of the compiler.

Let us try to take the same steps (change 'Subroutine' to 'Program', add declaration for PERM) on your system and run the code of #7 independently of all the other source files named in your makefile.

View solution in original post

mecej4 · ‎09-21-2015

Which version of MKL and IFort did you use?

The example code that you showed is essentially the same as that in the MKL/examples/solverf/source directory, file dss_sym_f.f.

I changed the seventh element of the values array to -12.0 as you did, changed the second argument in the call of dss_factor_real() to MKL_DSS_INDEFINITE, and ran the example to receive the solution ~~[-3.2633E+02 9.8300E+02 1.6342E+02 3.9800E+02 6.1500E+01]~~ [-1.5300E+00 8.5901E+00 -1.0150E+00 8.2360E+00 5.9938E-01].

Eric_C_1 · ‎09-21-2015

Hi mecej4,

You got that answer using -12 and MKL_DSS_INDEFINITE? Using MATLAB I got the following:

-1.5300
8.5901
-1.0150
8.2360
0.5994

And I'm pretty much a newbie, so I'm not even sure how to check ifort and MKL versions. Is there a linux command for that? If it helps I'm using the TACC Stampede supercomputer to run all of this. Here is the user guide if that helps at all.

https://portal.tacc.utexas.edu/user-guides/stampede#appdev

Thanks,

Eric

mecej4 · ‎09-21-2015

Eric: Sorry, I posted the results of the unmodified example by mistake. I have made a correction above.

You can find the version of MKL either by looking at the header file mkl.h, or by calling the query subroutine, call mkl_get_version_string( buf ), where buf is CHARACTER(len=193) and then writing out the buffer to standard output.

The Intel Fortran compiler will display the version if you use the -logo option.

Viewing the publication/release date of a manual may not be reliable since the manual may not be updated in synchronization with the software.

Eric_C_1 · ‎09-21-2015

Output using -logo :

Intel(R) Fortran 13.1-2134
GNU ld version 2.20.51.0.2-5.42.el6 20100205

MKL Version subroutine gives:

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130123 for Intel(R)
64 architecture applications

mecej4 · ‎09-21-2015

With the Windows MKL 11.0.1, 32-bit, I get the correct results. Please post the complete source code, and tell me the compiler options that you used. Did you use LP64 or ILP64?

Eric_C_1 · ‎09-21-2015

The file is posted.

The file is part of at least 30 other routines, so the compilation is done with a Make file. I have attached that as well.

It looks like I used LP64.

mecej4 · ‎09-21-2015

In your code you have the wrong type for PERM (by default, it is of type REAL, whereas it should be integer). However, since the preceding argument has the value MKL_DSS_DEFAULTS, this error is not yet affecting the run.

Your code gave me the correct result on Windows. ~~I'll try the same code on a Linux system in the morning~~.

On Suse Linux x64, using IFort 12.1.7.367 Build 20120928 and MKL 10.3.12, I changed 'Subroutine' to 'Program' in your reac_diff.f, added include 'mkl_dss.f77', and obtained the correct results. The makefile was not needed since the code in the subroutine is completely independent of outside code other than MKL, Fortran runtime and OS.

On the same system with IFort 15.0.2.164 and MKL 11.2.2, the compiler noted the error regarding PERM, unlike the older compiler, which ignored it. The results were again correct. In both cases, I used the simple -mkl option of the compiler.

Let us try to take the same steps (change 'Subroutine' to 'Program', add declaration for PERM) on your system and run the code of #7 independently of all the other source files named in your makefile.

Eric_C_1 · ‎09-22-2015

I apologize, I must have missed the perm declaration when uploading the code. Normally I have to use the make file since there is a lot more code in that file doing other things (FEM solver, alternate solver using Laplace-Beltrami method ...) which are all linked to the other files.

If I change the file to a program instead of a subroutine, include 'mkl_dss.f77' and compile it independently with -mkl, it works!

However, normally I would call this code every so often to run. Can that be done if it is a program?

Also could you help me figure out how to run it with the Make file?

mecej4 · ‎09-22-2015

If you kept the code in a subroutine, called that subroutine (no arguments passed, correct?) from somewhere else in your code, and the subroutine failed to give the correct solution, that is an indication that some part of your code has an error that causes it to corrupt the local variables of the subroutine. Tracking down such bugs is usually a difficult task, but the compiler and other tools can help with this.

I suggest that you apply your FEM code to a trivial case (trivial in the sense that the solution is known analytically, but we will not let the code be aware of that) such as a rectangular plate in simple shear, solve the resulting simultaneous linear equations using Lapack, and check the solution. Lapack is suitable for dense matrices, and is going to be wasteful of memory and CPU time, but its use will be only for establishing a reference solution. Lapack dgesv() has a simpler interface, so finding the bugs will be easier.

Once you have succeeded in forming the equations correctly, you can dump the matrix to a file in one of the standard formats (e.g., Harwell-Boeing RUA format), and run DSS or Pardiso on that file. After that also works correctly, you can put the pieces back together and proceed to your original goal.

Eric_C_1 · ‎09-22-2015

I think I figured it out. Below is the code:

INCLUDE 'mkl_dss.f90'

SUBROUTINE reac_diff

USE MKL_DSS

...

INTEGER, PARAMETER :: nNonZeros=9
INTEGER, PARAMETER :: nRows=5,nCols=5
DOUBLE PRECISION, DIMENSION(nNonZeros) :: rValues
DOUBLE PRECISION, DIMENSION(nRows) :: rRhsValues
DOUBLE PRECISION, DIMENSION(nRows) :: rSolValues
INTEGER, DIMENSION(nRows+1) :: rowIndex
INTEGER, DIMENSION(nNonZeros) :: columns
INTEGER :: error
~~INTEGER*8~~ TYPE(MKL_DSS_HANDLE) :: fhandle

INTEGER :: perm(1)

perm(1)=0

rValues=(/9.,1.5,6.,0.75,3.,0.5,-12.,0.625,16./)
rRhsValues=(/1.,2.,3.,4.,5./)
rowIndex=(/1,6,7,8,9,10/)
columns=(/1,2,3,4,5,2,3,4,5/)

error=DSS_CREATE(fhandle,MKL_DSS_DEFAULTS)

error=DSS_DEFINE_STRUCTURE( fhandle, MKL_DSS_SYMMETRIC, rowIndex, nRows, nCols, columns, nNonZeros )

error=DSS_REORDER(fhandle, MKL_DSS_DEFAULTS,perm)

error=DSS_FACTOR_REAL(fhandle,MKL_DSS_INDEFINITE, rValues)

error=DSS_SOLVE_REAL(fhandle, MKL_DSS_DEFAULTS, rRhsValues, 1, rSolValues)

WRITE(*,*) rSolValues

...

It produced the correct results. Thank you for all your help!