Intel® oneAPI Math Kernel Library

getrf & getrs caused a 'corrupted double-linked list' error

Vishal1

Hi,

We are using the MKL routines dgetrf and dgetrs from MKL 11.2 Update 2 to solve small linear systems (typically 4 x 4, 9 x 9, or 16 x 16). The program has to work with systems of different sizes, so we allocate memory, compute, free, and repeat as required. Eventually, after several allocate-compute-free cycles, when we free the dynamically allocated array 'a' described in the documentation at

https://software.intel.com/en-us/node/520892#642A8C07-088C-408D-BC89-D0F2A6E75416

we find that the program crashes with the error 'corrupted double-linked list', which suggests that writes to 'ipiv' are overwriting the malloc bookkeeping data for 'a'.

When we avoid the dgetrf & dgetrs calls by using an alternative function (that does not require solving the system), we find that the code executes correctly and exits normally.

Furthermore, we have determined that allocating 'ipiv' with one more element than suggested by the documentation at

https://software.intel.com/en-us/node/520877#E4779E02-346C-4670-92AB-C67BD8559051

fixes the bug, i.e., we no longer see the 'corrupted double-linked list' error and the program exits normally. Can you please confirm whether the documentation is in error?

We are happy to supply our codebase along with a makefile to help resolve the problem.

Vishal Kasliwal & Michael Royster

Department of Physics, Drexel University

mecej4

The information that you have given is not sufficient to reproduce the problem. Here is a modified version of the DGETRS example distributed with MKL. It allocates, solves and deallocates the matrix, R.H.S. and IPIV arrays for a 4 X 4 matrix with NRHS=2, using the data provided with MKL for the DGETRS example. It does the allocate-factorize-solve-deallocate cycle 10,000 times. There were no errors (with the Windows version of MKL).

    Program xdgetrs
      Integer nin, nout
      Parameter (nin=5, nout=6)
      Integer nmax, lda, nrhmax, ldb
      Parameter (nmax=8, lda=nmax, nrhmax=nmax, ldb=nmax)
      Character trans
      Parameter (trans='N')
      Integer i, ifail, info, j, kiter, n, nrhs
      Double Precision, Allocatable :: a(:, :), b(:, :)
      Double Precision as(lda, nmax), bs(ldb, nrhmax)
      Integer, Allocatable :: ipiv(:)
      Double Precision pert(2)
      Integer astat
      External dgetrf, dgetrs
!
      Write (nout, *) 'DGETRS Example Program Results'
      Read (nin, *)
      Read (nin, *) n, nrhs
      If (n<=nmax .And. nrhs<=nrhmax) Then
        Read (nin, *)((as(i,j),j=1,n), i=1, n)
        Read (nin, *)((bs(i,j),j=1,nrhs), i=1, n)
        Do kiter = 1, 10000
          Allocate (a(n,n), b(n,nrhs), ipiv(n), Stat=astat)
          If (astat/=0) Stop 'Error allocating A, B and IPIV'
          Call random_number(pert)
          a(1:n, 1:n) = as(1:n, 1:n)*(1D0+0.01*pert(1))      !perturb
          b(1:n, 1:nrhs) = bs(1:n, 1:nrhs)*(1D0+0.01*pert(2))
          Call dgetrf(n, n, a, n, ipiv, info)
!
          If (info==0) Then
            Call dgetrs(trans, n, nrhs, a, n, ipiv, b, n, info)
            ifail = 0
            If (mod(kiter,1000)==0) Write (*, '(1x,i5,2x,2ES12.4)')  &
              kiter, (sum(b(:,i)), i=1, nrhs)
          Else
            Write (nout, *) 'The matrix factorization failed'
          End If
          Deallocate (a, b, ipiv)
        End Do
      End If
      Stop
!
    End Program xdgetrs
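
The question's hypothesis, that the factorization writes past the end of 'ipiv', can be tested directly rather than inferred from a crash in free. The sketch below is one possible way to do that in C, matching the malloc/free usage described in the question: it pads 'ipiv' with sentinel values and checks them after the call. Everything named here is an assumption for illustration. In particular, `lu_factor` is a hypothetical stand-in written for this sketch so that it compiles without MKL; it is not MKL's dgetrf, and `GUARD_VALUE`, `GUARD_COUNT`, and `count_clobbered_guards` are likewise made-up names.

```c
#include <assert.h>
#include <stdlib.h>

#define GUARD_VALUE (-123456789) /* sentinel that no valid pivot index can equal */
#define GUARD_COUNT 4            /* padding elements placed after the n pivots */

/* Hypothetical stand-in for dgetrf (NOT the MKL routine): in-place LU with
   partial pivoting on a column-major n x n matrix. It writes exactly n
   1-based pivot indices into ipiv. Returns 0 on success, or k > 0 if the
   k-th pivot is exactly zero. */
int lu_factor(int n, double *a, int lda, int *ipiv)
{
    int info = 0;
    for (int k = 0; k < n; ++k) {
        /* Find the row with the largest magnitude entry in column k. */
        int p = k;
        for (int i = k + 1; i < n; ++i) {
            double cand = a[i + k * lda] < 0 ? -a[i + k * lda] : a[i + k * lda];
            double best = a[p + k * lda] < 0 ? -a[p + k * lda] : a[p + k * lda];
            if (cand > best) p = i;
        }
        ipiv[k] = p + 1;
        if (a[p + k * lda] == 0.0) {
            if (info == 0) info = k + 1;
            continue;
        }
        if (p != k) {
            /* Swap rows k and p across all columns. */
            for (int j = 0; j < n; ++j) {
                double t = a[k + j * lda];
                a[k + j * lda] = a[p + j * lda];
                a[p + j * lda] = t;
            }
        }
        /* Store multipliers in column k and update the trailing submatrix. */
        for (int i = k + 1; i < n; ++i) {
            a[i + k * lda] /= a[k + k * lda];
            for (int j = k + 1; j < n; ++j)
                a[i + j * lda] -= a[i + k * lda] * a[k + j * lda];
        }
    }
    return info;
}

/* Allocates ipiv with GUARD_COUNT sentinels after the n real elements,
   factors a diagonally dominant test matrix, and returns how many sentinels
   were overwritten. 0 means no write past ipiv[n-1] was detected; -1 means
   allocation or factorization failed. */
int count_clobbered_guards(int n)
{
    assert(n > 0);
    double *a = malloc((size_t)n * (size_t)n * sizeof *a);
    int *ipiv = malloc(((size_t)n + GUARD_COUNT) * sizeof *ipiv);
    if (a == NULL || ipiv == NULL) {
        free(a);
        free(ipiv);
        return -1;
    }
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            a[i + j * n] = (i == j) ? n + 1.0 : 1.0 / (1.0 + i + j);
    for (int g = 0; g < GUARD_COUNT; ++g)
        ipiv[n + g] = GUARD_VALUE;

    int info = lu_factor(n, a, n, ipiv);

    int clobbered = 0;
    for (int g = 0; g < GUARD_COUNT; ++g)
        if (ipiv[n + g] != GUARD_VALUE) ++clobbered;

    free(a);
    free(ipiv);
    return info == 0 ? clobbered : -1;
}
```

Calling `count_clobbered_guards` with n = 4, 9, and 16 covers the sizes mentioned in the question. To apply the same check to the real library, the `lu_factor` call would be replaced by the actual dgetrf call, and the element type of 'ipiv' would need to match the integer size of the MKL interface layer being linked (32-bit for LP64, 64-bit for ILP64). A mismatch there is one possible cause of writes past a correctly sized 'ipiv', but nothing in this thread confirms that it applies here.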
