Problem with workspace query in dgesdd

David_Morgenthaler · ‎09-02-2010

I wish to use dgesdd from c to compute an svd using MKL 10.2.6.037 using VC++ 2008. When I do a workspace query to determine the workspace size, dgesdd returns an "optimum" workspace that is smaller than the minimum required workspace (according to the documentation). When I use the "optimum" value, it seems as if some of my other memory is getting stepped on. And when I use other lapack libraries, the "optimum" returned is larger than the value I get here and is also larger than the minimum.

So I suspect something is amiss. Any one have any ideas?

I've extracted the problem to some sample code which follows. When I run this I get:

C:\\Documents and Settings\\morgentd\\My Documents\\wikipedia\\dgesddTest\\Debug>dgesddTest.exe
Optimum lwork size = 47412
Minimum lwork size = 82080

Here's my code:
========================================================

// dgesddTest.c

#include

#include "mkl_cblas.h"

#include "mkl_lapack.h"

#include "mkl.h"

#define MIN(a,b) ((a)<(b)? (a):(b))

#define MAX(a,b) ((a)>(b)? (a):(b))

int main(void) {

// set up the problem

int m=10000;

int n=108;

double *a = malloc(m*n*sizeof(double)); // just let array be filled with

// random junk for this test

// set up for return values (job type 'S')

int minmn = MIN(m,n);

double *u = malloc(m*n*sizeof(double));

double *vt = malloc(n*m*sizeof(double));

double *s = malloc(MAX(1,minmn)*sizeof(double));

// set up for dgesdd

int lwork,info,minlwork;

double *work;

double work0[1]={0.0};

int *iwork;

char jobz[1]={'S'};

iwork=calloc(MAX(1,8*minmn),sizeof(int));

// Workspace query

lwork = -1; info=0;

// signature (FORTRAN) of dgesdd is:

// dgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)

dgesdd(jobz,&m,&n,a,&m,s,u,&m,vt,&n,work0,&lwork,iwork,&info);

if ( info<0 ) {

printf("SVD error at %d: Parameter %d had an illegal value.\\n",__LINE__,-info);

return 1;

} else if ( info >0 ) {

printf("SVD error at %d: dbdsdc did not converge, updating process failed\\n",__LINE__);

return 1;

}

// Allocate work area

lwork = (int)(work0[0]);

minlwork = 3*minmn*minmn+MAX(minmn,4*minmn*minmn+4*minmn);

printf("Optimum lwork size = %d\\nMinimum lwork size = %d\\n",lwork,minlwork);

lwork = MAX(lwork,minlwork);

work = malloc(lwork*sizeof(double));

info = 0;

// Compute SVD

dgesdd(jobz,&m,&n,a,&m,s,u,&m,vt,&n,work,&lwork,iwork,&info);

if ( info<0 ) {

printf("SVD error at %d: Parameter %d had an illegal value.\\n",__LINE__,-info);

return 1;

} else if ( info >0 ) {

printf("SVD error at %d: dbdsdc did not converge, updating process failed\\n",__LINE__);

return 1;

}

free(iwork);iwork=0;

free(work);work=0;

return 0;

}

mecej4 · ‎09-02-2010

You have exposed an error in the MKL documentation, which itself can be traced to an old error in the Lapack documentation, which was subsequently corrected.

See this post in the Lapack Forum. In the MKL documentation, remove the power 2 in the first term of the formula for lwork for the cases jobz = 'S' and 'O' . Correspondingly, replace the line

[cpp]minlwork = 3*minmn*minmn+MAX(minmn,(4*minmn*minmn+4*minmn));[/cpp]

by

[cpp]minlwork = 3*minmn + MAX(maxmn,(4*minmn*minmn + 4*minmn));
[/cpp]

in your C program.

Note that you had incorrectly used minmn as the first argument to MAX, but it had no effect with the values for m and n that you used.

The source code at Netlib of dgesdd.f actually computes minlwork for its internal use, and has the correct formula both in the introductory comments and in the body of the code. Here are the comments in that file as to the size of lwork:

[fxfortran]*          The dimension of the array WORK. LWORK >= 1.
*          If JOBZ = 'N',
*            LWORK >= 3*min(M,N) + max(max(M,N),7*min(M,N)).
*          If JOBZ = 'O',
*            LWORK >= 3*min(M,N) + 
*                     max(max(M,N),5*min(M,N)*min(M,N)+4*min(M,N)).
*          If JOBZ = 'S' or 'A'
*            LWORK >= 3*min(M,N) +
*                     max(max(M,N),4*min(M,N)*min(M,N)+4*min(M,N)).
*          For good performance, LWORK should generally be larger.
*          If LWORK = -1 but other input arguments are legal, WORK(1)
*          returns the optimal LWORK.
[/fxfortran]

David_Morgenthaler · ‎09-02-2010

Thanks for the reply, and with the answer. [I noticed after posting that I'd incorrectly used minmn as the first arguement, but didn't bother to correct it; as you point out, it makes no difference in this case.]

This "documentation" error seems to be somewhat ubuquitous -- I find it not just in the MKL version, but in all the lapack versions I've found. Addtionally, I have found some interesting ways people have dealt with this. For example, in Python's numpy package this routine is used to provide an SVD. After making the workspace query, the souce code says:

[bash]            /* We need to check the result because
               sometimes the "optimal" value is actually
               too small.
               Change it to the maximum of the minimum and the optimal.
            */
            long work0 = (long) *DDATA(work);
            int mn = MIN(m,n);
            int mx = MAX(m,n);

            switch(jobz){
            case 'N':
                    work0 = MAX(work0,3*mn + MAX(mx,6*mn)+500);
                    break;
            case 'O':
                    work0 = MAX(work0,3*mn*mn +                 
                                MAX(mx,5*mn*mn+4*mn+500));
                    break;
            case 'S':
            case 'A':
                    work0 = MAX(work0,3*mn*mn +                 
                                MAX(mx,4*mn*(mn+1))+500);
                    break;
            }
[/bash]

You'll notice that they additionlly add an extra 500 to the size without explanation.
Thanks again!

mecej4 · ‎09-02-2010

Re: You'll notice that they additionlly add an extra 500 to the size without explanation.

If this were about boiler design rather than software, we'd call that the usual "corrosion allowance"!

David_Morgenthaler · ‎09-02-2010

Alas, this fix doesn't quite fix things altogether. Using the corrected code, and testing with m=70, n=70, I get the following result:

[bash]C:Documents and SettingsmorgentdMy DocumentswikipediadgesddTestDebug>dgesddTest.exe
Optimum lwork size = 15190
Minimum lwork size = 20090[/bash]

This happens for a variety of other matrix dimensions too. So I'm leaning toward the approach used by the author of numpy. Thioughts?

mecej4 · ‎09-02-2010

I have looked at the code of DGESDD.f. It is not obvious to me how the "minimum LWORK" formulae in the comments were obtained. As we have seen, they are wrong. The public domain source code of DGESDD computes this value internally and gives an error message if the provided value is too small, but does not provide for returning the computed value to the caller.

The calculation of the optimum LWORK is rather complicated and involves several calls to ILAENV with M,N and other arguments.

(new edits:) The formulae for the minimum LWORK needed are incorrect: those in the MKL documentation for DGESDD as well as those in the comments of dgesdd.f at Netlib.

By running a test driver (in Fortran, see below) with the public domain Lapack 3.2 sources as well as with MKL, for the following input parameter ranges,

JOBZ = 'S'; M, N = 1 to 5000; LDA = M and M + 20; LDU = M and M + 10; LDVT = N and N + 5; LWORK = -1

I arrived at the revised formulae for the minimum LWORK needed (lines 20 to 24).

[fxfortran]      program tdgesdd
      character jobz
      integer info,m,n,mxmn,mnmn,lwork,ifor,mnthr,lda,ldu,ldvt,iwork(2)
      Double Precision work(2)                            !dummy array
      double precision A(1,1),S(1,1),U(1,1),VT(1,1)       !dummy arrays
c
      do m=1,5000
         if(mod(m,100).eq.0)write(*,*)' M = ',m
         do n=1,5000
            JOBZ='S'
            lda=m+15
            ldu=m+10
            ldvt=n+5
            lwork=-1
            Call dgesdd( JOBZ, M, N, A, lda, S, U, ldu, 
     +                   VT,ldvt, WoRK, LWORK, IWORK, INFO )
            MXMN = max(M,N)
            MNMN = min(M,N)
            MNTHR = INT( MNMN*11.0D0 / 6.0D0 )
            IF(MXMN.LT.MNTHR)THEN
               IFOR = 3*mnmn + max(mxMN,3*mnMN*mnMN+4*mnMN)
            ELSE
               IFOR = 4*mnMN*mnMN+7*mnMN
            ENDIF

            if(IFOR.gt.WoRK(1).OR.INFO.NE.0)
     +         write(*,10)m,n,WORK(1),IFOR,INFO
   10 format(1x,2I5,2x,F8.0,I11,2x,I2)
         end do
      end do
      end program tdgesdd
[/fxfortran]

However, since you are making a dry run with LWORK=-1, and you are allocating the work space dynamically in your C code, it should be perfectly safe for you to CALLOC that amount of memory, and completely ignore the "minimum LWORK" estimate given by the formulae.

I do not think that you need to use ad-hoc adjustments such as those that you mentioned w.r.t. the Python adaptation.