Are LAPACKE_cgesdd and LAPACKE_cgesvd SVD calculations reliable?

Angelo_L_ · ‎05-20-2017

I'm using LAPACKE_cgesdd and LAPACKE_cgesvd to compute the singular values of a matrix. Both the routines have the option to compute the singular values only. The problem I have is that, in the following four test cases:

Full SVD with LAPACKE_cgesdd;
Full SVD with LAPACKE_cgesvd;
Singular values only with LAPACKE_cgesdd;
Singular values only with LAPACKE_cgesvd.

I receive different singular values. In particular:

Test, 3 x 4 matrix

    a[0].real(5.91); a[0].imag(-5.96);
    a[1].real(7.09); a[1].imag(2.72);
    a[2].real(7.78); a[2].imag(-4.06);
    a[3].real(-0.79); a[3].imag(-7.21);
    a[4].real(-3.15); a[4].imag(-4.08);
    a[5].real(-1.89); a[5].imag(3.27);
    a[6].real(4.57); a[6].imag(-2.07);
    a[7].real(-3.88); a[7].imag(-3.30);
    a[8].real(-4.89); a[8].imag(4.20);
    a[9].real(4.10); a[9].imag(-6.70);
    a[10].real(3.28); a[10].imag(-3.84);
    a[11].real(3.84); a[11].imag(1.19);

Full SVD with LAPACKE_cgesdd

   17.8592720031738
   11.4463796615601
   6.74482488632202

Full SVD with LAPACKE_cgesvd

   17.8651084899902
   11.3695945739746
   6.83876800537109

Singular values only with LAPACKE_cgesdd

   17.8592758178711
   11.4463806152344
   6.74482440948486

Singular values only with LAPACKE_cgesvd

   17.8705902099609
   11.5145053863525
   6.82878828048706

As it can be seen, even for the same routine, the results can change since the third significant digit when switching from full SVD to singular values only.

My questions are:

Is this reasonable?

Am I doing something wrong?

Thank you in advance for any help.

This is the code I'm using:

    #include <stdlib.h>
    #include <stdio.h>
    #include <algorithm>    // std::min
    #include <time.h>
    #include <complex>
    #include <mkl.h>
    #include "mkl_lapacke.h"
    
    #include "TimingCPU.h"
    #include "InputOutput.h"
    
    //#define FULL_SVD
    #define PRINT_MATRIX
    #define PRINT_SINGULAR_VALUES
    //#define PRINT_LEFT_SINGULAR_VECTORS
    //#define PRINT_RIGHT_SINGULAR_VECTORS
    #define SAVE_MATRIX
    #define SAVE_SINGULAR_VALUES
    //#define SAVE_LEFT_SINGULAR_VECTORS
    //#define SAVE_RIGHT_SINGULAR_VECTORS
    
    #define GESDD
    //#define GESVD
    
    /*************************************************************/
    /* PRINT A SINGLE PRECISION COMPLEX MATRIX STORED COLUMNWISE */
    /*************************************************************/
    void print_matrix_col(char const *desc, MKL_INT Ncols, MKL_INT Nrows, std::complex<float>* a, MKL_INT LDA) {
        printf( "\n %s\n[", desc);
        for(int i = 0; i < Ncols; i++) {
            for(int j = 0; j < Nrows; j++)
                printf("(%6.2f,%6.2f)", a[i * LDA + j].real(), a[i * LDA + j].imag());
            printf( "\n" );
        }
    }
    
    /**********************************************************/
    /* PRINT A SINGLE PRECISION COMPLEX MATRIX STORED ROWWISE */
    /**********************************************************/
    void print_matrix_row(char const *desc, int Nrows, int Ncols, std::complex<float>* a, int LDA) {
        printf( "\n %s\n", desc);
        for (int i = 0; i < Ncols; i++) {
            for (int j = 0; j < Ncols; j++)
                printf("%2.10f + 1i * %2.10f ", a[i * LDA + j].real(), a[i * LDA + j].imag());
            printf( ";\n" );
        }
    }
    
    /****************************************/
    /* PRINT A SINGLE PRECISION REAL MATRIX */
    /****************************************/
    void print_rmatrix(char const *desc, MKL_INT m, MKL_INT n, float* a, MKL_INT lda ) {
        MKL_INT i, j;
        printf( "\n %s\n", desc );
        for( i = 0; i < m; i++ ) {
            for( j = 0; j < n; j++ ) printf( " %6.2f", a[i*lda+j] );
            printf( "\n" );
        }
    }
    
    /********/
    /* MAIN */
    /********/
    int main() {
    
        const int Nrows = 3;            // --- Number of rows
        const int Ncols = 4;            // --- Number of columns
        const int LDA   = Ncols;
        const int LDU   = Nrows;
        const int LDVT  = Ncols;    
    
        const int numRuns   = 20;        // --- Number of runs for timing
    
        TimingCPU timerCPU;    
    
        // --- Allocating space and initializing the input matrix
        std::complex<float> *a = (std::complex<float> *)malloc(Nrows * Ncols * sizeof(std::complex<float>));
        srand(time(NULL));
    //    for (int k = 0; k < Nrows * Ncols; k++) {
    //        a.real((float)rand() / (float)(RAND_MAX));    
    //        a.imag((float)rand() / (float)(RAND_MAX));    
    //    }
        a[0].real(5.91); a[0].imag(-5.96);
        a[1].real(7.09); a[1].imag(2.72);
        a[2].real(7.78); a[2].imag(-4.06);
        a[3].real(-0.79); a[3].imag(-7.21);
        a[4].real(-3.15); a[4].imag(-4.08);
        a[5].real(-1.89); a[5].imag(3.27);
        a[6].real(4.57); a[6].imag(-2.07);
        a[7].real(-3.88); a[7].imag(-3.30);
        a[8].real(-4.89); a[8].imag(4.20);
        a[9].real(4.10); a[9].imag(-6.70);
        a[10].real(3.28); a[10].imag(-3.84);
        a[11].real(3.84); a[11].imag(1.19);
    
        // --- Allocating space for the singular vector matrices
        #ifdef FULL_SVD    
        std::complex<float> *u  = (std::complex<float> *)malloc(Nrows * LDU  * sizeof(std::complex<float>));
        std::complex<float> *vt = (std::complex<float> *)malloc(Ncols * LDVT * sizeof(std::complex<float>));
        #endif
    
        // --- Allocating space for the singular values 
        float *s = (float *)malloc(std::min(Nrows, Ncols) * sizeof(float));
        
        #ifdef GESVD
        float *superb = (float *)malloc((std::min(Nrows, Ncols) - 1) * sizeof(float));
        #endif
    
        // --- Print and/or save input matrix
        #ifdef PRINT_MATRIX    
        print_matrix_row("Matrix (stored rowwise)", Ncols, Nrows, a, LDA);
        #endif
        #ifdef SAVE_MATRIX
        saveCPUcomplextxt(a, "/home/angelo/Project/SVD/MKL/a.txt", Ncols * Nrows);
        #endif
    
        // --- Compute singular values
        MKL_INT info;
        float   timing = 0.f;    
        for (int k = 0; k < numRuns; k++) {
            timerCPU.StartCounter();
            // --- The content of the input matrix a is destroyed on output
            #if defined(FULL_SVD) && defined(GESDD)
            printf("Running Full SVD - GESDD\n");
            MKL_INT info = LAPACKE_cgesdd(LAPACK_ROW_MAJOR, 'A', Nrows, Ncols, (MKL_Complex8 *)a, LDA, s, (MKL_Complex8 *)u, LDU, (MKL_Complex8 *)vt, LDVT);
            #endif        
            #if !defined(FULL_SVD) && defined(GESDD)        
            printf("Running singular values only - GESDD\n");
            MKL_INT info = LAPACKE_cgesdd(LAPACK_ROW_MAJOR, 'N', Nrows, Ncols, (MKL_Complex8 *)a, LDA, s, NULL, LDU, NULL, LDVT);
            #endif
            #if defined(FULL_SVD) && defined(GESVD)        
            printf("Running Full SVD - GESVD\n");
            MKL_INT info = LAPACKE_cgesvd(LAPACK_ROW_MAJOR, 'A', 'A', Nrows, Ncols, (MKL_Complex8 *)a, LDA, s,
             (MKL_Complex8 *)u, LDU, (MKL_Complex8 *)vt, LDVT, superb);
            #endif
            #if !defined(FULL_SVD) && defined(GESVD)
            printf("Running singular values only - GESVD\n");
            MKL_INT info = LAPACKE_cgesvd(LAPACK_ROW_MAJOR, 'N', 'N', Nrows, Ncols, (MKL_Complex8 *)a, LDA, s,
             NULL, LDU, NULL, LDVT, superb);
            #endif
            if(info > 0) { // --- Check for convergence
                printf( "The algorithm computing SVD failed to converge.\n" );
                exit(1);
            }
            timing = timing + timerCPU.GetCounter();
        }
        printf("Timing = %f\n", timing / numRuns);
        
        // --- Print and/or save singular values
        #ifdef PRINT_SINGULAR_VALUES
        print_rmatrix("Singular values", 1, Ncols, s, 1);
        #endif
        #ifdef SAVE_SINGULAR_VALUES    
        saveCPUrealtxt(s, "/home/angelo/Project/SVD/MKL/s.txt", std::min(Nrows, Ncols));
        #endif
    
        // --- Print left singular vectors
        #ifdef PRINT_LEFT_SINGULAR_VECTORS    
        print_matrix_col("Left singular vectors (stored columnwise)", Ncols, Ncols, u, LDU);
        #endif
        #if defined(FULL_SVD) && defined(SAVE_LEFT_SINGULAR_VECTORS) 
        saveCPUcomplextxt(u, "/home/angelo/Project/SVD/MKL/u.txt", Nrows * LDU);
        #endif
    
        // --- Print right singular vectors
        #ifdef PRINT_RIGHT_SINGULAR_VECTORS    
        print_matrix_col("Right singular vectors (stored rowwise)", Ncols, Nrows, vt, LDVT);
        #endif
        #if defined(FULL_SVD) && defined(SAVE_RIGHT_SINGULAR_VECTORS) 
        saveCPUcomplextxt(vt, "/home/angelo/Project/SVD/MKL/vt.txt", Ncols * LDVT);
        #endif
    
        exit(0);
    }

compiled as

    g++ -DMKL_ILP64 -fopenmp -m64 -I$MKLROOT/include svdMKLComplexSingle.cpp TimingCPU.cpp InputOutput.cpp -L$MKLROOT/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm -fno-exceptions

Zhen_Z_Intel · ‎05-21-2017

Hi Angelo,

These two functions are using different algorithm to calculate, that ?GESVD uses QR iteration but ?GESDD adopts divide and conquer method. I believe there would be slightly different between different methodologies. For ?GESDD with jobz 'N', the kernel will do QR factorization first and then, Bidigonalize R in A, and then perform bigiagonal SVD to compute singular value directly by sbdsdc with job 'N'. But ?GESDD with jobz 'A', will generate Q, and Vt, and perform bidiagonal SVD, computing left singular vectors of bidiagonal matrix in rwork and computing right singular vectors of bidiagonal matrix in rwork. If the result of calling sbdsdc by different job is different, it is reasonable the results of ?GESDD with different jobs are different.

Best regards,
Fiona

Angelo_L_ · ‎05-22-2017

Thanks Fiona for your answer.

Let me just notice that they are not small differences, even for the same routine when computing the singular vectors or not. The differences are on the second/third significant digit. You say this is possible/reasonable, Am I understanding correctly?

Thank you again.

Konstantin_A_Intel · ‎05-23-2017

Hi Angelo,

I believe this is an acceptable result taking into account you're using single precision routines. If you like to get more precise results, please use double complex precision.

Regards,

Konstantin

Eugene_C_Intel1 · ‎05-23-2017

Hi Angelo,

Actually such kind of difference is not acceptable even for single precision because elements of the input matrix are not near underflow/overflow and its singular values are not clustered.

Please refer to documentation of ?GESVD and ?GESDD:

It says that contents of A is destroyed when JOB* is not equal to 'O'. But you iterate in loop through numRuns iterations calling SVD not for initial matrix but for some contents that was used by SVD in previous iterations. So finally you see singular values of *some matrix* with accumulated error through all these iterations. So you need to keep initial matrix and call SVD for its copy to avoid referencing destroyed contents.

For example for CGESVD, when I set numRuns to 1 I can see the following singular values:

Running singular values only - GESVD
Timing = 0.000000

 Singular values
 17.7436085 11.5775537  6.7621651  0.0000000
Running Full SVD - GESVD
Timing = 0.000000

 Singular values
 17.7436085 11.5775509  6.7621646  0.0000000

The difference is about 3*10^-6 what looks more plausible for single precision.

Also I noticed that you allocate min(Nrows, Ncols) - 1 elements for singular values to call CGESVD. But this array should have at least min(Nrows, Ncols) elements.

Eugene

Konstantin_A_Intel · ‎05-24-2017

Indeed, Eugene is right. I did not notice that the program is not absolutely correct.

Konstantin

Angelo_L_ · ‎05-25-2017

Thanks to Eugene and Konstantin for their answers.

I have fixed my code according to the first Eugene's comment. Now it is

//g++ -DMKL_ILP64 -fopenmp -m64 -I$MKLROOT/include svdMKLComplexSingle.cpp TimingCPU.cpp InputOutput.cpp -L$MKLROOT/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm -fno-exceptions

// https://software.intel.com/en-us/node/469238
// https://software.intel.com/en-us/node/521150

// MATLAB calls DGESVD

#include <stdlib.h>
#include <stdio.h>
#include <algorithm>    // std::min
#include <time.h>
#include <complex>
#include <string.h>
#include <mkl.h>
#include "mkl_lapacke.h"

#include "TimingCPU.h"
#include "InputOutput.h"

#define FULL_SVD
//#define PRINT_MATRIX
#define PRINT_SINGULAR_VALUES
//#define PRINT_LEFT_SINGULAR_VECTORS
//#define PRINT_RIGHT_SINGULAR_VECTORS
//#define SAVE_MATRIX
//#define SAVE_SINGULAR_VALUES
//#define SAVE_LEFT_SINGULAR_VECTORS
//#define SAVE_RIGHT_SINGULAR_VECTORS

#define GESDD
//#define GESVD

/*************************************************************/
/* PRINT A SINGLE PRECISION COMPLEX MATRIX STORED COLUMNWISE */
/*************************************************************/
void print_matrix_col(char const *desc, MKL_INT Ncols, MKL_INT Nrows, std::complex<float>* a, MKL_INT LDA) {
    printf( "\n %s\n[", desc);
    for(int i = 0; i < Ncols; i++) {
        for(int j = 0; j < Nrows; j++)
            printf("(%6.2f,%6.2f)", a[i * LDA + j].real(), a[i * LDA + j].imag());
        printf( "\n" );
    }
}

/**********************************************************/
/* PRINT A SINGLE PRECISION COMPLEX MATRIX STORED ROWWISE */
/**********************************************************/
void print_matrix_row(char const *desc, int Nrows, int Ncols, std::complex<float>* a, int LDA) {
    printf( "\n %s\n", desc);
    for (int i = 0; i < Ncols; i++) {
        for (int j = 0; j < Ncols; j++)
            printf("%2.10f + 1i * %2.10f ", a[i * LDA + j].real(), a[i * LDA + j].imag());
        printf( ";\n" );
    }
}

/****************************************/
/* PRINT A SINGLE PRECISION REAL MATRIX */
/****************************************/
void print_rmatrix(char const *desc, MKL_INT m, MKL_INT n, float* a, MKL_INT lda ) {
    MKL_INT i, j;
    printf( "\n %s\n", desc );
    for( i = 0; i < m; i++ ) {
        for( j = 0; j < n; j++ ) printf( " %6.12f", a[i*lda+j] );
        printf( "\n" );
    }
}

/********/
/* MAIN */
/********/
int main() {

    const int Nrows = 3;            // --- Number of rows
    const int Ncols = 4;            // --- Number of columns
    const int LDA   = Ncols;
    const int LDU   = Nrows;
    const int LDVT  = Ncols;    

    const int numRuns   = 20;        // --- Number of runs for timing

    TimingCPU timerCPU;    

    // --- Allocating space and initializing the input matrix
    std::complex<float> *a = (std::complex<float> *)malloc(Nrows * Ncols * sizeof(std::complex<float>));
    std::complex<float> *a_temp = (std::complex<float> *)malloc(Nrows * Ncols * sizeof(std::complex<float>));
//    srand(time(NULL));
//    for (int k = 0; k < Nrows * Ncols; k++) {
//        a.real((float)rand() / (float)(RAND_MAX));    
//        a.imag((float)rand() / (float)(RAND_MAX));    
//    }
    a[0].real(5.91); a[0].imag(-5.96);
    a[1].real(7.09); a[1].imag(2.72);
    a[2].real(7.78); a[2].imag(-4.06);
    a[3].real(-0.79); a[3].imag(-7.21);
    a[4].real(-3.15); a[4].imag(-4.08);
    a[5].real(-1.89); a[5].imag(3.27);
    a[6].real(4.57); a[6].imag(-2.07);
    a[7].real(-3.88); a[7].imag(-3.30);
    a[8].real(-4.89); a[8].imag(4.20);
    a[9].real(4.10); a[9].imag(-6.70);
    a[10].real(3.28); a[10].imag(-3.84);
    a[11].real(3.84); a[11].imag(1.19);

    // --- Allocating space for the singular vector matrices
    #ifdef FULL_SVD    
    std::complex<float> *u  = (std::complex<float> *)malloc(Nrows * LDU  * sizeof(std::complex<float>));
    std::complex<float> *vt = (std::complex<float> *)malloc(Ncols * LDVT * sizeof(std::complex<float>));
    #endif

    // --- Allocating space for the singular values 
    float *s = (float *)malloc(std::min(Nrows, Ncols) * sizeof(float));
    
    #ifdef GESVD
    float *superb = (float *)malloc((std::min(Nrows, Ncols) - 1) * sizeof(float));
    #endif

    // --- Print and/or save input matrix
    #ifdef PRINT_MATRIX    
    print_matrix_row("Matrix (stored rowwise)", Ncols, Nrows, a, LDA);
    #endif
    #ifdef SAVE_MATRIX
    saveCPUcomplextxt(a, "/home/angelo/Project/SVD/MKL/a.txt", Ncols * Nrows);
    #endif

    // --- Compute singular values
    MKL_INT info;
    float   timing = 0.f;    
    for (int k = 0; k < numRuns; k++) {
        memcpy(a_temp, a, Nrows * Ncols * sizeof(std::complex<float>));
        timerCPU.StartCounter();
        // --- The content of the input matrix a is destroyed on output
        #if defined(FULL_SVD) && defined(GESDD)
        printf("Running Full SVD - GESDD\n");
        MKL_INT info = LAPACKE_cgesdd(LAPACK_ROW_MAJOR, 'A', Nrows, Ncols, (MKL_Complex8 *)a_temp, LDA, s, (MKL_Complex8 *)u, LDU, (MKL_Complex8 *)vt, LDVT);
        #endif        
        #if !defined(FULL_SVD) && defined(GESDD)        
        printf("Running singular values only - GESDD\n");
        MKL_INT info = LAPACKE_cgesdd(LAPACK_ROW_MAJOR, 'N', Nrows, Ncols, (MKL_Complex8 *)a_temp, LDA, s, NULL, LDU, NULL, LDVT);
        #endif
        #if defined(FULL_SVD) && defined(GESVD)        
        printf("Running Full SVD - GESVD\n");
        MKL_INT info = LAPACKE_cgesvd(LAPACK_ROW_MAJOR, 'A', 'A', Nrows, Ncols, (MKL_Complex8 *)a_temp, LDA, s,
         (MKL_Complex8 *)u, LDU, (MKL_Complex8 *)vt, LDVT, superb);
        #endif
        #if !defined(FULL_SVD) && defined(GESVD)
        printf("Running singular values only - GESVD\n");
        MKL_INT info = LAPACKE_cgesvd(LAPACK_ROW_MAJOR, 'N', 'N', Nrows, Ncols, (MKL_Complex8 *)a_temp, LDA, s,
         NULL, LDU, NULL, LDVT, superb);
        #endif
        if(info > 0) { // --- Check for convergence
            printf( "The algorithm computing SVD failed to converge.\n" );
            exit(1);
        }
        timing = timing + timerCPU.GetCounter();
    }
    printf("Timing = %f\n", timing / numRuns);
    
    // --- Print and/or save singular values
    #ifdef PRINT_SINGULAR_VALUES
    print_rmatrix("Singular values", 1, std::min(Nrows,Ncols), s, 1);
    #endif
    #ifdef SAVE_SINGULAR_VALUES    
    saveCPUrealtxt(s, "/home/angelo/Project/SVD/MKL/s.txt", std::min(Nrows, Ncols));
    #endif

    // --- Print left singular vectors
    #ifdef PRINT_LEFT_SINGULAR_VECTORS    
    print_matrix_col("Left singular vectors (stored columnwise)", Ncols, Ncols, u, LDU);
    #endif
    #if defined(FULL_SVD) && defined(SAVE_LEFT_SINGULAR_VECTORS) 
    saveCPUcomplextxt(u, "/home/angelo/Project/SVD/MKL/u.txt", Nrows * LDU);
    #endif

    // --- Print right singular vectors
    #ifdef PRINT_RIGHT_SINGULAR_VECTORS    
    print_matrix_col("Right singular vectors (stored rowwise)", Ncols, Nrows, vt, LDVT);
    #endif
    #if defined(FULL_SVD) && defined(SAVE_RIGHT_SINGULAR_VECTORS) 
    saveCPUcomplextxt(vt, "/home/angelo/Project/SVD/MKL/vt.txt", Ncols * LDVT);
    #endif

    exit(0);
}

The result is now stable up to 6 significant digits, namely,

CGESVD Singular values only
17.743608474731 11.577552795410 6.762166023254

CGESVD Full SVD
17.743610382080 11.577552795410 6.762166976929

CGESDD Singular values only
17.743608474731 11.577551841736 6.762166500092

CGESDD Full SVD
17.743612289429 11.577552795410 6.762166023254

However, Matlab provides a different result. If I use

clear all
close all
clc

a = [ 5.91 - 1i * 5.96;
	  7.09 + 1i * 2.72;
	  7.78 - 1i * 4.06;
	 -0.79 - 1i * 7.21;
	 -3.15 - 1i * 4.08;
	 -1.89 + 1i * 3.27;
	  4.57 - 1i * 2.07;
	 -3.88 - 1i * 3.30;
	 -4.89 + 1i * 4.20;
	  4.10 - 1i * 6.70;
	  3.28 - 1i * 3.84;
	  3.84 + 1i * 1.19];
  
  a = reshape(a, 3, 4);
  
  S = svd(a);

then the singular values are

17.305197592928174
13.617559069765090
  3.113602616129962

Is there any other problem with my code?

Also, Eugene, I do not understand your statement:

Also I noticed that you allocate min(Nrows, Ncols) - 1 elements for singular values to call CGESVD. But this array should have at least min(Nrows, Ncols) elements.

Do you refer to

	float *s = (float *)malloc(std::min(Nrows, Ncols) * sizeof(float));

or to

	#ifdef GESVD
	float *superb = (float *)malloc((std::min(Nrows, Ncols) - 1) * sizeof(float));
	#endif

The documentation says that superb should have min(Nrows, Ncols) - 1, am I right?

Thank you again for any help.

Zhen_Z_Intel · ‎06-05-2017

Hi,

Matlab is column major for reshape function, but in MKL program you set to row-major. Please change to row major matrix to test in Matlab:

b=double([5.91;7.09;7.78;-0.79;-3.15;-1.89;4.57;-3.88;-4.89;4.10;3.28;3.84])
c=double([-5.96;2.72;-4.06;-7.21;-4.08;3.27;-2.07;-3.30;4.20;-6.70;-3.84;1.19])
a=complex(b,c)
a=reshape(a,4,3)
a=reshape(a',3,4)
S=svd(a)

And the result would be sam as MKL:

S =

17.7436
11.5776
6.762

Ying_H_Intel · ‎06-05-2017

Add another comment:

Right, the S should be at least min(Nrows, Ncols) elements. and the superb min(Nrows, Ncols) - 1 should be OK. in documentation, it mentioned

If ?bdsqr does not converge (indicated by the return value info > 0), on exit superb(0:min(m,n)-2) contains the unconverged superdiagonal elements of an upper bidiagonal matrix B whose diagonal is in s (not necessarily sorted).

Best Regards,

Ying