Community
cancel
Showing results for 
Search instead for 
Did you mean: 
97 Views

PSGESVD: Illegal parameter 19

Jump to solution

Hello,

I want to perform SVD on a matrix in parallel(C). But I am facing issues in psgesvd. When I try to perform svd of a matrix of size 4750*4750, I get an error saying that:

"{    0,    0}:  On entry to
PSGESVD parameter number   19 had an illegal value"

Parameter 19 is 'lwork' for which a make a query first which returns the minimum size of 'work' array required. The svd works fine for smaller matrices but big matrices give an error. Can anyone please help me with this as to what is the reason for this error. The value of lwork in this case is 45362800.

When I use serial "sgesvd" I don't get any such issue and the results are perfectly correct. But I am not sure what is causing an issue in this case.

Thanks in advance!

-Shailesh Tripathi

0 Kudos
1 Solution
97 Views

Hi Shailesh,

Thank you for reporting the problem! Fortunately, we've already fixed it. The fix will be available into next MKL releases: MKL 2018 Update 4 and MKL 2019.

As you know, when workspace amount is requested in LAPACK and ScaLAPACK (via LWORK=-1), it's returned into WORK(1). For single precision it means "int -> float" conversion. Unfortunately, for numbers more than 2^24 this conversion can lead to losing some low digits (because the integer number will be represented by a closest float number). As a result, users may get an integer number less than it was initially calculated into MKL. To avoid this problem, we now adjust integer numbers appropriately so that they can be represented without loss of precision as float, and then we assign this number to WORK(1).

As a workaround, you can adjust your number into the program as follows:

(*((MKL_INT*)&workpt))++;  // this line does the trick - it adds 1 to workpt represented as integer. In floating point representation it means you increase your number to the next representable fp number. It's enough to be on the safe side.

 lwork = (MKL_INT)workpt;

Sorry for inconvinience!

Konstantin

 

 

View solution in original post

7 Replies
Gennady_F_Intel
Moderator
97 Views

What version of mkl do you use? We have fixed the very similar issue in mkl version 2018 u2.

97 Views

Hello,

Thank you for your response. My ICC version is 13.0.1.117 and I know it is quite outdated. But is it just a bug in the compiler or is it some mistake I am doing? If the first is true, I can upgrade to the latest version.

Also, what was the issue exactly that you fixed in the 2018 u2?

Thanks for your help!

Regards,
Shailesh

 

Gennady_F_Intel
Moderator
97 Views

https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2018-bug-fixes-list ; -- MKLD-3350

You may also give us reproducer which we may check on our side with the latest versions/updates.

 

97 Views

Hello,
I am ataching my reproducer code. I made some simplifications for 1 process execution. When the matrix size is large the same error shows up.

Here is the code:

#include <iostream>
#include <mpi.h>
#include <stdlib.h>
#include "mkl_scalapack.h"
#include "mkl_blacs.h"

typedef MKL_INT MDESC[ 9 ];

void allocate(float **array, int size) {
    if (!(*array) && *array == NULL) {
        *array = (float*) malloc (size * sizeof(float));
        if (!(*array)) {
            std::cout << "FAILED TO ALLOCATE MEMORY" << std::endl;
        }
    }
}

void svd(float *A, int& m, int& n) {

    // SVD of A to get U, Vt and S
    float zero = 0.0e+0, one = 1.0e+0;
    MKL_INT i_zero = 0, i_one = 1, i_negone = -1;

    MDESC descA, descU, descVt, descpinvA;

    MKL_INT iam, nprocs, ictxt, myrow, mycol, nprow, npcol;
    MKL_INT mb, nb, mp, nq, mq, kr;
    MKL_INT info, lld, lwork = -1;
    char jobu = 'V', jobvt = 'V';
    float *work, *S = NULL, *U = NULL, *Vt = NULL;
    float workpt;
    int k;
    int i,j; 

    blacs_pinfo_( &iam, &nprocs );
    nprow = 1;
    npcol = 1;

    blacs_get_( &i_negone, &i_zero, &ictxt );
    blacs_gridinit_( &ictxt, "R", &nprow, &npcol );
    blacs_gridinfo_( &ictxt, &nprow, &npcol, &myrow, &mycol );

    mb = 1;
    nb = 1;
    k = std::min(m, n);

    mp = m;
    nq = n;
    kr = k;

    allocate(&S, k);
    allocate(&U, m * kr);
    allocate(&Vt, kr * n);

    descinit_( descA, &m, &n, &nb, &nb, &i_zero, &i_zero, &ictxt, &m, &info );
    descinit_( descU, &m, &k, &nb, &nb, &i_zero, &i_zero, &ictxt, &m, &info );
    descinit_( descVt, &k, &n, &nb, &nb, &i_zero, &i_zero, &ictxt, &k, &info );

    psgesvd(&jobu, &jobvt, &m, &n, A, &i_one, &i_one, descA, S, U, &i_one, &i_one, descU, Vt, &i_one, &i_one, descVt, &workpt, &lwork, &info);

    lwork = (MKL_INT)workpt;

    work = (float*) calloc(lwork, sizeof( float ));
	
    psgesvd(&jobu, &jobvt, &m, &n, A, &i_one, &i_one, descA, S, U, &i_one, &i_one, descU, Vt, &i_one, &i_one, descVt, work, &lwork, &info);

    free(work);
    free(S);
    free(U);
    free(Vt);

    // Destroy temporary process grid
    blacs_gridexit_( &ictxt );
}


int main(int argc, char** argv)
{
	MPI_Init(&argc, &argv);
	
	float *A=NULL;
	int i,j;
	int m= 4760,n=4760 ;
	allocate(&A,m*n);
	
	for(i=0; i < m ; i++)
	{
		for(j=0 ; j<n ; j++)
			A[j*m + i]=rand() / RAND_MAX*10;
	}

	svd(A,m,n);
	MPI_Finalize();
	return 0;
}

This code breaks even with a single process.
Thanks in advance!

-Shailesh

Gennady_F_Intel
Moderator
97 Views

yes, we sees the same behavior with the latest 2018 u2. We will investigate the cause of the problem and keep you in the loop with results.

98 Views

Hi Shailesh,

Thank you for reporting the problem! Fortunately, we've already fixed it. The fix will be available into next MKL releases: MKL 2018 Update 4 and MKL 2019.

As you know, when workspace amount is requested in LAPACK and ScaLAPACK (via LWORK=-1), it's returned into WORK(1). For single precision it means "int -> float" conversion. Unfortunately, for numbers more than 2^24 this conversion can lead to losing some low digits (because the integer number will be represented by a closest float number). As a result, users may get an integer number less than it was initially calculated into MKL. To avoid this problem, we now adjust integer numbers appropriately so that they can be represented without loss of precision as float, and then we assign this number to WORK(1).

As a workaround, you can adjust your number into the program as follows:

(*((MKL_INT*)&workpt))++;  // this line does the trick - it adds 1 to workpt represented as integer. In floating point representation it means you increase your number to the next representable fp number. It's enough to be on the safe side.

 lwork = (MKL_INT)workpt;

Sorry for inconvinience!

Konstantin

 

 

View solution in original post

97 Views

It works perfectly fine!
Thank you for your help!

-Shailesh

Reply