Intel® oneAPI Math Kernel Library

Static vs. dynamic vector input

vincent_ferri
Beginner

Hi,

I am having an issue with dynamic input arguments to the following functions:

LAPACKE_dgelss();

dgetrf();dgetri();

ippmInvert_m_64f();

When the input vector is static, A[10 * 10] = {...};, the output is correct, namely the inverse of A. If the same data values are read into a dynamic vector, the output is incorrect. Why am I getting this anomaly?

Thanks

Vince

Zhang_Z_Intel
Employee

With the very limited information you provided, it's hard to reproduce the problem or guess an explanation. Would you provide a small reproducer? Or, at least show a code snippet calling these functions using static and dynamic arrays? Thanks.

vincent_ferri
Beginner

Hi,

I will concentrate on one of the functions that I listed, the one used for LU factorization:

lapack_int info;
MKL_INT* ipiv;
double dVandSize = 10;

double* c = ( double * ) malloc ( dVandSize * dVandSize * sizeof ( double ) );

// If the input vector c[10 * 10] is a static array with initialized values, the function works; if c is dynamic and contains the same values, it doesn't work. I included a file called c_vector that contains c.

ipiv = ( MKL_INT * ) malloc ( dVandSize * sizeof ( MKL_INT ) );
dgetrf(&dVandSize, &dVandSize, c, &dVandSize, ipiv, &info);  // Computes the LU factorization

double* workspace = new double [dVandSize * sizeof(double)];

dgetri(&dVandSize, c, &dVandSize, ipiv, workspace, &dVandSize, &info);

Thanks

Zhang_Z_Intel
Employee

Well, I was not able to reproduce the problem. Both static and dynamic arrays worked fine and gave identical results. See my test code attached.

But a careful look at your code snippet revealed this problem:

[cpp]double dVandSize = 10;[/cpp]

Why was this variable declared as double when it should be an integer? Didn't you get compiler warnings?
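To make the concern about the `double` dimension concrete, here is a minimal sketch of what a routine expecting a pointer to a 32-bit integer could see when it is handed the address of a `double` holding 10. It assumes the call compiles at all (a C compiler typically only warns about the incompatible pointer type), that `MKL_INT` is 32 bits wide, and that doubles use the IEEE-754 64-bit format. The helper names are hypothetical and not part of MKL.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the first four bytes of a double into a 32-bit
// integer, mimicking what a routine that expects `int *` would read if it
// were given the address of a double instead.
std::int32_t first_bytes_as_int32(double value) {
    static_assert(sizeof(double) >= sizeof(std::int32_t), "double too small");
    std::int32_t seen = 0;
    std::memcpy(&seen, &value, sizeof seen);  // well-defined byte copy
    return seen;
}

// For comparison: the same read when the dimension really is an integer.
std::int32_t dimension_as_int32(std::int32_t n) {
    std::int32_t seen = 0;
    std::memcpy(&seen, &n, sizeof seen);
    return seen;
}
```

The value 10.0 has the bit pattern 0x4024000000000000, so on a little-endian machine the first helper returns 0 and on a big-endian one it returns 0x40240000. Neither is 10, which is why the dimension should be declared as `MKL_INT`.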

SergeyKostrov
Valued Contributor II

>>... I was not able to reproduce the problem...

There are differences in the initializations; take a look:

[ This is how Vincent initializes ]
...
double *c = ( double * )malloc( dVandSize * dVandSize * sizeof ( double ) );
...
ipiv = ( MKL_INT * )malloc( dVandSize * sizeof ( MKL_INT ) );
...
double *workspace = new double [ dVandSize * sizeof( double ) ]; // Note: C++ operator new is used
...

[ This is how Zhang initializes ]
...
double *c = ( double * )malloc( dVandSize * dVandSize * sizeof( double ) );
...
ipiv = ( MKL_INT * )malloc( dVandSize * sizeof( MKL_INT ) );
...
double *workspace = ( double * )malloc( dVandSize * sizeof( double ) ); // Note: CRT-function malloc is used
...

Vincent, my question is: why do you need sizeof( double ) in new double [ dVandSize * sizeof( double ) ]?

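The question about `sizeof( double )` comes down to units: `malloc` takes a size in bytes, while `new double[...]` takes a count of elements. The sketch below only does the arithmetic, with hypothetical helper names, and suggests `std::vector` as one idiomatic alternative. It does not claim that the oversized workspace caused the wrong results; an oversized workspace is wasteful but would still hold the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes requested by malloc( n * sizeof( double ) ): room for exactly n doubles.
std::size_t bytes_for_malloc_style(std::size_t n) {
    return n * sizeof(double);
}

// Bytes of element storage requested by new double[ n * sizeof( double ) ]:
// the bracket already counts elements, so this asks for
// n * sizeof(double) doubles rather than n doubles.
std::size_t bytes_for_new_with_sizeof(std::size_t n) {
    return (n * sizeof(double)) * sizeof(double);
}

// One idiomatic alternative (an assumption, not code from the thread):
// a std::vector sized in elements, released automatically.
std::vector<double> make_workspace(std::size_t n) {
    return std::vector<double>(n, 0.0);
}
```

With `n = 10` the `new` form requests `sizeof(double)` times more element storage than the `malloc` form. A vector's `data()` pointer could be passed wherever a `double *` workspace is expected.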
SergeyKostrov
Valued Contributor II

Results are absolutely identical and please take a look:

[ Output when CRT-function 'malloc' is used ]

Intel(R) Math Kernel Library Version 10.3.12 Product Build 20120831 for 32-bit applications
Major version : 10
Minor version : 3
Update version : 12
Product status : Product
Build : 20120831

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

[ Output when C++ operator 'new' is used ]

Intel(R) Math Kernel Library Version 10.3.12 Product Build 20120831 for 32-bit applications
Major version : 10
Minor version : 3
Update version : 12
Product status : Product
Build : 20120831

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

SergeyKostrov
Valued Contributor II

// Sub-Test 1 - Gets MKL version
{
///*
    MKLVersion Ver = { 0x0 };
    int iLenData = 256;
    char szVerData[256] = { 0x0 };

    MKL_Get_Version_String( szVerData, iLenData );
    CrtPrintfA( "\n%s\n", szVerData );

    MKL_Get_Version( &Ver );
    printf( "Major version : %d\n", Ver.MajorVersion );
    printf( "Minor version : %d\n", Ver.MinorVersion );
    printf( "Update version : %d\n", Ver.UpdateVersion );
    printf( "Product status : %s\n", Ver.ProductStatus );
    printf( "Build : %s\n", Ver.Build );
    printf( "\n" );
//*/
}

// Sub-Test 2 - Test for dgetrf and dgetri functions
{
///*
    // 10 x 10 symmetric test matrix
    double data[] =
    {
        +4.00e+000, +1.50e+001, +4.00e+001, +8.50e+001, +1.56e+002, +2.59e+002, +4.00e+002, +5.85e+002, +8.20e+002, +1.11e+003,
        +1.50e+001, +8.50e+001, +2.59e+002, +5.85e+002, +1.11e+003, +1.89e+003, +2.96e+003, +4.37e+003, +6.18e+003, +8.42e+003,
        +4.00e+001, +2.59e+002, +8.20e+002, +1.89e+003, +3.62e+003, +6.18e+003, +9.72e+003, +1.44e+004, +2.04e+004, +2.79e+004,
        +8.50e+001, +5.85e+002, +1.89e+003, +4.37e+003, +8.42e+003, +1.44e+004, +2.28e+004, +3.38e+004, +4.80e+004, +6.56e+004,
        +1.56e+002, +1.11e+003, +3.62e+003, +8.42e+003, +1.63e+004, +2.79e+004, +4.41e+004, +6.56e+004, +9.32e+004, +1.28e+005,
        +2.59e+002, +1.89e+003, +6.18e+003, +1.44e+004, +2.79e+004, +4.80e+004, +7.59e+004, +1.13e+005, +1.60e+005, +2.20e+005,
        +4.00e+002, +2.96e+003, +9.72e+003, +2.28e+004, +4.41e+004, +7.59e+004, +1.20e+005, +1.79e+005, +2.54e+005, +3.48e+005,
        +5.85e+002, +4.37e+003, +1.44e+004, +3.38e+004, +6.56e+004, +1.13e+005, +1.79e+005, +2.66e+005, +3.79e+005, +5.18e+005,
        +8.20e+002, +6.18e+003, +2.04e+004, +4.80e+004, +9.32e+004, +1.60e+005, +2.54e+005, +3.79e+005, +5.38e+005, +7.37e+005,
        +1.11e+003, +8.42e+003, +2.79e+004, +6.56e+004, +1.28e+005, +2.20e+005, +3.48e+005, +5.18e+005, +7.37e+005, +1.01e+006
    };

    lapack_int info = 0;
    MKL_INT *ipiv = NULL;
    MKL_INT dVandSize = 10;
    MKL_INT i;

//  double *c = data;
//  double *c = ( double * )malloc( dVandSize * dVandSize * sizeof( double ) );
    double *c = ( double * )new double[ dVandSize * dVandSize ];

    // Copy the static data into the dynamically allocated array
    for( i = 0; i < dVandSize * dVandSize; i++ )
    {
        c[i] = data[i];
    }

//  ipiv = ( MKL_INT * )malloc( dVandSize * sizeof( MKL_INT ) );
    ipiv = ( MKL_INT * )new MKL_INT[ dVandSize ];

    // LU factorization
    dgetrf( &dVandSize, &dVandSize, c, &dVandSize, ipiv, &info );
    if( info != 0 )
    {
        printf( "DGETRF INFO: %d\n", info );
        exit( 1 );
    }

//  double *workspace = ( double * )malloc( dVandSize * sizeof( double ) );
    double *workspace = ( double * )new double[ dVandSize ];

    // Inverse computed from the LU factors
    dgetri( &dVandSize, c, &dVandSize, ipiv, workspace, &dVandSize, &info );
    if( info != 0 )
    {
        printf( "DGETRI INFO: %d\n", info );
        exit( 1 );
    }

    for( i = 0; i < dVandSize * dVandSize; i++ )
    {
        printf( "% lf ", c[i] );
        if( ( (i+1) % 10 ) == 0 )
            printf( "\n" );
    }

//  if( workspace != NULL )
//      free( workspace );
//  if( ipiv != NULL )
//      free( ipiv );
//  if( c != NULL )
//      free( c );

    // Arrays allocated with new[] are released with delete[]
    if( workspace != NULL )
        delete[] workspace;
    if( ipiv != NULL )
        delete[] ipiv;
    if( c != NULL )
        delete[] c;

    printf( "\n" );
//*/
}

vincent_ferri
Beginner

But all you did was take a static vector and copy it into a dynamic vector; that works for me too. How about using

cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, dVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC);

as indicated in my last post, and then passing the resulting c vector to dgetrf() and dgetri()?

Regards,

Vince

Zhang_Z_Intel
Employee

Vince,

What do you mean by "c vector"? How is it different from a static vector and a dynamic vector? And what does cblas_dgemm have to do with this? Instead of having all of us guess what you want, it would be much easier if you posted your whole test code here.

By the way, have you had a chance to look at the issue pointed out by other replies in this thread? Why is 'dVandSize' a double-precision floating-point variable? According to the DGETRF and DGETRI signatures, this argument should be an integer. Have you tried making it an integer? Does this solve the problem?

vincent_ferri
Beginner

Hi,

The c vector is the dynamic vector that you created. cblas_dgemm() uses the a vector and the b vector to produce the c vector, and that is what you use for the LU factorization. I have given the 'a' and 'b' vectors in the file c_vector.txt.

Regards,

Vince

Zhang_Z_Intel
Employee

I believe you have certainly taken care of this already, and it's probably not related to your original question. But just in case: the matrix order in cblas_dgemm can be either row major or column major, whereas dgetrf and dgetri assume column-major matrix order because they are FORTRAN routines.

I'll take another look at it and let you know.
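As an illustration of the storage-order remark, the sketch below shows how the same flat buffer yields different elements depending on whether it is read row by row or column by column. The two accessor functions are hypothetical helpers written for this example and are not part of MKL or CBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element (i, j) of a matrix with `cols` columns, stored row by row
// (the CblasRowMajor convention).
double at_row_major(const std::vector<double>& buf, std::size_t cols,
                    std::size_t i, std::size_t j) {
    assert(i * cols + j < buf.size());
    return buf[i * cols + j];
}

// Element (i, j) of a matrix with `rows` rows, stored column by column
// (the convention the FORTRAN LAPACK routines assume).
double at_col_major(const std::vector<double>& buf, std::size_t rows,
                    std::size_t i, std::size_t j) {
    assert(j * rows + i < buf.size());
    return buf[j * rows + i];
}
```

A buffer holding a 2 x 3 matrix in row-major order is, read in column-major order, the 3 x 2 transpose. For a square matrix passed only through an inversion this tends to be harmless, because the inverse of the transpose equals the transpose of the inverse, and the test matrix in this thread is symmetric in any case. It does matter as soon as results from the two conventions are compared element by element.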

vincent_ferri
Beginner

int iVandSize = 10;

double alpha = 1.0;
double beta = 1.0;
int ldA = iVandSize;
int ldB = iVandSize;
int ldC = iVandSize;
int iStrideB = 4;

//cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize, ldB, iStrideB, beta, b, iStrideB, a, ldA, alpha, c, ldC);

Thanks

SergeyKostrov
Valued Contributor II

>>...Instead of having all of us guessing what you want, it would be much easier to post your whole test code here, please?..

Vincent,

We're trying to help you, so please provide as many technical details as possible: complete code (not snippets), MKL version / update, platform (OS), C/C++ compiler, command line options, IDE, etc. OK?

Since I've already created my own test case, I'll do another verification with the latest version of MKL (11) on a 64-bit Windows platform.

SergeyKostrov
Valued Contributor II

Application - IccTestApp - WIN32_ICC - Debug
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130123 for 32-bit applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130123
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Release
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130123 for 32-bit applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130123
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Debug
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130124 for Intel(R) 64 architecture applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130124
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Release
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130124 for Intel(R) 64 architecture applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130124
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

2.962044 -1.945165 0.176090 0.226115 -0.074739 0.002508 0.048719 -0.008698 -0.023452 0.007123
-1.945165 1.673040 -0.089525 -0.313017 0.066992 0.008686 -0.063396 0.020223 0.028665 -0.008833
0.176090 -0.089525 -0.177685 0.130976 -0.004348 -0.005037 0.024478 -0.011186 -0.010619 0.003654
0.226115 -0.313017 0.130976 -0.007564 -0.002099 -0.000986 0.002504 0.003507 -0.000518 -0.002568
-0.074739 0.066992 -0.004348 -0.002099 0.001317 -0.002620 -0.003569 -0.002405 0.001578 0.001495
0.002508 0.008686 -0.005037 -0.000986 -0.002620 0.000504 0.002958 -0.000118 -0.001566 0.000534
0.048719 -0.063396 0.024478 0.002504 -0.003569 0.002958 -0.003076 0.001410 0.000353 -0.000477
-0.008698 0.020223 -0.011186 0.003507 -0.002405 -0.000118 0.001410 -0.001292 0.000163 0.000311
-0.023452 0.028665 -0.010619 -0.000518 0.001578 -0.001566 0.000353 0.000163 0.000035 0.000025
0.007123 -0.008833 0.003654 -0.002568 0.001495 0.000534 -0.000477 0.000311 0.000025 -0.000186

> Test1153 End <
Tests: Completed

Zhang_Z_Intel
Employee

Vince,

The matrix produced by cblas_dgemm is very different from the original static matrix you provided. After cblas_dgemm, if you compare the result against the original static matrix, the root mean square error is more than 1.5e+02. Therefore, the inputs to the dgetrf call and the subsequent dgetri call are different, and different results are expected.
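A comparison of this kind can be sketched as follows. The thread does not say exactly how the error was computed, so the element-wise root mean square difference below is an assumption, and `rms_difference` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root mean square of the element-wise difference between two equally
// sized flat matrices.
double rms_difference(const std::vector<double>& x,
                      const std::vector<double>& y) {
    assert(x.size() == y.size() && !x.empty());
    double sum_sq = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) {
        const double d = x[k] - y[k];
        sum_sq += d * d;
    }
    return std::sqrt(sum_sq / static_cast<double>(x.size()));
}
```

Running such a check on the dgemm output against the reference matrix before calling dgetrf separates two questions: whether the inputs match, and whether the inversion behaves differently for identical inputs.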

vincent_ferri
Beginner

Hi,

When dgemm is used with vectors 'a' and 'b' from the file, you say you do not get 'c' with the same data as provided. That makes no sense, because 'c' is a copy and paste from that function. The order of arguments is:

cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC); 

Regards

Zhang_Z_Intel
Employee

vincent.ferri wrote:

Hi,

When I use dgemm with vector 'a' and 'b' provided from the file you do not get 'c' with the same data as provided,

This is exactly what I was talking about. Multiplying 'a' and 'b' does not produce the same 'c'. It's not a cblas_dgemm problem. I think the call to cblas_dgemm is correct, and the order of arguments is correct. The problem is 'a' and 'b'. You need to check why your 'a' and 'b' do not produce the 'c' you expect.

vincent_ferri
Beginner

Hi,

Do you get a 10 x 10 matrix or a 4 x 4 one? It should be 10 x 10, since the product is b[10 x 4] * a[4 x 10].

Regards,

Zhang_Z_Intel
Employee

I copied and pasted exactly the cblas_dgemm call you gave in your post. The result is a 10x10 matrix, but it is different from your reference matrix (the one you gave in your earlier post).

vincent_ferri
Beginner

Hi,

I apologize for going back and forth, but with the c_vector file that I attached, with vectors 'a' and 'b', if you multiply them you do not get vector 'c', the one in the file. In Matlab, b * a = c, and cblas_dgemm also gives me the same 'c'. Here is my snippet:

double alpha = 1.0;
double beta = 1.0;
int iVandSize = 10;

int ldA = iVandSize;
int ldB = iVandSize;
int ldC = iVandSize;
int iStrideB = 4;

//C = alpha*A*B + beta*C
cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize, ldB, iStrideB, beta, b, iStrideB, a, ldA, alpha, c, ldC);

Regards,

P.S. Is it possible to show me your output from this call?
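For readers following the argument list: in the standard CBLAS interface, the arguments after the two transpose flags are M, N, K, alpha, A, lda, B, ldb, beta, C, ldc. Read that way, the call above has M = 10, N = 10, K = 4, A = b with lda = 4, B = a with ldb = 10, and C = c with ldc = 10, which matches b[10 x 4] * a[4 x 10]. The variable named `beta` lands in the alpha slot and `alpha` in the beta slot; both are 1.0, so the effective beta is 1.0 and whatever `c` held before the call is added to the product. The sketch below is a plain reference loop for the same formula, not MKL's implementation, and `gemm_row_major` is a hypothetical name. Whether stale contents of `c` played any role here is not established, since the thread never shows how `c` was initialized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference row-major GEMM without transposes:
//   C = alpha * A * B + beta * C
// A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc).
void gemm_row_major(std::size_t m, std::size_t n, std::size_t k,
                    double alpha,
                    const std::vector<double>& A, std::size_t lda,
                    const std::vector<double>& B, std::size_t ldb,
                    double beta,
                    std::vector<double>& C, std::size_t ldc) {
    assert(lda >= k && ldb >= n && ldc >= n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * lda + p] * B[p * ldb + j];
            }
            // The previous value of C contributes whenever beta != 0.
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

With a 10 x 4 matrix of ones times a 4 x 10 matrix of ones, every entry of the product is 4. Starting from a `c` filled with 5 and beta = 1.0 gives 9 instead, while beta = 0.0 gives 4 regardless of the earlier contents. Zero-filling `c` before the call, or passing 0.0 in the beta slot, removes that dependence.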

mecej4
Honored Contributor III

I apologize for going back and forth, but with the c_vector file that I attached, with vectors 'a' and 'b', if you multiply them you do not get vector 'c', the one in the file.

I have read this thread with increasing dismay. The use of misleading terms such as "c vector" and of illogical statements such as this quotation (in mathematics, the product of two vectors is either a scalar -- inner product -- or a matrix -- outer product) makes for much confusion.

Add to that apparent changes in topic from one post to another within the same thread, and we have a thread that should be quarantined, and a new thread opened with some attention to clarity and precision in problem statement.
