Static vs. dynamic vector input

vincent_ferri · ‎04-16-2013

Hi,

I am having an issue with dynamic input arguments to the following functions:

LAPACKE_dgelss();

dgetrf();dgetri();

ippmInvert_m_64f();

When the input vector is static, A[10 * 10] = {...};, the output is correct, namely the inverse of A. If the same data values are read into a dynamic vector, the output is incorrect. Why am I getting this anomaly?

Thanks

Vince

Zhang_Z_Intel · ‎04-16-2013

With the very limited information you provided, it's hard to reproduce the problem or guess an explanation. Would you provide a small reproducer? Or, at least show a code snippet calling these functions using static and dynamic arrays? Thanks.

vincent_ferri · ‎04-16-2013

Hi,

I will concentrate on one of the functions that I listed; it is used for LU factorization:

lapack_int info;
MKL_INT* ipiv;
double dVandSize = 10;

double* c = ( double * ) malloc ( dVandSize * dVandSize * sizeof ( double ) );

//If input vector c[10 * 10] is a static array with initialized values the function works, if c is dynamic and contains the same values it doesn't work; I included a file called c_vector that contains c;

ipiv = ( MKL_INT * ) malloc ( dVandSize * sizeof ( MKL_INT ) );
dgetrf(&dVandSize,&dVandSize,c,&dVandSize,ipiv,&info); //Computes the LU factorization

double* workspace = new double [dVandSize* sizeof(double)];

dgetri(&dVandSize, c, &dVandSize, ipiv, workspace, &dVandSize, &info);

Thanks

Zhang_Z_Intel · ‎04-16-2013

Well, I was not able to reproduce the problem. Both static and dynamic arrays worked fine and gave identical results. See my test code attached.

But a careful look at your code snippet revealed this problem:

double dVandSize = 10;

Why was this variable declared as double when it should be an integer? Didn't you get compiler warnings?
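Editorial sketch of why the type matters: dgetrf and dgetri receive their sizes through integer pointers, so handing them the address of a double means part of the double's bit pattern gets read as an integer. The snippet below is a minimal illustration only, assuming IEEE 754 doubles and a 32-bit MKL_INT; the helper name words_of_double is hypothetical and is not part of MKL.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an MKL function): returns the two 32-bit words
// that make up a double, in memory order. A routine that expects an int*
// but is handed a double* would read the first of these words.
std::array<std::int32_t, 2> words_of_double(double value) {
    static_assert(sizeof(double) == 2 * sizeof(std::int32_t),
                  "sketch assumes a 64-bit double");
    std::array<std::int32_t, 2> words{};
    std::memcpy(words.data(), &value, sizeof(value));  // well-defined byte copy
    return words;
}
```

On a little-endian x86 machine the first word of 10.0 should be 0, so a routine reading it as a 32-bit integer would see a size of 0 rather than 10.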

SergeyKostrov · ‎04-16-2013

>>... I was not able to reproduce the problem...

There are differences in the initializations; take a look:

[ This is how Vincent initializes ]
...
double *c = ( double * )malloc( dVandSize * dVandSize * sizeof ( double ) );
...
ipiv = ( MKL_INT * )malloc( dVandSize * sizeof ( MKL_INT ) );
...
double *workspace = new double [ dVandSize * sizeof( double ) ]; // Note: C++ operator new is used
...

[ This is how Zhang initializes ]
...
double *c = ( double * )malloc( dVandSize * dVandSize * sizeof( double ) );
...
ipiv = ( MKL_INT * )malloc( dVandSize * sizeof( MKL_INT ) );
...
double *workspace = ( double * )malloc( dVandSize * sizeof( double ) ); // Note: CRT-function malloc is used
...

Vincent, my question is: why do you need sizeof( double ) in new double [ dVandSize * sizeof( double ) ]?
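Editorial sketch of the point behind this question: the bracketed expression in new double[...] is an element count, while the argument to malloc is a byte count, so the same expression means different things in the two calls. The helper names below are hypothetical and exist only to make the arithmetic explicit.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only.

// Elements obtained from `new double[n * sizeof(double)]`: the bracketed
// expression is already an element count, so this over-allocates.
constexpr std::size_t elements_from_new_with_sizeof(std::size_t n) {
    return n * sizeof(double);
}

// Elements obtained from `malloc(n * sizeof(double))`: malloc takes bytes,
// so the same expression yields exactly n doubles.
constexpr std::size_t elements_from_malloc_with_sizeof(std::size_t n) {
    return (n * sizeof(double)) / sizeof(double);
}
```

With n = 10 and an 8-byte double, the new form reserves 80 doubles where 10 are needed. That wastes memory but does not overflow the buffer, so on its own it is unlikely to explain wrong results.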

SergeyKostrov · ‎04-17-2013

Results are absolutely identical and please take a look:

[ Output when CRT-function 'malloc' is used ]

Intel(R) Math Kernel Library Version 10.3.12 Product Build 20120831 for 32-bit applications
Major version : 10
Minor version : 3
Update version : 12
Product status : Product
Build : 20120831

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

[ Output when C++ operator 'new' is used ]

Intel(R) Math Kernel Library Version 10.3.12 Product Build 20120831 for 32-bit applications
Major version : 10
Minor version : 3
Update version : 12
Product status : Product
Build : 20120831

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

SergeyKostrov · ‎04-17-2013

// Sub-Test 1 - Gets MKL version
{
///*
    MKLVersion Ver = { 0x0 };
    int iLenData = 256;
    char szVerData[256] = { 0x0 };

    MKL_Get_Version_String( szVerData, iLenData );
    CrtPrintfA( "\n%s\n", szVerData );

    MKL_Get_Version( &Ver );
    printf( "Major version : %d\n", Ver.MajorVersion );
    printf( "Minor version : %d\n", Ver.MinorVersion );
    printf( "Update version : %d\n", Ver.UpdateVersion );
    printf( "Product status : %s\n", Ver.ProductStatus );
    printf( "Build : %s\n", Ver.Build );
    printf( "\n" );
//*/
}

// Sub-Test 2 - Test for dgetrf and dgetri functions
{
///*
    double data[] = {
        +4.00e+000, +1.50e+001, +4.00e+001, +8.50e+001, +1.56e+002, +2.59e+002, +4.00e+002, +5.85e+002, +8.20e+002, +1.11e+003,
        +1.50e+001, +8.50e+001, +2.59e+002, +5.85e+002, +1.11e+003, +1.89e+003, +2.96e+003, +4.37e+003, +6.18e+003, +8.42e+003,
        +4.00e+001, +2.59e+002, +8.20e+002, +1.89e+003, +3.62e+003, +6.18e+003, +9.72e+003, +1.44e+004, +2.04e+004, +2.79e+004,
        +8.50e+001, +5.85e+002, +1.89e+003, +4.37e+003, +8.42e+003, +1.44e+004, +2.28e+004, +3.38e+004, +4.80e+004, +6.56e+004,
        +1.56e+002, +1.11e+003, +3.62e+003, +8.42e+003, +1.63e+004, +2.79e+004, +4.41e+004, +6.56e+004, +9.32e+004, +1.28e+005,
        +2.59e+002, +1.89e+003, +6.18e+003, +1.44e+004, +2.79e+004, +4.80e+004, +7.59e+004, +1.13e+005, +1.60e+005, +2.20e+005,
        +4.00e+002, +2.96e+003, +9.72e+003, +2.28e+004, +4.41e+004, +7.59e+004, +1.20e+005, +1.79e+005, +2.54e+005, +3.48e+005,
        +5.85e+002, +4.37e+003, +1.44e+004, +3.38e+004, +6.56e+004, +1.13e+005, +1.79e+005, +2.66e+005, +3.79e+005, +5.18e+005,
        +8.20e+002, +6.18e+003, +2.04e+004, +4.80e+004, +9.32e+004, +1.60e+005, +2.54e+005, +3.79e+005, +5.38e+005, +7.37e+005,
        +1.11e+003, +8.42e+003, +2.79e+004, +6.56e+004, +1.28e+005, +2.20e+005, +3.48e+005, +5.18e+005, +7.37e+005, +1.01e+006
    };

    lapack_int info = 0;
    MKL_INT *ipiv = NULL;
    MKL_INT dVandSize = 10;
    MKL_INT i;

//  double *c = data;
//  double *c = ( double * )malloc( dVandSize * dVandSize * sizeof( double ) );
    double *c = ( double * )new double[ dVandSize * dVandSize ];
    for( i = 0; i < dVandSize * dVandSize; i++ )
    {
        c[i] = data[i];    // the [i] subscripts were swallowed by the forum markup
    }

//  ipiv = ( MKL_INT * )malloc( dVandSize * sizeof( MKL_INT ) );
    ipiv = ( MKL_INT * )new MKL_INT[ dVandSize ];

    // LU factorization
    dgetrf( &dVandSize, &dVandSize, c, &dVandSize, ipiv, &info );
    if( info != 0 )
    {
        printf( "DGETRF INFO: %d\n", info );
        exit( 1 );
    }

//  double *workspace = ( double * )malloc( dVandSize * sizeof( double ) );
    double *workspace = ( double * )new double[ dVandSize ];

    // Inverse from the LU factors
    dgetri( &dVandSize, c, &dVandSize, ipiv, workspace, &dVandSize, &info );
    if( info != 0 )
    {
        printf( "DGETRI INFO: %d\n", info );
        exit( 1 );
    }

    for( i = 0; i < dVandSize * dVandSize; i++ )
    {
        printf( "% lf ", c[i] );
        if( ( (i+1) % 10 ) == 0 )
            printf( "\n" );
    }

//  if( workspace != NULL )
//      free( workspace );
//  if( ipiv != NULL )
//      free( ipiv );
//  if( c != NULL )
//      free( c );

    // Arrays allocated with new[] must be released with delete[]
    if( workspace != NULL )
        delete[] workspace;
    if( ipiv != NULL )
        delete[] ipiv;
    if( c != NULL )
        delete[] c;

    printf( "\n" );
//*/
}

vincent_ferri · ‎04-17-2013

But all you did was take a static vector and copy it to a dynamic vector; this works for me too. But how about using

cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, dVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC);

as indicated in my last post, and then taking this c vector and putting it into dgetrf() and dgetri()?

Regards,

Vince

Zhang_Z_Intel · ‎04-17-2013

Vince,

What do you mean by "c vector"? How is it different from a static vector and a dynamic vector? And what does cblas_dgemm have to do with this? Instead of having all of us guess what you want, it would be much easier if you posted your whole test code here, please.

By the way, have you had a chance to look at the issue pointed out by other replies on this post? Why is 'dVandSize' a double-precision floating-point variable? If you follow the DGETRF and DGETRI signatures, this argument should be an integer. Have you tried making it an integer? Does this solve the problem?

vincent_ferri · ‎04-17-2013

Hi

The c vector is the dynamic vector that you created. cblas_dgemm() uses the a vector and the b vector to produce the c vector, and that is what you use for LU. I have given the 'a' and 'b' vectors in the file c_vector.txt.

Regards,

Vince

Zhang_Z_Intel · ‎04-17-2013

I believe you certainly have taken care of this, and it's probably not related to your original question. But just in case ..., the matrix order in cblas_dgemm can be either row major or column major, but dgetrf and dgetri assume column major matrix order as they are FORTRAN routines.

I'll take another look at it and let you know.
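Editorial sketch of the storage-order point: the same buffer read with row-major indexing and with column-major indexing yields matrices that are transposes of each other. The helper names below are hypothetical, and plain std::vector buffers stand in for the arrays passed to MKL.

```cpp
#include <cstddef>
#include <vector>

// Element (i, j) of a matrix stored row by row (C convention).
double at_row_major(const std::vector<double>& buf, std::size_t ld,
                    std::size_t i, std::size_t j) {
    return buf[i * ld + j];
}

// Element (i, j) of a matrix stored column by column (Fortran convention).
double at_col_major(const std::vector<double>& buf, std::size_t ld,
                    std::size_t i, std::size_t j) {
    return buf[i + j * ld];
}

// True when reading `buf` as a column-major n x n matrix gives the transpose
// of reading it as a row-major n x n matrix.
bool col_major_view_is_transpose(const std::vector<double>& buf, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            if (at_col_major(buf, n, i, j) != at_row_major(buf, n, j, i))
                return false;
    return true;
}
```

For a square matrix this is usually harmless when inverting, because the inverse of a transpose is the transpose of the inverse, so the buffer returned by dgetri can be read back in row-major order. The 10 x 10 test data posted in this thread also appears to be symmetric, in which case the two readings coincide.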

vincent_ferri · ‎04-17-2013

int iVandSize = 10;

double alpha = 1.0;
double beta = 1.0;
int ldA = iVandSize;
int ldB = iVandSize;
int ldC = iVandSize;
int iStrideB = 4;

//cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC);

Thanks

SergeyKostrov · ‎04-17-2013

>>...Instead of having all of us guessing what you want, it would be much easier to post your whole test code here, please?..

Vincent, we're trying to help you, so please provide as many technical details as possible: complete code ( not snippets ), MKL version / update, platform ( OS ), C/C++ compiler, command line options, IDE, etc. OK?

Since I've already created my own test case, I'll do another verification with the latest version of MKL ( 11 ) on a 64-bit Windows platform.

SergeyKostrov · ‎04-17-2013

Application - IccTestApp - WIN32_ICC - Debug
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130123 for 32-bit applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130123
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Release
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130123 for 32-bit applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130123
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Debug
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130124 for Intel(R) 64 architecture applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130124
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

> Test1153 End <
Tests: Completed

//

Application - IccTestApp - WIN32_ICC - Release
Tests: Start
> Test1153 Start <

Intel(R) Math Kernel Library Version 11.0.2 Product Build 20130124 for Intel(R) 64 architecture applications
Major version : 11
Minor version : 0
Update version : 2
Product status : Product
Build : 20130124
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) Enabled Processor

 2.962044 -1.945165  0.176090  0.226115 -0.074739  0.002508  0.048719 -0.008698 -0.023452  0.007123
-1.945165  1.673040 -0.089525 -0.313017  0.066992  0.008686 -0.063396  0.020223  0.028665 -0.008833
 0.176090 -0.089525 -0.177685  0.130976 -0.004348 -0.005037  0.024478 -0.011186 -0.010619  0.003654
 0.226115 -0.313017  0.130976 -0.007564 -0.002099 -0.000986  0.002504  0.003507 -0.000518 -0.002568
-0.074739  0.066992 -0.004348 -0.002099  0.001317 -0.002620 -0.003569 -0.002405  0.001578  0.001495
 0.002508  0.008686 -0.005037 -0.000986 -0.002620  0.000504  0.002958 -0.000118 -0.001566  0.000534
 0.048719 -0.063396  0.024478  0.002504 -0.003569  0.002958 -0.003076  0.001410  0.000353 -0.000477
-0.008698  0.020223 -0.011186  0.003507 -0.002405 -0.000118  0.001410 -0.001292  0.000163  0.000311
-0.023452  0.028665 -0.010619 -0.000518  0.001578 -0.001566  0.000353  0.000163  0.000035  0.000025
 0.007123 -0.008833  0.003654 -0.002568  0.001495  0.000534 -0.000477  0.000311  0.000025 -0.000186

> Test1153 End <
Tests: Completed

Zhang_Z_Intel · ‎04-17-2013

Vince,

The matrix produced by cblas_dgemm is very different from the original static matrix you provided. After cblas_dgemm, if you compare the result against the original static matrix, the root mean square error is more than 1.5e+02. Therefore, the inputs to the dgetrf call and the subsequent dgetri call are different, and different results are expected.
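Editorial sketch of the kind of comparison described here: a root mean square of the element-wise difference between two buffers. The thread does not say exactly how the 1.5e+02 figure was computed, so the usual definition is assumed, and the function name rms_difference is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root mean square of the element-wise difference between two equally sized
// buffers. Assumption: this is the usual definition; the thread does not say
// exactly how its error figure was obtained.
double rms_difference(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && !x.empty());
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        sum_sq += d * d;
    }
    return std::sqrt(sum_sq / static_cast<double>(x.size()));
}
```

Running a check like this on the dgemm output and on the reference matrix before calling dgetrf is a quick way to confirm that both code paths really start from the same input.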

vincent_ferri · ‎04-18-2013

Hi,

When I use dgemm with vector 'a' and 'b' provided from the file you do not get 'c' with the same data as provided; that makes no sense because 'c' is a copy and paste from that function. The order of arguments is:

cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC);

Regards

Zhang_Z_Intel · ‎04-18-2013

vincent.ferri wrote:

Hi,

When I use dgemm with vector 'a' and 'b' provided from the file you do not get 'c' with the same data as provided,

This is exactly what I was talking about. Multiplying 'a' and 'b' does not produce the same 'c'. It's not a cblas_dgemm problem. I think the call to cblas_dgemm is correct. The order of arguments is correct. The problem is 'a' and 'b'. You need to check why your 'a' and 'b' do not produce the 'c' you expect.

vincent_ferri · ‎04-18-2013

Hi,

Do you get a 10 x 10 matrix or a 4 x 4? It should be 10 x 10, since the product is b[10 x 4] * a[4 x 10].

Regards,

Zhang_Z_Intel · ‎04-18-2013

I copied and pasted exactly the cblas_dgemm call you gave in your post. The result is a 10x10 matrix. But it is different from your reference matrix (the one you gave in your earlier post).

vincent_ferri · ‎04-18-2013

Hi,

I apologize for going back and forth, but with the given c_vector file that I attached with vector 'a' and 'b', if you multiply them you do not get vector 'c', the one in the file. In Matlab b * a = c, and cblas_dgemm also gives me the same 'c'. Here is my snippet:

double alpha = 1.0;
double beta = 1.0;
int iVandSize = 10;

int ldA = iVandSize;
int ldB = iVandSize;
int ldC = iVandSize;
int iStrideB = 4;

//C = alpha*A*B + beta*C
cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, iVandSize,ldB, iStrideB, beta , b, iStrideB, a, ldA, alpha, c, ldC);

Regards,

P.S. Is it possible to show me your output from this call?
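Editorial sketch of what the call above computes. In the standard CBLAS interface the scalar that precedes the A pointer is alpha and the scalar that precedes the C pointer is beta, so in this call the variable named beta fills the alpha slot and the variable named alpha fills the beta slot. Both are 1.0, so the call evaluates c = b * a + c, and whatever c held beforehand is added to the product. The loop below is a naive row-major stand-in for illustration, not MKL's implementation, and the name ref_gemm_row_major is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Naive row-major reference for C = alpha * A * B + beta * C, where A is
// m x k, B is k x n and C is m x n, all stored contiguously (leading
// dimension equal to the row length). Illustrative stand-in only.
void ref_gemm_row_major(std::size_t m, std::size_t n, std::size_t k,
                        double alpha, const std::vector<double>& A,
                        const std::vector<double>& B,
                        double beta, std::vector<double>& C) {
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            // As in BLAS, the old C is not read at all when beta is zero.
            const double old_c = (beta == 0.0) ? 0.0 : beta * C[i * n + j];
            C[i * n + j] = alpha * acc + old_c;
        }
    }
}
```

With m = 10, n = 10 and k = 4 this yields the 10 x 10 product discussed above. The thread does not show how c was initialised before the call, so whether the accumulation matters here is only a possibility worth checking, for example by zero-filling c or by passing 0.0 in the beta slot.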

mecej4 · ‎04-18-2013

I apologize for going back and forth, but with the given c_vector file that I attached with vector 'a' and 'b', if you multiply them you do not get vector 'c', the one in the file.

I have read this thread with increasing dismay. The use of misleading terms such as "c vector" and illogical statements such as this quotation (in mathematics, the product of two vectors is either a scalar -- inner product -- or a matrix -- outer product) make for much confusion.

Add to that apparent changes in topic from one post to another within the same thread, and we have a thread that should be quarantined, and a new thread opened with some attention to clarity and precision in problem statement.