?syrdb parameters

Dimitris_Dakopoulos · ‎06-20-2013

Hello all,

Documentation of ?syrdb (dsyrdb in my case) mentions that kd ≥ 1.

In my code I follow the rule kd = 60 if order of matrix is > 2000 and 40 otherwise.

In trivial case of a 1x1 matrix (order = 1) and I get ***glibc detected. For higher rank matrices I don't get an error but valgrind memcheck shows multiple read/write warnings.

Any idea is welcomed.

Best

mecej4 · ‎06-20-2013

Valgrind is not useful in combination with Intel Fortran and MKL; it gives many false positives.

Please present a complete test case (code, compiler options, input data) where you see runtime errors.

SergeyKostrov · ‎06-21-2013

>>...In trivial case of a 1x1 matrix (order = 1) and I get ***glibc detected... Is that an error message? Also, what version of MKL do you use and what is your platform ( Windows or Linux; 32-bit or 64-bit )?

Dimitris_Dakopoulos · ‎06-25-2013

Thank you for your replies.

System:

Linux CentOS 4
gcc 4.7.2
MKL 11.0.0.079 using ILP64

Here is the code that fails:

[cpp]

#include <algorithm>
#include <mkl.h>

typedef long long IntType;

bool call_dsyrdb(double *a, IntType* n)
{
    IntType size = std::max((IntType) 1, *n);
    IntType kd = (*n > 2000) ? 64 : 40;
    IntType info = 0;
    IntType lwork = -1;

    double *d = new double[size];
    double *e = new double[size];
    double *tau = new double[size];
    double *z = new double[size];

    // query workspace
    double tmp_work = 0.;
    dsyrdb_("N", "L", n, &kd, a, n, d, e, tau, z, n, &tmp_work, &lwork, &info);
    if(info)
    {
        return false;
    }
    lwork = (IntType) tmp_work;
    double *work = new double[lwork];

    // make tridiagonal
    dsyrdb_("N", "L", n, &kd, a, n, d, e, tau, z, n, work, &lwork, &info);

    // deallocate
    delete [] z;
    delete [] tau;
    delete [] work;
    if(info)
    {
        return false;
    }
    return true;
}

int main()
{
    IntType n = 1;
    double *a = new double(n * n);

    call_dsyrdb(a, &n);

    delete [] a;

    return 0;
}

[/cpp]

The error I get is:

*** glibc detected *** free(): invalid next size (fast): 0x0000000000bd1010 ***

SergeyKostrov · ‎06-25-2013

>>...delete [] a; Try ... if( a != NULL ) delete a; ... instead.

mecej4 · ‎06-25-2013

I think that this is not correct: [cpp]double *a = new double(n * n);[/cpp]. It does not create an n X n array but a single variable of type double. Later you make calls to MKL routines that expect n to contain at least n² elements, and illegal memory accesses result.

Secondly, the routine that you call expects the input matrix to be symmetric. Don't expect the program to work if you pass an uninitialized array instead, because a matrix containing garbage is unlikely to be reducible to symmetric tridiagonal form by orthogonal transformations.

SergeyKostrov · ‎06-25-2013

>>... It does not create an n X n array I agree and it looks like a typing error. If Yes it should look like: ... double *a = new double[ n * n ]; call_dsyrdb( a, &n ); delete [] a; ... and if initialization for the 2-D array a is needed it could be done as follows: ... double *a = new( 0.0L ) double[ n * n ]; call_dsyrdb( a, &n ); delete [] a; ...

Dimitris_Dakopoulos · ‎06-26-2013

Thank you all.

Sergey:

The correct way for deallocating arrays is with

[cpp]

delete [] d;

[/cpp]

Nevertheless, I did try your idea without success.

You were right, there was a typo.

[cpp]

double *a = new double[n * n];

[/cpp]

As far as I know, MKL accepts matrices in 1D storage (column or row-major), so there is no need to create 2D arrays.

Mecej4: The corrected with brackets initialization creates an 1D arrays with nxn elements which should be sufficients.

You were right that elements were not initialized but even if I do:

[cpp]

    double *a = new double[n * n];
    for(IntType i = 0; i < n * n; ++i)
    {
        a = 0.;
    }

[/cpp]

... there is no success. Personally, I believe that even unitialized the routine should not crash. Also, not that for n=100 there is no problem.

Here is what the gdb backtrace:

Using host libthread_db library "/lib64/tls/libthread_db.so.1".
*** glibc detected *** free(): invalid next size (fast): 0x0000000000bd1010 ***

Program received signal SIGABRT, Aborted.
0x000000327582e26d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x000000327582e26d in raise () from /lib64/tls/libc.so.6
#1 0x000000327582fa6e in abort () from /lib64/tls/libc.so.6
#2 0x00000032758635f1 in __libc_message () from /lib64/tls/libc.so.6
#3 0x00000032758691fe in _int_free () from /lib64/tls/libc.so.6
#4 0x0000003275869596 in free () from /lib64/tls/libc.so.6
#5 0x00000032783ae29e in operator delete(void*) () from /usr/lib64/libstdc++.so.6
#6 0x0000000000433e4f in main () at /users/a_user/dsyrdb/main.cpp:53

SergeyKostrov · ‎06-26-2013

I will do a verification on 64-bit Windows 7 Professional and MKL version 11.0.2 will be used: [ mkl.h ] ... #define __INTEL_MKL_BUILD_DATE 20130123 #define __INTEL_MKL__ 11 #define __INTEL_MKL_MINOR__ 0 #define __INTEL_MKL_UPDATE__ 2 ... I won't be able to verify your test case on a 64-bit Linux.

SergeyKostrov · ‎06-26-2013

>>...In trivial case of a 1x1 matrix (order = 1) and I get ***glibc detected.... I reproduced a similar memory corruption problem on Windows 7 platform. When the 2nd call to dsydb is commented out there are no any runtime problems and my output looks like: Start Memory allocated Order of the Matrix A=1 size=1 kd=40 lwork=-1 Query workspace Success tmp_work=123.000000 lwork=123 Memory deallocated Completed As soon as the 2nd call to dsydb is uncommented the test application craches as soon as main function exits.

Dimitris_Dakopoulos · ‎06-26-2013

So, it seems like there are two cases?

Bug in dsyrdb
Typo in MKL manual

mecej4 · ‎06-26-2013

Dimitris wrote:

So, it seems like there are two cases?

Bug in dsyrdb

Typo in MKL manual

Neither! It is more of a quality of implementation of issue. The MKL documentation gives a recommendation for the input argument kd that is valid for real life applications where the input matrix A is large. However, if A is small, those recommendations are incorrect. I believe that the MKL routine does not check for incorrect values of kd as well as it should, and you probably did not go beyond the MKL manual to find out what values are appropriate.

The argument kd is defined in the MKL 11 manual as "The bandwidth of the banded matrix B", but what is matrix B? It is certainly not one of the arguments to ?syrdb. If you look up the original paper (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.4242) and the TOMS issue in which the source code was given (http://www.netlib.org/toms/807) you find that kd is the half-bandwidth of the reduced bandwidth matrix A_band = Q^TAQ that results from applying blocked Householder transformations to the original matrix A. Therefore, kd can never exceed n - 1, if A is n X n.

Recommendations:

If the kd value that the MKL manual recommends is greater or equal to n, set kd = n - 1.

If you want to perform meaningful tests of matrix algebra routines, do not try with garbage matrices or null matrices. It is not that difficult to generate an input matrix with suitable properties (in this case, a nonsingular symmetric matrix of size larger than 40 X 40)

Avoid using routines such as ?syrdb that are labelled "computational". See if your needs could be met by using one of the "driver" routines instead. If you decide to use a "computational" routine directly, be prepared to work with a longer argument list which may include additional variables that are internal to the underlying algorithm, and be prepared to read technical papers in which the algorithm was published.

mecej4 · ‎06-26-2013

Dimitris wrote:

So, it seems like there are two cases?

Bug in dsyrdb

Typo in MKL manual

Neither. I posted a reply with references to original material which, I hope, will clarify the issues. Unfortunately, the very fact that I included URLs caused my message to be put in the queue for approval by a moderator. You will see it one of these days, I hope.

In the meantime, here is a short answer which you may consider as an action item: do not call ?SYRDB with a value of kd larger than n - 1, where the input matrix is n X n. If, for example, you use n = 3, make kd = 2. If you want meaningful output from ?SYRDB, use something better than a null matrix or an uninitialized matrix -- the former has a trivial answer, and the latter is unlikely to be symmetric, which is a requirement.

Although one may wish that MKL routines will fail gracefully even when confronted with wildly erroneous input, I don't think that they do, and we can understand why they don't. Adding a check on the value of kd, however, is probably a reasonable thing to do, given that reading the MKL manual alone does not lead to a realization of the restrictions on this argument, and that the allusions to matrix B in the manual are mystefying.

P.S. to MODERATOR: Now that you have released my earlier response (http://software.intel.com/en-us/comment/reply/394023/1741446?quote=1#comment-form), you may delete this post entirely.

Gennady_F_Intel · ‎06-26-2013

mecej, I will save your post (http://software.intel.com/en-us/comment/reply/394023/1741446?quote=1#com...) because of it contains very useful comments " If you want to perform meaningful tests of matrix algebra routines, do not try with garbage matrices or null matrices. It is not that difficult to generate a nonsingular symmetric matrix of size larger than 40 X 40."

this is pretty simple tips, but nevertheless, many users do that. :)

Gennady_F_Intel · ‎06-27-2013

We have checked the issue on our side - yes, we confirm that this is the problem. The input parameter for this routine didn't check properly.
The description would be updated accordingly. Thanks a lot for all who contributed to this discussion.
I will let you know when the fix of problem will be available.

SergeyKostrov · ‎06-27-2013

A modified test case I used on Ivy Bridge system with Windows 7 Professional is attached.

Gennady_F_Intel · ‎09-09-2013

FYI: the problem with 1x1 case has been fixed in the latest 11.1 version.