Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
- Problem applying dsyevr to a large matrix

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-13-2010
02:29 PM

145 Views

Problem applying dsyevr to a large matrix

While I am able to successfully diagonalize small matrices (the largest I've test is a 10 by 10), diagonalizing my large matrix has proven challenging. My code compiles and even seems to run fine, but after dsyevr has finished executing the parameter "info" is greater than 1, indicating that there was some kind of "internal error".

Does anyone have any idea what might cause this internal error? There is no explicit segmentation fault. The code fully completes except the arrays with eigenvalues and eigenvectors come back empty.

In case somebody thinks this may be a code related issue, I've included it below. The first is the C code that allocates memory, reads in files, etc. The next is the FORTRAN subroutine called by this C code. Finally, I include the linker that puts it all together.

////////////////////////////////////////////////////////////////////////////////////////////

// This routine calculates the eigenvalues and eigenvectors of the symmetric matrix //

// of doubles fwritten to the file "fnameANM" using the dsyevr LAPACK procedure. Since //

// memory constraints restrict the creation of a huge, complete eigenvector file, only //

// the il-th through iu-th eigenvalues and eigenvectors are returned where 1<=il<= //

// iu<=nrows. The eigenvalues are stored from most positive to most negative. This //

// routine is somewhat faster than the packed version. //

////////////////////////////////////////////////////////////////////////////////////////////

void GetEigenvectors(char *fnameANM, long nrows, long il, long iu) {

double *w, *z, *work, rnrows, working, *EArray;

size_t size, ii, ee;

long info, lz, lzp, lisuppz, lwork, m, *iwork, liwork, iil, iiu, cc, rr;

info = NULL;

printf("\\nThis routine returns eigenmode #%d (larger eigenvalue) through %d (smaller eigenvalue).\\n\\n",il,iu);

//dsyevr puts the largest Evl's Evc last, so this fixes that

long swap = iu;

iu = nrows - il + 1;

il = nrows - swap + 1;

lz = iu-il+1; //Number of eigenvectors to be returned.

lzp = lz; //Same variable, but this doesn't get overwritten.

lisuppz = 4*lz; //Size of the iwork array

lwork = 50*nrows; //Size of work array

liwork= 40*nrows; //Size of lwork array

//////////////////// In case of seg fault, try upping the size of these arrays.

rnrows = nrows;

working = rnrows*rnrows;

size = floor(working+0.0001);

double *a = freadDMatrix(fnameANM,size);

printf("The final element of the average noise matrix is %f.\\n",a[size-1]);

w = (double *)malloc(nrows*sizeof(double));

printf("Regarding memory allocation for w: "); ErrnoDescription(errno);

z = (double *)malloc(nrows*lz*sizeof(double));

printf("Regarding memory allocation for z: "); ErrnoDescription(errno);

work = (double *)malloc(lwork*sizeof(double));

printf("Regarding memory allocation for work: "); ErrnoDescription(errno);

iwork = (long *)malloc(liwork*sizeof(long));

printf("Regarding memory allocation for iwork: "); ErrnoDescription(errno);

lwork = -1; liwork = -1; //Start with -1 to find optimal values.

printf("\\n\\nUsing dsyevr to determine optimal sizes of lwork and iwork.\\n");

dsyevrf_(&nrows, a, &il, &iu, &m, w, z, &lz, &lisuppz, work, &lwork, iwork, &liwork, &info);

if(info==0) {

printf("Correct values of iwork and liwork found successfully after %d seconds.\\n\\n", tEnd-tBegin);

lwork = floor(work[0]+0.000001); printf("minimum size of lwork = %d\\n",lwork);

liwork = iwork[0]; printf("minimum size of liwork = %d\\n",liwork);

free(work); free(iwork);

work = (double *)malloc(lwork*sizeof(double));

printf("Regarding memory allocation for work: "); ErrnoDescription(errno);

iwork = (long *)malloc(liwork*sizeof(long));

printf("Regarding memory allocation for iwork: "); ErrnoDescription(errno);

}

if(info<0) printf("Parameter %d had an illegal value.\\n",(-1*info));

////////////// THIS IS THE REAL EXECUTION OF THE EIGENMODE FINDER //////////////

printf("\\nExecuting dsyevrf.\\n");

dsyevrf_(&nrows, a, &il, &iu, &m, w, z, &lz, &lisuppz, work, &lwork, iwork, &liwork, &info);

tEnd = time(tloc);

printf("Exited dsyevrf.\\n\\n");

if(info<0) printf("Parameter %d had an illegal value.\\n",(-1*info));

if(info>1) printf("AN INTERNAL ERROR HAS OCCURRED\\nAN INTERNAL ERROR HAS OCCURRED\\n\\n");

}

void dspevxf_(long long *nrows, double *ap, long long *trisize, long long *il, long long *iu, long long *m, double *w, double *z, long *lz, double *work, long long *lwork, long long *iwork, long long *liwork, long

int main ()

{

long long ncells6 = 66113;

///////////////////////////////////////////////////////////////////////////////////////

///////////////////////////////// GetEigenvectors /////////////////////////////////////

///////////////////////////////////////////////////////////////////////////////////////

char *fnameGCorr_sy = "/home/mj/PSnoise/noiseMat_Vec/GCorMat_R6sy_fwrite";

long il_ge = 1;

long iu_ge = 10000;

/*GetEigenvectors[1. Name of symmetric, fwritten ncells by ncells covariance function

to be diagonalized

2. Dimension of above matrix, also # of cells

3. Value >= 1 equaling # of first eigenvector returned (1 is largest)

4. Value >= il and <= ncells equaling # of final eigenvector returned] */

GetEigenvectors(fnameGCorr_sy, ncells6, il_ge, iu_ge);

c********************************************************************

subroutine dspevxf(n,ap,lap,il,iu,m,w,z,lz,work,lwork,iwork,

+liwork,ifail,info)

c--------------------------------------------------------------------

c Calculates selected eigenvalues and eigenvectors of a

c symmetric matrix T triangularly packed into ap.

c

c Based upon the dspevx routine.

c

c This routine is called from the C function GetEigenvectorsPacked

c

c 'N' should be used if only eigenvalues are desired.

c 'V' should be used if both eigenvalues and eigenvalues are

c desired.

c For the second parameter, 'A' means all eigenvalues will be

c found. 'V' means all eigenvalues in the half-open interval

c (vl,vu] will be found. 'I' mean the eigenvalues with indices

c il through iu will be found.

c For the third parameter, 'U' means the symmetric matrix is

c stored in the upper triangle of a and 'L' means its stored in

c the lower triangle.

c "n" is the number of rows (columns) in the symmetric matrix

c "ap" is the nXn array packed into a size "lap" = n(n+1)/2 housing

c the symmetric matrix in its upper or lower triangle.

c "vl" and "vu" are the lower and upper bounds respectively of the

c interval to be searched for eigenvalues. "vl" <= "vu". These

c values are referenced only if the second parameter = 'V'

c "il" and "iu" are the indices of the smallest and largest eigenvalues

c to be returned such that 1<=il<=iu<=in. These values are referenced

c only if the second parameter = 'I'

c "abstol" is the absolute error tolerance for the eigenvalues.

c "z" is an output parameter. If 'N', this is not referenced. If

c 'V', then this will contain the "lz" eigenvectors specified by

c the range "il" through "iu".

c "m" is the total number of eigenvalues found.

c "w" is the array to which the first m eigenvalues are stored.

c "isuppz" contains, upon output, the indices indicating the non-zero

c elements in "z". It has length "lisuppz".

c "work" is a workspace array of length "lwork" = 8n

c "iwork" is a workspace array of length "liwork" = 5n

c "ifail" is an integer array of length "n" containing the indices of

c the eigenvectors that failed to converge or, if "info"=0, the first

c m elements of ifal are zero.

c "info" is an integer. If all is ok, a 0 is output. If i, the

c algorithm did not converge and i indicates the number of

c elements of an intermediate tridiagonal form which did not

c converge to zero. If -i, the ith parameter had an illegal

c value.

c

c Using LAPACK.

c

c

implicit none

integer*8 n,lap,il,iu,m,lz,lwork,liwork,iwork(liwork),ifail(n),

+info

real*8 ap(lap),vl,vu,abstol,dlamch,w(n),z(n,lz),work(lwork)

external dlamch

c

abstol = 2*dlamch('S')

print *, 'ap(final) = ', ap(lap)

print *, 'About to compute eigenvectors',il,'through',iu,'.'

print *, 'n = ',n

call dspevx('V','I','L',n,ap,vl,vu,il,iu,abstol,m,w,z,n,

+work,iwork,ifail,info)

c

return

end

c

c

.f.o:

$(F77) -c $<

.c.o:

$(CC) -c $<

default: all

CC = gcc

F77 = ifort

CL = ifort -nofor_main

CL8= ifort -nofor_main -i8

INCF = -I/usr/site/intel_fortran/include #FORTRAN INCLUDE FILES

IMKLF = -I/usr/site/intel_fortran/mkl/include #FORTRAN MKL FILES

INCC = -I/usr/site/intel_cpp/include #C++ INCLUDE FILES

IMKLC = -I/usr/site/intel_cpp/mkl/include #C++ MKL FILES

LMKL = -Vaxlib -L/usr/site/intel_fortran/mkl/lib/em64t -lfftw3 -lmkl_core -lmkl_sequential -lmkl_intel_ilp64

eigen: LAPACK_dspevx_f.oEigenFind.o

$(CL8) $^ $(LMKL) -o $@.o

Link Copied

13 Replies

Artem_V_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-13-2010
11:56 PM

145 Views

Could you please submit your test case as an attachment to your post? Looks like code your posted has been broken by the text editor.

Could you also specify what "info" value did you obtain on exit of dsyevr() function?

Best regards,

Artem

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-14-2010
07:36 AM

145 Views

I do not have the info parameter readily available. Until this error occurred I had not written a line to return it explicitly. I have another version of the diagonalization code running for packed storage with dspevx. If this also fails, I will have the value of info returned. However, these routines take a while to run...over two days in fact, so it's not a quick matter to simply recover the parameter.

I'm not aware of whether it is possible to parallelize this routine to run over multiple processors. Would you happen to know? Thank you very much for your generous attention.

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-14-2010
07:59 AM

145 Views

>>> for packed storage with dspevx...

2)Do you meandspevx executed > then 2 days!. What was the order of the input matrixes?

3)The list of threaded functions, you can find into User's Guide, chapter 6,Using the Intel MKL Parallelism.

--Gennady

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-14-2010
09:03 AM

145 Views

2) dsyevr took about 2 days 6 hours to execute on my machine. The matrix is 66113 squared.

3) Thanks for the reference. I had seen this before but it's good to get outside verification.

Michael_C_Intel4

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
01:01 AM

145 Views

dsyevr calls dsytrd, which is threaded.

Did you try other eigensolvers such as dsyev, dsyevd, dsyevx?

What is the typical execution time for less size problem, say 10000, 20000, 40000 ?

Michael.

Michael_C_Intel4

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
01:07 AM

145 Views

Michael.

yuriisig

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
08:52 AM

145 Views

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
09:29 AM

145 Views

Hi Michael,

Thanks for replying. No, I have not tried the other eigensolvers yet. As I wrote above, I'm running through a sequential operation of dspevx right now, but I will certainly give those other routines a look. I have yet to do an execution time test for matrices of the size you mentioned. Once dsyevr failed, I've prioritized trying the other algorithms.

To your second point about sequential versus threading, I wasn't acutely aware of the dsyevr - dsytrd connection. I can try the threaded version as well to see if it affects execution time.

I was wondering if you had a response to the comment below about the dsyevr algorithm being unreliable in general.

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
09:39 AM

145 Views

I should also mention that I am not concerned with all the eigenvectors. I'm using a version of Principal Component Analysis and need only the first chunk of eigenvectors (those associated with the largest positive eigenvalues) to be accurate. Can you comment on whether dsyevr can handle this task?

You mention that you don't recommend dsyevr. What do you recommend instead? Is your lack of recommendation due to accuracy errors, or unreliability? I am still trying to identify why an internal error in this routine is occurring.

yuriisig

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-15-2010
10:22 AM

145 Views

Quoting mspecian

I wrote that is not paralleled RRR algorithm, which is included in dsyevr: http://software.intel.com/en-us/forums/showthread.php?t=73653(Unfortunately, now my page is not available). Intel MKL - very good package, but for your problem, he is not well suited. Most likely an internal error occurs because the RRR algorithm. Most time is spent on bringing to tridiagonal form for BLAS level 2. The rest you can quickly calculate, given that you do not need all the eigenvectors. Once again I say that dsyevr because RRR algorithm inaccurate and unreliable.

mspecian

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-16-2010
06:45 PM

145 Views

yuriisig

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-16-2010
11:34 PM

145 Views

The algorithms work with packed matrices in the Intel MKL is much slower than with square matrices.

yuriisig

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-17-2010
07:50 AM

145 Views

Quoting mspecian

Propose to act as follows:

1. dsytrd

2. dstedc (To program this feature, it took 10 years.)This algorithm requires a lot of RAM, but fast and reliable.

3. dormtr (Apply this feature only to those eigenvectors tridiagonal matrix that you want).

For more complete information about compiler optimizations, see our Optimization Notice.