I have tested using LAPACK

Henrik_A_ · ‎01-09-2015

I have a problem where I need to calculate a number of eigenvectors and a different number of eigenvalues. Instead of calling dsyevr twice I plan on calling dsytrd -> dstemr * 2 -> dormtr. (Or alternatively stebz / stein)

However, I have noticed unexpected behavior regarding the eigenvector parameter (using 11.0 update 5, from C).

If I use jobz = 'N', then the call to dstemr sets the first element of the eigenvector array to 0.0 even when only calculating eigenvalues, and crashes if no array is provided. The documentation states that this variable is not used in this case. Is it sufficient to pass a single double as a dummy parameter or does the function also set more values in this array? Also while the documentation states that ldz should be >= 1 in this case, the function fails unless it is >= N.

Secondly, If I use jobz = 'V', the documentation states

"Array z(ldz, *), the second dimension of z must be at least max(1, m).

If jobz = 'V', and info = 0, then the first m columns of z contain the orthonormal eigenvectors of the matrix T corresponding to the selected eigenvalues, with the i-th column of z holding the eigenvector associated with w(i). ".

However, when calling from C, ldz contains the number of columns in the array, and should be set to the number of eigenvalues, but the parameter validation requires ldz to be >= N, which means that if I want to calculate the first 10 eigenvalues of a 1000 x 1000 matrix, I still would need to allocate the full size matrix. Am I missing something? Is this just due to LAPACK_ROW_MAJOR?

Henrik_A_ · ‎01-09-2015

I have tested using LAPACK_ROW_MAJOR in the call, then I do not need to pass a square matrix for the selected eigenvectors to the dstemr call.

One additional question:

According to the MKL manual, when requesting an incomplete set of eigenvectors, dsyevr calls dsytrd and then dstebz and dstein, and only calls dstemr when requesting the full set of vectors. However my tests appear to show that the running the dstemr calculation for incomplete sets which contain at least 10% of the eigenvectors is significantly faster. Is there a particular reason not to use this function instead when calculating a significant portion of the eigenvectors.

mecej4 · ‎01-11-2015

You have left out many details, especially argument values other than the ones that you mentioned. Without an example code from you, the number of possible combinations is too many to handle, and the only person who could answer your questions based solely on the two posts above would probably be the author of the MKL routines concerned.

MKL does not appear to provide an example code that uses ?stemr. Here is a Fortran example taken from an older thread in this forum (https://software.intel.com/en-us/forums/topic/293234), which I modified slightly to make it fit your description in the first part of the third paragraph of #1. The arguments vl and vu are scalars, and the example shows that the values passed in are not overwritten by the dstemr routine, which is not in agreement with what you stated -- assuming that calling from C versus calling from Fortran is not a major contributor to the problem.

program test_stemr

implicit none
integer i,j,k,il,iu
integer size,info,nev,nzc,lwork,liwork

real(kind=8) vl,vu,abstol
real(kind=8), allocatable :: D(:),E(:),Lambda(:),Phi(:,:),work(:)
integer, allocatable :: iwork(:),isuppz(:)
logical tryac

size = 100
write(*,*) "working on size = ",size
allocate(D(size),E(size))
do i=1,size
  D(i) = i
  E(i) = 1.0
end do
allocate(Lambda(size),Phi(size,size),isuppz(2*size))
liwork = 10*size
lwork = 18*size
allocate(iwork(liwork),work(lwork))

nzc = size
tryac = .true.
vl = 1.99d0
vu = 18.1d0
il = 1
iu = 2
call dstemr('N','I',size,D,E,vl,vu,il,iu,nev,Lambda,Phi,size,nzc, &
            isuppz,tryac,work,lwork,iwork,liwork,info)

write(*,*) 'after stemr'
write(*,*) 'info   = ',info
write(*,*) 'nzc    = ',nzc
write(*,*) 'nev    = ',nev
write(*,*) 'eigenvalues are: ',(lambda(i),i=1,nev)
write(*,*)
write(*,*) 'vl,vu = ',vl,vu

deallocate(work,iwork,isuppz,Lambda,Phi,D,E)

write(*,*) 'done with test_stemr'

end program test_stemr

Henrik_A_ · ‎01-12-2015

The first issue (the nullptr dereference) appears to only occur when using the 64bit version of MKL, not when using the 32bit version. I hadn't released this until creating this minimal example.

The second issue has to do with the C matrix ordering so I doubt it can be reproduced in fortran - using the default LAPACKE_COL_MAJOR the eigenvector matrix parameters are stride and number of columns, and stride is tested for >= matrix_size and number of columns is tested for >= number_eigenvalues. When using LAPACKE_ROW_MAJOR the stride still has to be >= matrix_size instead of >= number_eigenvalues.

#include <memory>
#include <mkl.h>


bool CalcTridiagonalForm(__in const size_t matrixSize,
							__in double *matrixData,
							__out double *d,
							__out double *e,
							__out double *tau)
{
	lapack_int mklRes = 0;

	//	This destroys the data array! We could save it in the unused portion of the triangle matrix
	mklRes = ::LAPACKE_dsytrd(	LAPACK_ROW_MAJOR,
								'U', // If uplo = 'U', a stores the upper triangular part of A.
								static_cast<MKL_INT>(matrixSize), // The number of rows of the matrix A
								const_cast<double*>(matrixData),// // an array containing the m-by-n matrix A. (XXX: The second dimension of a must be at least max(1, n) )
								static_cast<MKL_INT>(matrixSize), // The leading dimension of the array a. 
								d,
								e,
								tau
								);
	if (mklRes != 0)
	{
		return false;
	}

	return true;
}




bool CalcEigenValues_dstemr(	__in const size_t matrixSize,
								__in double *matrixData,
								__in const size_t numEigenValues,
								__out double *eigenValues,
								__out_opt double *eigenvectors	)
{
	lapack_int mklRes = 0;

	//	convert the matrix to tridiagonal form
	std::unique_ptr<double[]>	d(new double[matrixSize * 2]);
	std::unique_ptr<double[]>	e(new double[matrixSize * 2]);
	std::unique_ptr<double[]>	tau(new double[matrixSize * 2]);
	::memset(d.get(), 0, sizeof(double) * matrixSize * 2);
	::memset(e.get(), 0, sizeof(double) * matrixSize * 2);
	::memset(tau.get(), 0, sizeof(double) * matrixSize * 2);

	if (!CalcTridiagonalForm(matrixSize, matrixData, d.get(), e.get(), tau.get()))
		return false;

	MKL_INT nEigenvalues;
	std::unique_ptr<MKL_INT[]>	isuppz(new MKL_INT[2 * matrixSize]);
	if (isuppz == nullptr)
		return false;

	int tryrac = true;
	if (nullptr == eigenvectors)
	{
		std::unique_ptr<double[]>	tempEigenvalues(new double[matrixSize]);
		::memset(tempEigenvalues.get(), 0, sizeof(double) * matrixSize);


		//	Try calculating only eigenvalues:

		//	THIS CRASHES IN x64 MODE

		mklRes = ::LAPACKE_dstemr(	LAPACK_ROW_MAJOR,
									'N', // If jobz = 'N', then only eigenvalues are computed.
									'I', // If range = 'I', the routine computes eigenvalues with indices il to iu.
									static_cast<MKL_INT>(matrixSize), // The number of rows of the matrix A
									const_cast<double*>(d.get()),
									const_cast<double*>(e.get()),
									0, 0, 
									static_cast<MKL_INT>(matrixSize - numEigenValues + 1),
									static_cast<MKL_INT>(matrixSize), //	Fortran convention and inclusive. 1 to nCols
									&nEigenvalues,
									tempEigenvalues.get(),
									nullptr,
									static_cast<MKL_INT>(matrixSize), // This MUST equal at least the size of the matrix, no idea why
									0,
									nullptr,
									&tryrac
									);

		/*
		//	THIS WORKS
		double fakeEigenVector;
		mklRes = ::LAPACKE_dstemr(	LAPACK_ROW_MAJOR,
									'N', // If jobz = 'N', then only eigenvalues are computed.
									'I', // If range = 'I', the routine computes eigenvalues with indices il to iu.
									static_cast<MKL_INT>(matrixSize), // The number of rows of the matrix A
									const_cast<double*>(d.get()),
									const_cast<double*>(e.get()),
									0, 0, 
									static_cast<MKL_INT>(matrixSize - numEigenValues + 1),
									static_cast<MKL_INT>(matrixSize), //	Fortran convention and inclusive. 1 to nCols
									&nEigenvalues,
									tempEigenvalues.get(),
									&fakeEigenVector,//tempEigenvector.get(),
									static_cast<MKL_INT>(matrixSize), // This MUST equal at least the size of the matrix, no idea why
									0,
									nullptr,
									&tryrac
									);
									*/
		if (mklRes != 0)
		{
			return false;
		}

		::memcpy(eigenValues, tempEigenvalues.get(), sizeof(double) * numEigenValues);
	}
	else
	{
		std::unique_ptr<double[]>	tempEigenvalues(new double[matrixSize*2]);
		::memset(tempEigenvalues.get(), 0, sizeof(double) * matrixSize*2);

		::memset(eigenvectors, 0xCC, sizeof(double) * matrixSize * matrixSize);

		//	THIS REQUIRES A SQUARE MATRIX
		mklRes = ::LAPACKE_dstemr(	LAPACK_ROW_MAJOR,
									'V', // If jobz = 'V', then eigenvalues and eigenvectors are computed.
									'I', // If range = 'I', the routine computes eigenvalues with indices il to iu.
									static_cast<MKL_INT>(matrixSize), // The number of rows of the matrix A
									const_cast<double*>(d.get()),
									const_cast<double*>(e.get()),
									0, 0, 
									static_cast<MKL_INT>(matrixSize - numEigenValues + 1),
									static_cast<MKL_INT>(matrixSize), //	Fortran convention and inclusive. 1 to nCols
									&nEigenvalues,
									tempEigenvalues.get(),
									eigenvectors,
									static_cast<MKL_INT>(matrixSize), // numEigenValues should be used here but the call returns invalid parameter
									numEigenValues,
									isuppz.get(),
									&tryrac
									);

/*
		//	THIS WORKS
		mklRes = ::LAPACKE_dstemr(	LAPACK_COL_MAJOR,
									'V', // If jobz = 'V', then eigenvalues and eigenvectors are computed.
									'I', // If range = 'I', the routine computes eigenvalues with indices il to iu.
									static_cast<MKL_INT>(matrixSize), // The number of rows of the matrix A
									const_cast<double*>(d.get()),
									const_cast<double*>(e.get()),
									0, 0, 
									static_cast<MKL_INT>(matrixSize - numEigenValues + 1),
									static_cast<MKL_INT>(matrixSize), //	Fortran convention and inclusive. 1 to nCols
									&nEigenvalues,
									tempEigenvalues.get(),
									eigenvectors,
									static_cast<MKL_INT>(matrixSize),
									numEigenValues,
									isuppz.get(),
									&tryrac
									);
*/
		if (mklRes != 0)
		{
			return false;
		}

		::memcpy(eigenValues, tempEigenvalues.get(), sizeof(double) * numEigenValues);
		tempEigenvalues = nullptr;
	}

	d = nullptr;
	e = nullptr;
	tau = nullptr;

	return true;
}



int _tmain(int argc, _TCHAR* argv[])
{
	unsigned long matrixSize =  100;
	double *matrixData = new double[matrixSize * matrixSize];

	::srand(42);
	for (unsigned long i=0;i<matrixSize;++i)
	{
		for (unsigned long j=i;j<matrixSize;++j)
		{
			matrixData[i * matrixSize + j] = (rand() / (double)RAND_MAX);
		}
	}

	//	Set diagonal to zero to simplify restoring
	for (unsigned long i=0;i<matrixSize;++i)
	{
		matrixData[i * matrixSize + i] = 0.0;
	}

	//	Mirror the matrix
	for (unsigned long i=0;i<matrixSize;++i)
	{
		for (unsigned long j=i;j<matrixSize;++j)
		{
			matrixData[j * matrixSize + i] = matrixData[i * matrixSize + j];
		}
	}

	double *eigenValues = new double[matrixSize];
	if (!CalcEigenValues_dstemr(matrixSize, matrixData, matrixSize, eigenValues, nullptr))
		return 1;

	//	Restore matrix
	for (unsigned long i=0;i<matrixSize;++i)
	{
		matrixData[i * matrixSize + i] = 0.0;
		for (unsigned long j=i;j<matrixSize;++j)
		{
			matrixData[i * matrixSize + j] = matrixData[j * matrixSize + i];
		}
	}

	double *eigenVectors = new double[matrixSize * matrixSize];
	if (!CalcEigenValues_dstemr(matrixSize, matrixData, matrixSize / 4, eigenValues, eigenVectors))
		return 1;

	return 0;
}

Parameters for ?stemr