Re: mkl_dcsrtrsv

Anonymous70 · ‎07-12-2006

I'm having trouble with the mkl_dcsrtrsv(solve sparse triangular system) function. In particular, the function works fine for matrices of size less than 4000. But once the size of the matrix hits 4000, the function no longer returns the correct answer. I've written a quick function to test this out:

void TestDTRSV(int N, double p)
{
  std::vector A, b(N), x(N), y(N);
  std::vector ia, ja;
  char uplo('U'), transa('N'), diag('N');
	
  MakeRandomSparseMatrix(N, p, A, ia, ja);
  std::generate(b.begin(), b.end(), RandomGenerator());

  mkl_dcsrtrsv(&uplo, &transa, &diag, &N, &A[0], &ia[0], &ja[0], &b[0], &x[0]);
  mkl_dcsrgemv(&transa, &N, &A[0], &ia[0], &ja[0], &x[0], &y[0]);
  cblas_daxpy(N, -1, &b[0], 1, &y[0], 1);
  std::cout << cblas_dnrm2(N, &y[0], 1) << '
';
}

As written, this code works for N < 4000. By works, I mean that it prints out a norm residual less than 10^-8. But if I try N = 4000, the norm residual shoots up to something on the order of 10^3 or 10^4.

Strangely, if I change transa to 'T', this code works for every N that I've tried, including N >= 4000. Unfortunately, I need this function to work for both transa = 'T' and transa = 'N'.

Can anyone suggest a way to fix this or a workaround?

Thanks,

Bill

bill_woessner · ‎07-13-2006

A quick followup. I downloaded the beta of MKL 9.0. It did not fix this problem. But I am very excited that it includes an implementation of GMRES.

--Bill

Chao_Y_Intel · ‎07-17-2006

Hi, Bill,

It seems to be an issue. Could it be possible for you provide one standard alone test code? I can put this problem into our premier support website and our support engineer will investigate this problem.

Regards,

Chao

bill_woessner · ‎07-17-2006

Here is a complete C++ program to reproduce the bug. Itoutputs ||Ax - b|| where A is sparse and upper triangular and x is computed using mkl_dcsrtrsv. The entries of both A and b, as well as the sparisty pattern of A, are random. You can vary the size of the matrix by setting N in main().

My test platform is Visual C++ 7.1, but I've used all standard C++ so it should compile just about anywhere (with the possible exception of the random generator; that's why I encapsulate random number generation in the first place!).

#include 
#include 
#include 
#include 

#include "mkl_cblas.h"
#include "mkl_spblas.h"

struct RandomGenerator
{
	double operator()()
	{
		return(2 * GetDouble() - 1);
	}

	static double GetDouble()
	{
		static const double skdRandMax(RAND_MAX + 1);

		return(std::rand() / skdRandMax);
	}

	static void Seed()
	{
		std::srand(static_cast(std::time(0)));
	}
};

void MakeRandomSparseMatrix(int N, double p, std::vector& A, std::vector& ia, std::vector& ja)
{
	int i, j, ENNZ(p * N * N);
	RandomGenerator r;

	A.clear();
	A.reserve(ENNZ);
	ia.resize(N + 1);
	ja.clear();
	ja.reserve(ENNZ);

	for(i = 0; i < N; ++i)
	{
		A.push_back(r());
		ja.push_back(i + 1);
		ia = A.size();
		for(j = i + 1; j < N; ++j)
		{
			if(r.GetDouble() < p)
			{
				A.push_back(r());
				ja.push_back(j + 1);
			}
		}
	}
	ia = A.size() + 1;
}

void TestDTRSV(int N, double p)
{
	std::vector A, b(N), x(N), y(N);
	std::vector ia, ja;
	char uplo('U'), transa('N'), diag('N');

	MakeRandomSparseMatrix(N, p, A, ia, ja);
	std::generate(b.begin(), b.end(), RandomGenerator());

	mkl_dcsrtrsv(&uplo, &transa, &diag, &N, &A[0], &ia[0], &ja[0], &b[0], &x[0]);
	mkl_dcsrgemv(&transa, &N, &A[0], &ia[0], &ja[0], &x[0], &y[0]);
	cblas_daxpy(N, -1, &b[0], 1, &y[0], 1);
	std::cout << cblas_dnrm2(N, &y[0], 1) << '
';
}

int main()
{
	static const int N(4000);

	RandomGenerator::Seed();

	TestDTRSV(N, 6. / N);

	return EXIT_SUCCESS;
}

bill_woessner · ‎07-26-2006

In case anyone is interested, here is a drop-in replacement for mkl_dcsrtrsv:

void dcsrtrsv(const char*   uplo,
			  const char*   transa,
			  const char*   diag,
			  const int*    N,
			  const double* A,
			  const int*    ia,
			  const int*    ja,
			  double*       x)
{
	int i, j;

	if(std::toupper(*uplo) == 'U')
	{
		if(std::toupper(*transa) == 'N')
		{
			for(i = *N - 1; i >= 0; --i)
			{
				for(j = ia; j < ia[i + 1] - 1; ++j)
				{
					x -= A * x[ja - 1];
				}
				x /= A[ia - 1];
			}
		}
		else
		{
			for(i = 0; i < *N; ++i)
			{
				x /= A[ia - 1];
				for(j = ia; j < ia[i + 1] - 1; ++j)
				{
					x[ja - 1] -= A * x;
				}
			}
		}
	}
	else
	{
		if(std::toupper(*transa) == 'N')
		{
			for(i = 0; i < *N; ++i)
			{
				for(j = ia - 1; j < ia[i + 1] - 2; ++j)
				{
					x -= A * x[ja - 1];
				}
				x /= A[ia[i + 1] - 2];
			}
		}
		else
		{
			for(i = *N - 1; i >= 0; --i)
			{
				x /= A[ia[i + 1] - 2];
				for(j = ia - 1; j < ia[i + 1] - 2; ++j)
				{
					x[ja - 1] -= A * x;
				}
			}
		}
	}
}

void dcsrtrsv(const char*   uplo,
			  const char*   transa,
			  const char*   diag,
			  const int*    N,
			  const double* A,
			  const int*    ia,
			  const int*    ja,
			  const double* b,
			  double*       x)
{
	std::copy(b, b + *N, x);

	dcsrtrsv(uplo, transa, diag, N, A, ia, ja, x);
}

This code doesn't respect the 'diag' flag so you cannot pass it a matrix without its diagonal elements. I figured handling these 4 cases was good enough. :-)

--Bill