Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Memory management in vsl

Petros_M_
Beginner

Hi,

I am using vsldSSEditCorParameterization in debug mode (Windows) and am having problems with the debug heap validation step that Windows always performs on de-allocation: the program crashes.

If I understand the documentation properly, the "spectral" method diagonalizes the input matrix. Presumably this is done using LAPACK. If so, LAPACK requires user-provided memory to perform the task; since I do not provide any, I have to assume it is allocated internally.

The task should then clean this up at the end (either upon completion of the calculation or in vslSSDeleteTask).

Now, again if I understand this correctly, this is serviced by the MKL memory allocator, which sounds like it creates its own heap (again, on Windows).

I am not trying to dig into the internals of VSL, but I have a very elusive bug that depends on where the VSL allocation/de-allocation routines are called from. I have been chasing it for almost a day now (this is how I actually managed to determine that it is a memory issue).

Has anyone ever had this kind of problem? 

Am I missing something?  

Should I call mkl_thread_free_buffers before vslSSDeleteTask? I thought one should use it to collect the resources in case an exception is thrown.

Any kind of input would be greatly appreciated.

TIA,

Petros

 

7 Replies
Andrey_N_Intel
Employee

Hi Petros,

Please provide a test case and as many details as possible (OS, CPU, library version, build configuration, link line, etc.) so that we can reproduce the issue on our side.

The general recommendation is to call the routine that de-allocates memory resources after the last call to an Intel MKL function, e.g., after vslSSDeleteTask().
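One way to make that ordering hard to get wrong in C++ is to tie both cleanups to scope. The following is only a sketch: `stub_delete_task` and `stub_free_buffers` are hypothetical stand-ins (so that it compiles without MKL) for vslSSDeleteTask() and mkl_thread_free_buffers(), and `BufferGuard`, `TaskGuard` and `run_once` are illustrative names, not library API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the cleanup routines run.
static std::vector<std::string> g_log;

// Hypothetical stand-in for vslSSDeleteTask(&task).
void stub_delete_task(void** task) {
    g_log.push_back("delete_task");
    *task = nullptr;
}

// Hypothetical stand-in for mkl_thread_free_buffers().
void stub_free_buffers() {
    g_log.push_back("free_buffers");
}

// Releases the library buffers when it goes out of scope.
struct BufferGuard {
    BufferGuard() = default;
    BufferGuard(const BufferGuard&) = delete;
    BufferGuard& operator=(const BufferGuard&) = delete;
    ~BufferGuard() { stub_free_buffers(); }
};

// Owns a task handle and deletes it when it goes out of scope.
class TaskGuard {
public:
    explicit TaskGuard(void* task) : task_(task) {}
    TaskGuard(const TaskGuard&) = delete;
    TaskGuard& operator=(const TaskGuard&) = delete;
    ~TaskGuard() { if (task_) stub_delete_task(&task_); }
    void* get() const { return task_; }
private:
    void* task_;
};

// Runs one unit of work and returns the order of the cleanup calls.
std::vector<std::string> run_once() {
    g_log.clear();
    int dummy = 0;  // placeholder for a real task object
    {
        BufferGuard buffers;     // declared first, so destroyed last
        TaskGuard task(&dummy);  // declared second, so destroyed first
        // ... the edit and compute calls would go here ...
    }
    return g_log;
}
```

Because locals are destroyed in reverse order of declaration, the task is always deleted before the buffers are released, including when an exception leaves the scope.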

Andrey

Petros_M_
Beginner

Hi Andrey,

Thank you for the quick reply.

A code sample of the problem follows. There are two ways to obtain the task, selected by toggling a variable: either the caller provides one (useClientSideTask = true), or the function creates and destroys its own (useClientSideTask = false). Apologies for not making this command-line driven - too lazy I guess ;-)

The first one works like a charm. The second one "crashes" silently (does not print anything) most of the time.

Contrary to what I thought, one does not even need to "cross" the boundary of a static lib (as I had it before) for this to happen.

Win7, VS2010, Debug, Xeon E5-2670, MKL 11.3

TIA, P-

#include <mkl.h>
#include <algorithm>  // std::fill
#include <iostream>
#include <string>     // std::string
#include <boost/numeric/ublas/symmetric.hpp>

struct SampleAnalyzer {

	static
		void *
		create_task( MKL_INT nFactors ) {
			void * task ;
			MKL_INT status ;
			status = vsldSSNewTask( &task, &nFactors, 0, NULL, NULL, NULL, NULL ) ;
			return task ;
	}
	static
		void 
		destroy_task( void ** ptask ) {
			void * task = *ptask ;
			MKL_INT status = vslSSDeleteTask( &task ) ;
			mkl_thread_free_buffers() ;
	}
/************************************
	calculateNearestCorrelation :
************************************/
	static
		void
		calculateNearestCorrelation(
			MKL_INT		nF,
			double * 		pCorrelData,
			double const * 	pSymmData,
			void *			utask		= NULL 	// externally provided alternative :
		) {
			MKL_INT status ;
			MKL_INT stgFmt =  VSL_SS_MATRIX_STORAGE_L_PACKED ;
			void * task = ( utask == NULL ? create_task( nF ) : utask ) ;
			status = 
				vsldSSEditCorParameterization( task, pSymmData, &stgFmt, pCorrelData, &stgFmt ) ;
		    status = 
				vsldSSCompute( task, VSL_SS_PARAMTR_COR, VSL_SS_METHOD_SD ) ;
			if ( utask == NULL ) 
				destroy_task( &task ) ;
	}
} ;

int main() {
//	_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF | _CRTDBG_CHECK_ALWAYS_DF );
	const double one = double(1) ;
	typedef boost::numeric::ublas::symmetric_matrix<double> symm_matrix ;

	MKL_INT nFactors = 3 ;

	symm_matrix  symm( 3, 3 ) ; 
	symm(0,0) = one ;
	symm(1,1) = one ;
	symm(2,2) = one ;
	symm(1,0) = .25L ;	symm( 0, 1 ) = symm( 1, 0 ) ; 
	symm(2,0) = 0.6L ;  symm( 0, 2 ) = symm( 2, 0 ) ; 
	symm(2,1) = -0.8L ; symm( 1, 2 ) = symm( 2, 1 ) ; 
		
	symm_matrix corr( nFactors )  ;
	std::fill( corr.data().begin(), corr.data().end(), double(0) ) ;

	const size_t nF = size_t( nFactors ) ;

	double * const			pCorrelData = &corr( 0, 0 ) ;
	double const * const	pSymmData	= &symm( 0, 0 ) ;

// please, toggle this :
	bool useClientSideTask =  false ; // true ; // false ;
// MAIN CALL :
	{
		void * task = NULL ;
		if ( useClientSideTask )
			task = SampleAnalyzer::create_task( nFactors ) ; 
			
		SampleAnalyzer::calculateNearestCorrelation( nFactors, pCorrelData, pSymmData, task ) ;
		if ( useClientSideTask ) 
			SampleAnalyzer::destroy_task ( &task ) ;	
	}
	const double a = corr( 1, 0 ) ;
	const double b = corr( 2, 0 ) ;
	const double c = corr( 2, 1 ) ;

	const double leadingMinor1 = one - a * a ;
	const double leadingMinor2 = one - ( a * a + b * b + c * c )  + double(2) * a * b * c ;
	const bool isPositive = ( leadingMinor1 > double(0) ) && ( leadingMinor2 > double(0) ) ;
	std::string passed ( isPositive ? "passed" : "failed" ) ;
	std::cout << passed << std::endl ;
}
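The pass/fail test at the end of the sample is Sylvester's criterion specialised to a 3x3 symmetric matrix with unit diagonal and off-diagonal entries a, b, c. A minimal standalone sketch of the same check, without MKL or Boost, is given here; the names `Minors`, `leading_minors` and `is_positive_definite` are illustrative and not part of the original sample.

```cpp
#include <cassert>
#include <cmath>

// Leading principal minors of
//   [ 1 a b ]
//   [ a 1 c ]
//   [ b c 1 ]
// The 1x1 minor is always 1, so only the 2x2 and 3x3 minors are returned.
struct Minors {
    double m2;  // determinant of the top-left 2x2 block
    double m3;  // determinant of the full 3x3 matrix
};

Minors leading_minors(double a, double b, double c) {
    Minors m;
    m.m2 = 1.0 - a * a;
    m.m3 = 1.0 - (a * a + b * b + c * c) + 2.0 * a * b * c;
    return m;
}

// Sylvester's criterion: positive definite iff all leading minors are > 0.
bool is_positive_definite(double a, double b, double c) {
    const Minors m = leading_minors(a, b, c);
    return m.m2 > 0.0 && m.m3 > 0.0;
}
```

For the sample's input values a = 0.25, b = 0.6, c = -0.8, the 3x3 determinant is 1 - 1.0625 - 0.24 = -0.3025, so the input matrix itself fails the check. Note also that the check reads only the off-diagonal entries, so a = b = c = 0 (an output buffer that was never written) passes it as well.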

 

UPDATE :

After running it a few times, it turns out that even the version with the externally provided task can fail, although only erratically!

 

 

Andrey_N_Intel
Employee

Thanks, Petros, we will analyze it. Andrey

Vladislav_V_Intel

Hi Petros,
Thanks for providing the test case, which we used for the analysis.
We ran experiments using Intel MKL 11.3, Windows* 8, MSVS* 2013, and Boost* 1.56.
The example runs fine with the 64-bit version of the library in LP64 mode.
If the useClientSideTask parameter is set to true, the example also runs fine with the 32-bit version of the library; otherwise it results in “heap corruption” when the Boost matrices are destroyed on exit from main().

 

Do you see similar behavior of the test case in your environment with similar parameter settings and the same library version?

 

Thanks,
Vlad

Petros_M_
Beginner

Hi Vlad,

Yes, pretty much.

A couple of observations:

  • I have only run in debug mode (I do not know what happens in release) and only in 32-bit.
  • In case you missed it, I posted a note later mentioning that, after a few runs with the user-provided task, the problem appears there as well, although only erratically.
  • I have used Boost uBLAS for many years (>7) and never had a memory issue (and neither have many, many others, as you know).

Have you ever noticed the problem in a Release build?

If yes (even erratically), then the Windows debug heap is not the issue (I do not see how 64-bit vs. 32-bit can be of any relevance, just as I do not think that Win8 vs. Win7 is).

If I were to place a bet, however, I would think this to be the issue.

Update: As a reminder, heap corruption does not crash the program at the offending point but only later, which makes things even more obscure.
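One way to narrow the gap between the offending write and the crash is to ask the MSVC debug CRT to validate its heap more eagerly, along the lines of the commented-out _CrtSetDbgFlag call in the sample. The sketch below assumes an MSVC debug build; `enable_eager_heap_checks` and `heap_looks_intact` are illustrative names, and on other compilers or in release builds both functions are no-ops. It only covers blocks owned by the CRT debug heap, so memory that a library obtains from its own allocator may not be checked.

```cpp
#include <cassert>
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Ask the debug CRT to validate the heap on every allocation and
// de-allocation, and to report leaks at exit (MSVC debug builds only).
void enable_eager_heap_checks() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);  // read current flags
    flags |= _CRTDBG_ALLOC_MEM_DF | _CRTDBG_CHECK_ALWAYS_DF | _CRTDBG_LEAK_CHECK_DF;
    _CrtSetDbgFlag(flags);
#endif
}

// Returns true if the CRT debug heap currently passes validation.
// Always returns true where the debug CRT is not available.
bool heap_looks_intact() {
#if defined(_MSC_VER) && defined(_DEBUG)
    return _CrtCheckMemory() != 0;
#else
    return true;
#endif
}
```

Calling `heap_looks_intact()` after each library call (task creation, editor, compute, delete) would show which call first leaves the CRT heap in a bad state, rather than waiting for the failure at exit from main().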

Vladislav_V_Intel
Hi Petros,

In our environment the example runs fine in 32-bit release mode. I was also able to reproduce the failure you mentioned earlier after several runs: the example either crashes with an “access violation” inside the routine vsldSSCompute() or returns the VSL_SS_ERROR_ALLOCATION_FAILURE error code.

Another update is that the example runs into a memory corruption issue on Linux OS in 32-bit mode as well. The issue is being root-caused. Will keep you updated.

Best regards,
Vlad
Petros_M_
Beginner

Hi Vlad,

Thank you for the update.

So it seems, if I understand you correctly, that there is some issue within the VSL code, since it appears on both Linux and Windows.

In a strange way this is very good news, because the alternative would be much, much harder to crack ;-)

All the Best, P-

 
