JWagner
New Contributor I
487 Views

mkl_sparse_convert_csr: Memory ownership and memory leak


Hello,

I read the discussion (https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Releasing-Sparse-Matrix-Handle-and-A...) where the memory ownership of arrays for created matrices was discussed. Very nice description, by the way. I ran into a memory leak and have almost solved it, but sadly not completely. My test code is the following (compiled with the Intel DPC++ Compiler and linked with the sequential MKL library):

#include <cstdio>
#include <mkl.h>

const int _blocksize = 262144;
void CreateMaterialMatrix(float materialValue, float freeSpaceValue, sparse_matrix_t & materialMatrix)
{
	int * row_indx = new int[_blocksize];
	float * values = new float[_blocksize];
	int * col_indx = new int[_blocksize];

	float value = 1 / (materialValue*freeSpaceValue);

	for (int i = 0; i < _blocksize; i++)
	{
		values[i] = value;
		col_indx[i] = i;
		row_indx[i] = i;
	}
	sparse_index_base_t indexSchema = SPARSE_INDEX_BASE_ZERO;
	sparse_status_t stat = mkl_sparse_s_create_coo(&materialMatrix, indexSchema, _blocksize, _blocksize, _blocksize, row_indx, col_indx, values);

	mkl_sparse_convert_csr(materialMatrix, SPARSE_OPERATION_NON_TRANSPOSE, &materialMatrix);

	delete[] row_indx;
	delete[] values;
	delete[] col_indx;
	row_indx = nullptr;
	values = nullptr;
	col_indx = nullptr;
}

void CreateAndDestroyMatrix()
{
	int N_AllocatedBuffers;
	float materialValue = 1.f;
	float freespaceValue = 1.256637061E-6f;
	sparse_matrix_t materialMatrix;
	CreateMaterialMatrix(materialValue,freespaceValue, materialMatrix);
	mkl_sparse_destroy(materialMatrix);
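	// mkl_mem_stat returns a 64-bit byte count; call it before printf so the buffer count is set when it is read.
	long long bytes = mkl_mem_stat(&N_AllocatedBuffers);
	printf("After destroy: %lld bytes in %d buffers\n", bytes, N_AllocatedBuffers);
	mkl_free_buffers();
	bytes = mkl_mem_stat(&N_AllocatedBuffers);
	printf("After free buffer: %lld bytes in %d buffers\n", bytes, N_AllocatedBuffers);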
}
int main()
{
	//int success = mkl_disable_fast_mm();
	//std::string result = success == 1 ? "The Memory Allocator is successfully turned off" : "Turning the Memory Allocator off failed.";
	//std::cout << result << std::endl;
	int N_AllocatedBuffers;
	for (int x = 0; x < 1000; x++)
	{
		CreateAndDestroyMatrix();
	}
	int AllocatedBytes = mkl_mem_stat(&N_AllocatedBuffers);
	if (AllocatedBytes > 0) {
		printf("\nMKL memory leak!");
		printf("\nAfter mkl_free_buffers there are %d bytes in %d buffers",
			AllocatedBytes, N_AllocatedBuffers);
	}
}

 

As I understand it, mkl_sparse_convert_csr creates new arrays (rows_end, etc.), so the original arrays that I used to create the preceding matrix in COO format should be deallocated. Doing so indeed fixed most of my memory leak. But if I compare the heap memory before and after the loop in main, I still see an increase of 12 MB. There is also an increase if I don't call mkl_sparse_convert_csr, but in that case it is only 35 KB. So my first question is: where does this come from, and is there a way to fix it?
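To make the point about newly created arrays concrete, here is a minimal sketch of a COO-to-CSR conversion in plain C++. It is not MKL's implementation; `CsrMatrix` and `coo_to_csr` are hypothetical names used only for illustration. It shows why a converted matrix generally needs its own storage: the row-pointer array has a different length (rows + 1) than any COO array, and the column indices and values are reordered by row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for illustration only; not an MKL type.
struct CsrMatrix {
    std::vector<int> row_ptr;    // rows + 1 entries: start offset of each row
    std::vector<int> col_idx;    // nnz entries, grouped by row
    std::vector<float> values;   // nnz entries, same order as col_idx
};

// Convert zero-based COO triplets to CSR using a counting sort over rows.
// All three output arrays are freshly allocated; the inputs are not modified.
CsrMatrix coo_to_csr(int rows,
                     const std::vector<int>& row_idx,
                     const std::vector<int>& col_idx,
                     const std::vector<float>& values) {
    const std::size_t nnz = values.size();
    CsrMatrix csr;
    csr.row_ptr.assign(static_cast<std::size_t>(rows) + 1, 0);
    csr.col_idx.resize(nnz);
    csr.values.resize(nnz);

    // Count the entries of each row, stored one slot to the right.
    for (std::size_t k = 0; k < nnz; ++k) {
        ++csr.row_ptr[row_idx[k] + 1];
    }
    // Prefix sum turns the counts into row start offsets.
    for (int r = 0; r < rows; ++r) {
        csr.row_ptr[r + 1] += csr.row_ptr[r];
    }
    // Scatter each triplet into the next free slot of its row.
    std::vector<int> next(csr.row_ptr.begin(), csr.row_ptr.end() - 1);
    for (std::size_t k = 0; k < nnz; ++k) {
        const int dst = next[row_idx[k]]++;
        csr.col_idx[dst] = col_idx[k];
        csr.values[dst] = values[k];
    }
    return csr;
}
```

For the diagonal matrix in the question, such a conversion would produce a row-pointer array of `_blocksize + 1` entries next to copies of the column indices and values, which is consistent with the converted handle owning storage that is separate from the user-supplied COO arrays.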

I figured that mkl_free_buffers might also help, so I investigated its influence on the example. Even though I call it at the end of my CreateAndDestroyMatrix function, it seems to have no effect. The buffer size reported by mkl_mem_stat increases with each iteration and ends at 12568000 bytes in 3000 buffers. I may not have used the mkl_malloc functions, but I thought that if mkl_sparse_convert_csr creates its own arrays, the MKL memory allocator might be involved. Therefore I would love to know whether mkl_sparse_convert_csr really allocates new memory, and why I cannot free those buffers via mkl_free_buffers.

Thank you,

Jan


8 Replies
Kirill_V_Intel
Employee
477 Views

Hi Jan!

I haven't checked manually yet but I'm pretty sure the reason is that you are calling mkl_sparse_convert_csr in-place, with the output matrix handle being the same as the input matrix handle.

Our documentation allows it, which is wrong I think.
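A toy model of why an in-place conversion can leak a handle is sketched below. It is plain C++ and does not call MKL; `ToyMatrix`, `toy_create`, `toy_convert`, and `toy_destroy` are hypothetical names. It assumes the convert routine always allocates a fresh destination handle and never frees the source, which this thread does not confirm for MKL internals.

```cpp
#include <vector>

// Toy stand-ins for an opaque matrix handle API. Hypothetical; not MKL.
struct ToyMatrix {
    std::vector<float> values;
};
using ToyHandle = ToyMatrix*;

static int g_live_handles = 0;  // number of handles currently allocated

ToyHandle toy_create(const std::vector<float>& values) {
    ++g_live_handles;
    return new ToyMatrix{values};
}

// Assumed behaviour: allocates a new destination and leaves the source alone.
void toy_convert(ToyHandle source, ToyHandle* dest) {
    *dest = toy_create(source->values);
}

void toy_destroy(ToyHandle h) {
    --g_live_handles;
    delete h;
}

// In-place: the only pointer to the source handle is overwritten,
// so the source can never be destroyed afterwards.
int leaked_after_in_place() {
    g_live_handles = 0;
    ToyHandle a = toy_create({1.f, 2.f});
    toy_convert(a, &a);  // 'a' now refers to the converted copy
    toy_destroy(a);      // frees the copy only
    return g_live_handles;
}

// Out-of-place: both handles stay reachable and both are destroyed.
int leaked_after_out_of_place() {
    g_live_handles = 0;
    ToyHandle coo = toy_create({1.f, 2.f});
    ToyHandle csr = nullptr;
    toy_convert(coo, &csr);
    toy_destroy(coo);
    toy_destroy(csr);
    return g_live_handles;
}
```

Under these assumptions the in-place variant leaves one handle alive per call, while the out-of-place variant leaves none. Applied to the code in the question, the out-of-place pattern would mean passing a second sparse_matrix_t as the destination of mkl_sparse_convert_csr and calling mkl_sparse_destroy on both the COO handle and the CSR handle.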

Best,
Kirill

JWagner
New Contributor I
443 Views

Hi Kirill,

that helped indeed. The memory increases now only by 4 KB per iteration.

Regards,

Jan

Kirill_V_Intel
Employee
430 Views

Hi Jan,

That's an improvement! However I'm pretty sure there should be no increase for the situation you describe. 

I would make a wild guess that maybe you forgot to call mkl_sparse_destroy on the matrix handle in COO format. In that case you clean up the coordinate arrays and the CSR matrix handle, but some leftovers of the coordinate matrix handle (without arrays) created inside CreateMaterialMatrix accumulate.

If my guess is incorrect, please post/attach the updated code after you switched to out-of-place conversion.

Best,
Kirill

JWagner
New Contributor I
411 Views

Hi Kirill,

I do destroy the matrix handle of the COO format. For a better understanding, I have attached my code after fixing the matrix creation.

In my main function I simply call CreateAndDestroyMatrix repeatedly and analyse the memory behaviour before and after the function call with heap snapshots taken with the Visual Studio 2017 Diagnostic Tools.

Regards,

Jan

Khang_N_Intel
Employee
291 Views

Hi Jan,


I tried your code on 2 different systems and on oneMKL 2021.2 and oneMKL 2021.3.

I was able to reproduce the issues. I will escalate this issue shortly.


Best regards,

Khang


Khang_N_Intel
Employee
264 Views

Hi Jan,


Our MKL team is working very hard and was able to quickly identify the cause of the issue and is currently working on the fix for it. We hope to have the fix out very soon.


Note that this issue only occurs in oneMKL 2021.2 and 2021.3.


Best regards,

Khang



JWagner
New Contributor I
233 Views

Hi Khang,

thank you for your outstanding collaboration. Since this is a bug that is going to be fixed with a feature update, I am going to mark this topic as solved.

Regards,

Jan

Khang_N_Intel
Employee
190 Views

Hi Jan,


Thank you for your kind words!

Our engineer has been working very hard on the solution for this issue (thanks, Kirill). The fix will be in the next release of oneMKL, 2021.4.


Best regards,

Khang

