Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

MKL DGEMM thread safety

aurora
Beginner
875 Views

 

When I run my threaded application (several threads calling Fortran subroutines that use the MKL BLAS routine DGEMM), I get the error "DGEMM parameter number X had an illegal value", where X can be 8, 10, ... and even 0! I'm sure that the DGEMM calls do not share any memory. Could this be heap corruption? How can I figure out what is going on? The error is reproduced only about once in a thousand executions, for example.

Thanks in advance

6 Replies
aurora
Beginner

Sorry,

Tested on 64-bit Linux, with the MKL shipped with Intel Fortran 12.1 and also with the latest MKL. I'm linking with mkl_intel_thread and calling mkl_domain_set_num_threads(0, MKL_ALL).

Why can't I edit my own posts 10 minutes later?

Zhang_Z_Intel
Employee

Did you dynamically allocate the memory for the data passed to DGEMM?

You can try Intel Inspector, which uncovers many memory-related bugs. A 30-day fully functional trial version is available if you haven't purchased a license: https://software.intel.com/en-us/intel-inspector-xe
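Alongside Inspector, one cheap way to narrow this down may be to check the integer arguments yourself right before each DGEMM call and log their actual values when the check fails. In DGEMM's argument list, positions 8 and 10 are LDA and LDB. The sketch below mirrors the entry tests of the reference BLAS DGEMM; that MKL applies equivalent tests is an assumption, and `dgemm_checks` and `dgemm_arg_info` are hypothetical names made up for this illustration.

```fortran
module dgemm_checks
  implicit none
contains

  ! Hypothetical helper: returns 0 when the arguments pass the tests that
  ! reference BLAS DGEMM applies on entry, otherwise the 1-based position
  ! of the first offending argument (the number XERBLA would report).
  integer function dgemm_arg_info(transa, transb, m, n, k, lda, ldb, ldc) &
      result(info)
    character(len=1), intent(in) :: transa, transb
    integer, intent(in) :: m, n, k, lda, ldb, ldc
    integer :: nrowa, nrowb

    ! Number of rows of A and B as stored in memory.
    nrowa = merge(m, k, scan(transa, 'Nn') > 0)
    nrowb = merge(k, n, scan(transb, 'Nn') > 0)

    info = 0
    if (scan(transa, 'NnTtCc') == 0) then
      info = 1
    else if (scan(transb, 'NnTtCc') == 0) then
      info = 2
    else if (m < 0) then
      info = 3
    else if (n < 0) then
      info = 4
    else if (k < 0) then
      info = 5
    else if (lda < max(1, nrowa)) then
      info = 8
    else if (ldb < max(1, nrowb)) then
      info = 10
    else if (ldc < max(1, m)) then
      info = 13
    end if
  end function dgemm_arg_info

end module dgemm_checks

program demo_dgemm_check
  use dgemm_checks
  implicit none
  integer :: info
  integer :: m, n, k, lda, ldb, ldc

  ! Illustrative values only: lda is deliberately too small for a 4x4 A.
  m = 4; n = 4; k = 4; lda = 2; ldb = 4; ldc = 4

  info = dgemm_arg_info('N', 'N', m, n, k, lda, ldb, ldc)
  if (info /= 0) then
    print '(a,i0)', 'DGEMM argument check failed at position ', info
    print '(a,6(1x,i0))', 'm n k lda ldb ldc =', m, n, k, lda, ldb, ldc
  end if
end program demo_dgemm_check
```

If this check passes on the caller's side while MKL still reports an illegal parameter, the values are presumably being changed between the check and the call, which would point toward a race or memory corruption rather than a genuinely wrong argument.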

 

 

aurora
Beginner

Yes,

I've used Inspector (on Windows) to detect data races. It reports some data races inside MKL, for example when two threads run this Fortran code:

allocate(M_INV(A,A))

! FILL M_INV
!  .....

CALL DPOTRI( 'U', N, M_INV, A, INFO )  ! <- Inspector reports a data race here

Could this be an installation problem?

TimP
Honored Contributor III

You need to ensure that each thread calling MKL has its own threadprivate copy of any procedure arguments that differ among threads or may be modified within MKL. Otherwise, you need to put a critical or single region around the suspect MKL call.
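As a rough illustration of the first option, here is a minimal sketch in which every thread allocates and fills its own matrices inside the parallel region, so no DGEMM argument is shared. It assumes the threads are OpenMP threads, the LP64 interface (default 4-byte integers), and a link against MKL or another BLAS; the matrix order `n` and the program name are made up for the example.

```fortran
program private_dgemm_args
  use omp_lib, only: omp_get_thread_num
  implicit none
  integer, parameter :: n = 64   ! hypothetical matrix order
  double precision, allocatable :: a(:,:), b(:,:), c(:,:)
  external :: dgemm

  ! The arrays are unallocated on entry, so each thread starts with its
  ! own unallocated private copy and allocates its own storage.
  !$omp parallel private(a, b, c)
  allocate(a(n,n), b(n,n), c(n,n))
  a = 1.0d0
  b = 2.0d0
  c = 0.0d0

  ! Every argument is a literal, a constant, or thread-private data.
  call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)

  ! Serialize only the output so that lines do not interleave.
  !$omp critical (report)
  print '(a,i0,a,f8.1)', 'thread ', omp_get_thread_num(), ': c(1,1) = ', c(1,1)
  !$omp end critical (report)

  deallocate(a, b, c)
  !$omp end parallel
end program private_dgemm_args
```

With the Intel compiler this would typically be built with OpenMP enabled and MKL linked in. The point of the sketch is only that dimensions and leading dimensions passed to MKL should not live in variables that another thread can write to.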

aurora
Beginner

Yes, I know. In the case of DPOTRI, the arguments are characters, integers, and one double-precision array that I allocate just before the call, so I don't think any memory is shared here.

If I then deallocate M_INV and another thread calls allocate(M_INV), could the runtime hand out the same memory address for the new allocation, so that Inspector detects a data race?

Why is the runtime telling me that parameter 0 is an illegal argument?

aurora
Beginner

I've tried placing all DGEMM calls inside critical sections (there are many other LAPACK calls, such as DPOTRI), and now the error is gone.

I'm using the mkl_intel_thread version. Why does this happen only with DGEMM? Inspector doesn't report anything about the DGEMM calls.
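For reference, the workaround described above can be sketched as a thin wrapper that routes every DGEMM call through one named critical section. This assumes the calling threads are OpenMP threads (with another threading library a mutex would be needed instead) and the LP64 interface; `dgemm_serialized` and `dgemm_lock` are hypothetical names, not part of MKL.

```fortran
! Hypothetical wrapper: same argument list as DGEMM, but only one
! thread at a time is allowed to be inside the call.
subroutine dgemm_serialized(transa, transb, m, n, k, alpha, a, lda, &
                            b, ldb, beta, c, ldc)
  implicit none
  character(len=1), intent(in) :: transa, transb
  integer, intent(in) :: m, n, k, lda, ldb, ldc
  double precision, intent(in) :: alpha, beta
  double precision, intent(in) :: a(lda,*), b(ldb,*)
  double precision, intent(inout) :: c(ldc,*)
  external :: dgemm

  ! A named critical section is shared by all callers of this wrapper,
  ! so DGEMM calls from different threads cannot overlap.
  !$omp critical (dgemm_lock)
  call dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
  !$omp end critical (dgemm_lock)
end subroutine dgemm_serialized
```

Serializing the calls costs parallelism and only hides the symptom, so it is probably better treated as a diagnostic step than as a fix.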
