When I run my threaded application (several threads calling Fortran subroutines that use the MKL BLAS routine DGEMM), I'm getting the error "DGEMM parameter number X had an illegal value", where X can be 8, 10, ... and even 0! I'm sure that I'm not sharing memory among the DGEMM calls. Could this be heap corruption? How can I figure out what is going on? (It only reproduces roughly once in a thousand runs.)
Thanks in advance
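One way to narrow this down is to validate the integer and character arguments yourself immediately before each DGEMM call and log them (together with the thread id) when a check fails. The sketch below mirrors the argument checks of the reference BLAS DGEMM; that MKL applies the same checks is an assumption, and the names `dgemm_check` and `dgemm_arg_check` are hypothetical. In the reference routine, position 8 is LDA and position 10 is LDB, and none of these checks ever produces 0.

```fortran
module dgemm_check
  implicit none
contains
  ! Returns 0 if the arguments pass the checks done by the reference BLAS
  ! DGEMM, otherwise the 1-based position of the first offending argument.
  ! Assumption: MKL's DGEMM validates its arguments the same way.
  integer function dgemm_arg_check(transa, transb, m, n, k, lda, ldb, ldc) &
      result(info)
    character(len=1), intent(in) :: transa, transb
    integer, intent(in) :: m, n, k, lda, ldb, ldc
    logical :: nota, notb
    integer :: nrowa, nrowb

    nota = (transa == 'N' .or. transa == 'n')
    notb = (transb == 'N' .or. transb == 'n')
    nrowa = merge(m, k, nota)   ! rows of A as stored
    nrowb = merge(k, n, notb)   ! rows of B as stored

    info = 0
    if (index('NnTtCc', transa) == 0) then
      info = 1
    else if (index('NnTtCc', transb) == 0) then
      info = 2
    else if (m < 0) then
      info = 3
    else if (n < 0) then
      info = 4
    else if (k < 0) then
      info = 5
    else if (lda < max(1, nrowa)) then
      info = 8
    else if (ldb < max(1, nrowb)) then
      info = 10
    else if (ldc < max(1, m)) then
      info = 13
    end if
  end function dgemm_arg_check
end module dgemm_check

program check_demo
  use dgemm_check
  implicit none
  integer :: info

  ! LDA = 2 is smaller than M = 4 with TRANSA = 'N', so position 8 is flagged.
  info = dgemm_arg_check('N', 'N', 4, 4, 4, 2, 4, 4)
  if (info /= 0) print '(a,i0)', 'bad DGEMM argument number ', info
end program check_demo
```

If the check fails in the real application, printing M, N, K, LDA, LDB and LDC at that point shows whether one of the integers changed between being set and being passed, which would suggest a variable that is shared between threads after all.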
Sorry,
Tested on 64-bit Linux, with Intel Fortran 12.1 and its MKL, and also with the latest MKL. I'm linking with mkl_intel_thread and calling mkl_domain_set_num_threads(0, MKL_ALL).
Why can't I edit my own posts 10 minutes later?
Did you dynamically allocate the memory for the data passed to DGEMM?
You can try "Intel Inspector" to uncover many memory-related bugs. You can download a fully functional 30-day trial version if you haven't purchased a license: https://software.intel.com/en-us/intel-inspector-xe
Yes,
I've used Inspector (on Windows) to detect data races. It reports some data races inside MKL, for example when two threads run this Fortran code:
allocate(M_INV(A,A))
! FILL M_INV
! .....
CALL DPOTRI( 'U', N, M_INV, A, INFO ) ! <- data race reported here
Could this be an installation problem?
You would need to ensure that each instance of the MKL call has its own threadprivate copy of any procedure arguments that differ among threads or may be modified within MKL. Otherwise, you would need to put a critical or single construct around the suspect MKL call.
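As a minimal sketch of the first suggestion, assuming the threads come from OpenMP (the thread does not say which threading model the application uses): every argument of the DGEMM call is a local variable of the subroutine, so each thread works on its own copy. The names `private_args` and `multiply_private` are hypothetical.

```fortran
! Build with OpenMP enabled and link against MKL or another BLAS.
module private_args
  implicit none
contains
  subroutine multiply_private(n, checksum)
    integer, intent(in) :: n
    double precision, intent(out) :: checksum
    ! Locals without SAVE are created per call, so each thread gets its own.
    ! Note that initialising a local in its declaration implies SAVE and
    ! would make it shared between threads.
    double precision, allocatable :: a(:,:), b(:,:), c(:,:)
    double precision :: alpha, beta
    external :: dgemm

    allocate(a(n,n), b(n,n), c(n,n))
    a = 1.0d0
    b = 2.0d0
    c = 0.0d0
    alpha = 1.0d0
    beta = 0.0d0
    call dgemm('N', 'N', n, n, n, alpha, a, n, b, n, beta, c, n)
    checksum = sum(c)
    deallocate(a, b, c)
  end subroutine multiply_private
end module private_args

program private_args_demo
  use private_args
  implicit none
  integer, parameter :: ntasks = 8
  integer :: i
  double precision :: s(ntasks)

  ! The loop index is private; each iteration writes a distinct element of s.
  !$omp parallel do
  do i = 1, ntasks
    call multiply_private(16, s(i))
  end do
  !$omp end parallel do

  print *, s
end program private_args_demo
```

Module variables, COMMON blocks and SAVEd locals are the usual places where a dimension or leading-dimension integer ends up shared without being obvious at the call site.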
Yes, I know. In the case of DPOTRI, the arguments are characters, integers, numbers, and one double precision array that I'm allocating just before the call, so I don't think any memory is shared here.
If I then deallocate M_INV and another thread allocates its own M_INV, could the runtime be handing out the same memory address for the new allocation, so that Inspector reports a data race?
Why is the runtime telling me that parameter 0 is an illegal argument?
I've tried wrapping all DGEMM calls in critical sections (the code has many other LAPACK calls, such as DPOTRI), and now there is no error.
I'm using the mkl_intel_thread version. Why is this only happening with DGEMM? Inspector doesn't tell me anything about the DGEMM calls.
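For reference, the workaround described above might look like the following, assuming OpenMP threads; the program name and the critical-section name `dgemm_lock` are hypothetical. Only one thread at a time can be inside the named critical section, so the DGEMM calls are serialised.

```fortran
! Build with OpenMP enabled and link against MKL or another BLAS.
program critical_dgemm_demo
  implicit none
  integer, parameter :: n = 16, ntasks = 8
  integer :: i
  double precision :: a(n,n), b(n,n), c(n,n), s(ntasks)
  external :: dgemm

  ! Each thread gets its own a, b and c through the private clause.
  !$omp parallel do private(a, b, c)
  do i = 1, ntasks
    a = 1.0d0
    b = dble(i)
    ! Only one thread at a time executes this named critical section.
    !$omp critical (dgemm_lock)
    call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
    !$omp end critical (dgemm_lock)
    s(i) = sum(c)
  end do
  !$omp end parallel do

  print *, s
end program critical_dgemm_demo
```

This costs parallelism in the matrix products, and an error that disappears under a critical section does not by itself show where the conflict is, so it is better treated as a diagnostic step than as a fix.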