I built a small C# project that calls some VML functions:
Abs, Arg, Add, Mult, Cos, Sin
I ran these functions from multiple threads: about 3000 work items on the .NET thread pool, which uses only 4 threads.
Sometimes an exception occurs: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
I attached the project and the exception screenshot.
I cannot reproduce your issue, but I have an idea of what could be going wrong.
In your code you call the VML functions with a mode parameter (the last argument). That parameter has to be a 64-bit integer (please see the documentation), but your code declares the functions as taking a 32-bit integer. Since we use the cdecl convention, we read the arguments from the stack. So our functions may be referencing memory that you haven't allocated, or may be touching the security cookie that the .NET compiler places there, which would explain the exception. I may have a different compiler version and may not have the same security circumstances as you have.
So please try to correct the error in your code and see if it helps. Thanks,
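To make the width mismatch concrete, here is a minimal sketch in plain C. It does not call MKL; the helper name `mode_seen_by_callee` is hypothetical and only models a callee that reads 64 bits from a slot where the caller wrote 32, assuming the passed value lands in the low half.

```c
#include <stdint.h>

/* Hypothetical model, not an MKL function: the callee reads a 64-bit mode,
   but the caller supplied only the low 32 bits. `adjacent` stands in for
   whatever happens to occupy the neighbouring 4 bytes. */
uint64_t mode_seen_by_callee(uint32_t passed_mode, uint32_t adjacent)
{
    return ((uint64_t)adjacent << 32) | (uint64_t)passed_mode;
}
```

With a zero neighbour the callee still sees the intended value, which may be why the problem would only show up intermittently; with any other neighbour it sees a different 64-bit value.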
I continued testing with v?Abs and v?Arg, and it seems that vsAbs and vsArg work fine, but when I use vcAbs and vcArg I get the exception.
Is there a way to pass 2 float arrays instead of 1 MKL_Complex8 array?
I'm attaching the project after some modifications - now we call only vcAbs and vcArg.
I ran it on a 64-bit platform.
* The exception is not always thrown on the first run - it usually takes 4-5 runs until we get the exception.
I took a look at your new code and it still works fine for me. I guess at this point I need to know which version of Visual Studio you use, and I would ask you to try to find the minimum vector lengths and number of threads at which the exceptions still occur. I'd also suggest experimenting with the internal threading of VML: try to disable it by setting an environment variable: set MKL_NUM_THREADS=1.
To quickly answer your other question: yes, you can use the v?Hypot function instead of the complex absolute value, and v?Atan2 instead of the complex argument. These accept real arrays.
Thank you for the reply.
I replaced the Abs and Arg functions with Hypot and Atan2 and that works fine for me.
Just one last question to make sure - if I want to use the mode for the fastest (less accurate) calculation, I should use 3, correct?
Yes, currently VML_EP mode is the fastest and it is set to 3. Of course I'd encourage you to use the VML header files and macros as much as you can, to avoid problems with updates to new library releases.
One more aspect if you want maximum performance: the mode parameter passed to every function call is meant for those who need granular control over the functions' accuracy. If you are fine with EP mode everywhere, I'd suggest that you call the vmlSetMode() function once before calling any of the computational functions and then use the functions without the mode parameter - this should save some start-up time for VML.
You should align your arrays to at least 32 bytes for best performance on AVX/AVX2 capable machines. You may also want to reuse your buffers and pass the same pointer as the input and output array (e.g. Atan2 could return its result overwriting either the x or the y input vector - just make sure the sizes of the arrays agree).
Thank you for the good tips.
Regarding the mode configuration of VML - all the VMLNative functions that I use are static. If I configure the mode value once, does it take effect for all the VMLNative function calls?
When I use the DFTI library I create a descriptor, configure it, and use it for all the FFT and IFFT operations.
In the VML library I found no such descriptor.
I discovered that the VML library doesn't free memory - you can try running the attached project and see that although I allocated memory on the stack, the memory keeps growing constantly. I suspect that the VML library doesn't free its allocated data.
Is there any function to call in order to free the allocated memory?
I have tried to run the MKLFreeTls function but without success - can you give me an example of how to use this function?
On the same subject, I tried to limit the number of threads MKL is using by calling the function mkl_set_num_threads, but I keep getting this error:
"Attempted to read or write protected memory. This is often an indication that other memory is corrupt." Any idea why? (I set 4 as the value.)
I do not see the memory growing constantly. In my case it grows up to maybe 18MB and then it stops and clears. Actually VML is not supposed to allocate memory - we use the memory that you pass to us via pointers.
You may want to use mkl_thread_free_buffers() after you have finished with MKL calls in the current thread, or mkl_free_buffers() after you are done with MKL in your app as a whole. I don't think MKLFreeTLS is for your case.
Have you tried taking VML out of the equation and supplying your own functions instead? E.g. write a trivial loop in native C that emulates the VML behavior, just to see if the exceptions still occur.
Every time I try to call one of these functions:
I get an exception: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
All other functions work fine - do you know why I get this exception? Am I using MKL correctly?
We are really looking forward to integrating MKL in our project, but it seems that we need some support.
Is there an option to do a remote session so we can show support all our issues:
* increasing memory.
* increasing number of threads.
We are willing to pay for this support.
Now this is something I can reproduce. If I modify your code to call one of the above functions, it raises exceptions even with no other calls to MKL. I will pass this on to the other developers, so hopefully they can get back to you soon.
We are still having some issues with the growing number of threads and memory usage.
I'm attaching a project so you can see the growth in memory and threads.
At the end of the project the threads and memory are not freed as I expected.
Please run this project and try to reproduce it.
Do you have any suggestions for how to reduce the memory and threads after we finish using the library?
What version of MKL are you using? If it is 11.2, you should have received a notification when Update 4 of MKL 11.2 was released (a few weeks ago); by following the link in that notification you can download the update.