We recently transitioned our code to use the MKL Data Fitting splines and have since noticed a significant issue with memory usage.
In our largest examples we allocate in excess of 1e6 splines through the MKL spline library, and we are seeing an additional overhead of ~100GB of RAM from the MKL splines alone.
Previously we used our own code for the natural cubic spline, and we have compared the two implementations directly to isolate the memory increase. The additional memory appears to be allocated within dfdNewTask1D.
The splines themselves are nothing too complicated or large; they typically consist of ~27 knots and use the natural cubic spline method.
I have tried disabling the fast memory management and freeing unused MKL memory, as discussed at avoiding-memory-leaks-in-intel-mkl.html. These options had no impact on the overall memory usage.
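Concretely, the calls I tried were along these lines (assuming the standard service functions from mkl_service.h):

```c
#include "mkl_service.h"   /* MKL memory-management service functions */

void shrink_mkl_memory(void)
{
    /* Make MKL use malloc/free directly instead of its own fast memory
     * manager; only takes full effect if called before MKL's first
     * allocation. */
    mkl_disable_fast_mm();

    /* Release any memory MKL's internal allocator is still holding. */
    mkl_free_buffers();
}
```

Since neither call changed the footprint, the memory presumably belongs to the live Data Fitting tasks themselves rather than to MKL's internal buffers.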
Is there anything that can be done to reduce the memory overhead of the MKL data fitting splines?
Is this memory usage expected?
Yes, MKL is used elsewhere in the tool. For example, the LAPACK library.
Within the cubic spline code, I restrict the MKL include to just "mkl_df.h".
Glad to see new topics from you!
Could you please provide a few more details regarding the following points:
>> In our largest examples we can allocate in excess of 1e6 splines using the MKL spline library and we are seeing an additional overhead of ~100GB of RAM from just the MKL splines.
Do you create a new Data Fitting task for each spline? And do you use the dfDeleteTask() routine to free memory once a task is no longer needed?
It would also be great to know the Data Fitting task parameters such as “nx”, “ny”, and “yhint”.
Your answers will help me understand and reproduce the observed problem.
I'm glad you're happy to see me; I wasn't sure how keen you would be to hear from me again!
OK, let me answer your questions and give you enough detail to reproduce the issue:
If you have any more questions then let me know.
Thank you for the details you provided.
Gennady and I are checking this on our side.
Preliminarily, I can say that about 16GB is used for the Data Fitting tasks in the case of 1e6 splines.
Some temporary memory is also allocated during the interpolation routine call, but it is freed at the end of the call.
If all 1e6 splines use the same x-coordinates with different function values, I would suggest using a single Data Fitting task with a vector-valued function.
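To make that concrete, a sketch of the single-task, vector-valued setup (the storage hint and sizes here are assumptions to adapt to your data layout; error handling is abbreviated):

```c
#include <stdlib.h>
#include "mkl_df.h"

#define NX 27        /* knots shared by all splines */
#define NY 1000000   /* number of splines = coordinates of the vector-valued function */

int build_all_splines(const double x[NX],
                      const double *y       /* NY x NX function values, row-major */,
                      double *scoeff        /* NY * (NX-1) * DF_PP_CUBIC doubles */)
{
    DFTaskPtr task;
    int status;

    /* One Data Fitting task for all NY splines instead of NY tasks. */
    status = dfdNewTask1D(&task, NX, x, DF_NO_HINT,
                          NY, y, DF_MATRIX_STORAGE_ROWS);
    if (status != DF_STATUS_OK) return status;

    /* Natural cubic spline with free-end boundary conditions. */
    status = dfdEditPPSpline1D(task, DF_PP_CUBIC, DF_PP_NATURAL,
                               DF_BC_FREE_END, NULL, DF_NO_IC, NULL,
                               scoeff, DF_NO_HINT);
    if (status == DF_STATUS_OK)
        status = dfdConstruct1D(task, DF_PP_SPLINE, DF_METHOD_STD);

    /* Keep the task around for dfdInterpolate1D calls; delete it when
     * interpolation is done. The coefficients in the user-owned scoeff
     * array remain valid afterwards. */
    dfDeleteTask(&task);
    return status;
}
```

With this layout the memory scales with the coefficient array (NY * (NX-1) * 4 doubles, roughly 0.8GB for these sizes) rather than with per-task overhead.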
In any case, we will continue investigating.
Ok, so you don't seem to be seeing nearly the same issue with memory usage.
I probably should have mentioned this above, but I am using MKL 2019_U5.
What version of MKL are you guys testing with?
We tried to emulate the case you reported and measure the amount of memory consumed by MKL. Please take a look at the attached code example.
Here is what we see (MKL 2020 U4, OpenMP threaded version, LP64 mode, RHEL 7):
$ ./a.out 1000000    <--- the input is the number of splines
Number of Splines == 1000000
Peak memory allocated by Intel(R) MKL allocator : 17208004048 bytes.
This means that MKL itself allocates only about 17 GB of memory.
If your usage model is different, could you give us a reproducer that shows the 100 GB of memory consumed by MKL?
Thanks, I will have a look at the test case and try to repeat on my side.
The only difference I can immediately see is that I use only the serial MKL library. Our software makes heavy use of TBB parallelism, so we try to stick with the serial MKL functions.
Just a quick update, as I'm on holiday next week...
I repeated your test case in my build environment and can confirm that I see the same memory usage of ~16GB per 1e6 splines.
I also quickly threw together a test case based on my MKL spline class (a wrapper for your library calls) and see the same memory usage there.
So I will run another memory analysis on our full software to get more information on the memory issue. If possible, I will produce either a reproducer or at least a better understanding, and work backwards from there.
TL;DR: I'm on holiday next week; when I return I will pick this up and figure out what is unique about my full use case.