Hi,
I have a C++ application which is coded this way:
- The main program does not need much memory (just a few variables). But this main program runs a loop in which we call a function.
- This function needs about 140 MB of memory to run. The memory is allocated in the function and then released (using RAII).
When I run this program overnight on OS X, here is the memory consumption reported by "Activity Monitor" or "top":
- After the first loop, the program takes 150 MB of memory
- After 68 loops, the program takes 220 MB of memory
- After 394 loops, the program takes 480 MB of memory
So it seems that the function, which allocates and deallocates 140 MB of memory, "leaks" about 1 MB each time it is called. In this function, the allocated objects are:
- My own versions of std::vector, which I call il::Vector, il::Matrix and il::Tensor. I have used these classes in other codes and they seem fine.
- A class that calls Pardiso from the MKL. Using RAII, I take care of properly deallocating Pardiso memory before I destroy the class (using the phase -1).
I have run Pointer Checker from Intel (on a Linux workstation) and Address Sanitizer from Clang on the program (with smaller inputs, though) and they don't detect anything. I don't really know what to do. Could memory fragmentation be responsible for this?
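As a quick sanity check of the "about 1 MB per call" estimate, the three readings above can be fitted with a straight line. This is only a sketch: the names Sample, growth_per_loop and reported_samples are made up for illustration, and the only data used are the three measurements quoted in the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reading taken from Activity Monitor / top:
// number of loops completed and memory used by the process, in MB.
struct Sample {
  double loops;
  double memory_mb;
};

// Least-squares slope of memory against loop count, in MB per loop.
double growth_per_loop(const std::vector<Sample>& samples) {
  assert(samples.size() >= 2);
  double mean_x = 0.0;
  double mean_y = 0.0;
  for (const Sample& s : samples) {
    mean_x += s.loops;
    mean_y += s.memory_mb;
  }
  mean_x /= samples.size();
  mean_y /= samples.size();

  double sxy = 0.0;
  double sxx = 0.0;
  for (const Sample& s : samples) {
    sxy += (s.loops - mean_x) * (s.memory_mb - mean_y);
    sxx += (s.loops - mean_x) * (s.loops - mean_x);
  }
  return sxy / sxx;
}

// The three measurements quoted in the post (hypothetical helper).
std::vector<Sample> reported_samples() {
  return {{1.0, 150.0}, {68.0, 220.0}, {394.0, 480.0}};
}
```

Calling growth_per_loop(reported_samples()) gives a slope somewhat below 1 MB per loop, which is in line with the rough estimate. With more readings, a roughly constant slope would point to a steady per-call leak rather than a one-off allocation.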
I think that I have found the culprit: the MKL. I use Pardiso, and the following example leaks very slowly: about 0.1 MB every 13 seconds, which leads to 280 MB overnight. This matches the numbers I get from my simulation.
If you want to give it a try, you can compile it with:
icpc -std=c++11 pardiso-leak.cpp -o main -lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -liomp5 -ldl -lpthread -lm
The short version of the code that leaks memory is:
#include <cstddef>
#include <iostream>
#include <vector>
#include "mkl_pardiso.h"
#include "mkl_types.h"

int main(int argc, char const *argv[]) {
  const auto n = std::size_t{1000};
  auto m = MKL_INT{n * n};

  // Sparse matrix in CSR format with 1-based indices
  auto values = std::vector<double>();
  auto column = std::vector<MKL_INT>();
  auto row = std::vector<MKL_INT>();
  row.push_back(1);
  // First block of rows: diagonal and upper off-diagonal block
  for (std::size_t j = 0; j < n; ++j) {
    column.push_back(j + 1);
    values.push_back(1.0);
    column.push_back(j + n + 1);
    values.push_back(0.1);
    row.push_back(column.size() + 1);
  }
  // Interior blocks of rows: lower, diagonal and upper blocks
  for (std::size_t i = 1; i < n - 1; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      column.push_back(n * i + j - n + 1);
      values.push_back(0.1);
      column.push_back(n * i + j + 1);
      values.push_back(1.0);
      column.push_back(n * i + j + n + 1);
      values.push_back(0.1);
      row.push_back(column.size() + 1);
    }
  }
  // Last block of rows: lower off-diagonal block and diagonal
  for (std::size_t j = 0; j < n; ++j) {
    column.push_back((n - 1) * n + j - n + 1);
    values.push_back(0.1);
    column.push_back((n - 1) * n + j + 1);
    values.push_back(1.0);
    row.push_back(column.size() + 1);
  }

  auto y = std::vector<double>(m, 1.0);
  auto x = std::vector<double>(m, 0.0);

  auto pardiso_nrhs = MKL_INT{1};
  auto pardiso_max_fact = MKL_INT{1};
  auto pardiso_mnum = MKL_INT{1};
  auto pardiso_mtype = MKL_INT{11};  // real, nonsymmetric matrix
  auto pardiso_msglvl = MKL_INT{0};

  MKL_INT pardiso_iparm[64];
  for (int i = 0; i < 64; ++i) {
    pardiso_iparm[i] = 0;
  }
  pardiso_iparm[0] = 1;
  pardiso_iparm[1] = 2;
  pardiso_iparm[3] = 0;
  pardiso_iparm[4] = 0;
  pardiso_iparm[5] = 0;
  pardiso_iparm[7] = 0;
  pardiso_iparm[8] = 0;
  pardiso_iparm[9] = 13;
  pardiso_iparm[10] = 1;
  pardiso_iparm[11] = 0;
  pardiso_iparm[12] = 1;
  pardiso_iparm[17] = -1;
  pardiso_iparm[18] = 0;
  pardiso_iparm[20] = 0;
  pardiso_iparm[23] = 1;
  pardiso_iparm[24] = 0;
  pardiso_iparm[26] = 0;
  pardiso_iparm[27] = 0;
  pardiso_iparm[30] = 0;
  pardiso_iparm[31] = 0;
  pardiso_iparm[32] = 0;
  pardiso_iparm[33] = 0;
  pardiso_iparm[34] = 0;
  pardiso_iparm[59] = 0;
  pardiso_iparm[60] = 0;
  pardiso_iparm[61] = 0;
  pardiso_iparm[62] = 0;
  pardiso_iparm[63] = 0;

  void* pardiso_pt[64];
  for (int i = 0; i < 64; ++i) {
    pardiso_pt[i] = nullptr;
  }

  auto error = MKL_INT{0};
  MKL_INT i_dummy;
  double d_dummy;

  // Phase 11: analysis
  auto phase = MKL_INT{11};
  PARDISO(pardiso_pt, &pardiso_max_fact, &pardiso_mnum, &pardiso_mtype, &phase,
          &m, values.data(), row.data(), column.data(), &i_dummy, &pardiso_nrhs,
          pardiso_iparm, &pardiso_msglvl, &d_dummy, &d_dummy, &error);

  // Phase 22: numerical factorization
  phase = 22;
  PARDISO(pardiso_pt, &pardiso_max_fact, &pardiso_mnum, &pardiso_mtype, &phase,
          &m, values.data(), row.data(), column.data(), &i_dummy, &pardiso_nrhs,
          pardiso_iparm, &pardiso_msglvl, &d_dummy, &d_dummy, &error);

  // Phase 33: solve, repeated many times; this is where memory grows
  phase = 33;
  for (std::size_t i = 0; i < 10000; ++i) {
    std::cout << "i = " << i << std::endl;
    PARDISO(pardiso_pt, &pardiso_max_fact, &pardiso_mnum, &pardiso_mtype,
            &phase, &m, values.data(), row.data(), column.data(), &i_dummy,
            &pardiso_nrhs, pardiso_iparm, &pardiso_msglvl, y.data(), x.data(),
            &error);
  }

  // Phase -1: release all internal Pardiso memory
  phase = -1;
  PARDISO(pardiso_pt, &pardiso_max_fact, &pardiso_mnum, &pardiso_mtype, &phase,
          &m, values.data(), row.data(), column.data(), &i_dummy, &pardiso_nrhs,
          pardiso_iparm, &pardiso_msglvl, &d_dummy, &d_dummy, &error);

  return 0;
}
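To watch the growth from inside the solve loop instead of from Activity Monitor, one option is to print the peak resident memory every few iterations. The helper below is a sketch and not part of the original reproducer: peak_rss_mb is a made-up name, and it relies on the POSIX getrusage call, whose ru_maxrss field is reported in bytes on OS X and in kilobytes on Linux.

```cpp
#include <sys/resource.h>

#include <cassert>

// Peak resident set size of the current process, in megabytes.
// Hypothetical helper for illustration; assumes a POSIX system.
double peak_rss_mb() {
  struct rusage usage = {};
  const int status = getrusage(RUSAGE_SELF, &usage);
  assert(status == 0);
  (void)status;  // silence the unused-variable warning when NDEBUG is set
#if defined(__APPLE__)
  // OS X reports ru_maxrss in bytes.
  return static_cast<double>(usage.ru_maxrss) / (1024.0 * 1024.0);
#else
  // Linux reports ru_maxrss in kilobytes.
  return static_cast<double>(usage.ru_maxrss) / 1024.0;
#endif
}
```

Printing peak_rss_mb() next to "i = ..." in the phase 33 loop gives a log that can be compared between machines. Since this is a peak value it never goes down, so it only tells you whether the high-water mark keeps rising while the work per iteration stays the same.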
Hi,
I'll let the MKL team know about this memory leak issue. Much appreciated.
_Kittur
Hi velvia,
We will investigate the problem. Which icc and MKL versions (Parallel Studio XE for Mac OS) are you using?
Intel MKL Support
fayard@Speed:Fast$ icc --version
icc (ICC) 15.0.2 20150121
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
I use the MKL that ships with this version of the compiler.
Hi velvia,
We tried running the code on Linux and Windows, but we don't see any memory leak if mkl_free_buffers() is called. We ran our own test, which calculates the size of the allocated memory. On Windows, I also tried Intel Amplifier XE and saw that it reports a memory leak in the STL vector (auto values = std::vector<double>();).
Could you rerun your test with an explicit release of the vector memory and a call to mkl_free_buffers()?
For example, add
std::vector<double>().swap(values);
mkl_free_buffers();
return 0;
at the end of main to verify your result. If the result is still positive, I will move on to a Mac OS based machine.
Best Regards,
Ying
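For readers unfamiliar with the swap idiom suggested above, here is a small standalone sketch of what it does. The name release_storage is made up for illustration, and the example uses only the standard library, not MKL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gives the heap buffer of `v` back to the allocator by swapping `v` with an
// empty temporary. The temporary takes over the old buffer and frees it when
// it is destroyed at the end of the statement.
// Hypothetical helper for illustration.
void release_storage(std::vector<double>& v) {
  std::vector<double>().swap(v);
  assert(v.empty());
}
```

Calling release_storage(values) leaves values empty with its old buffer freed. Note that a std::vector also frees its buffer in its destructor, so for vectors that are local to main the swap mainly moves the release ahead of the return, which can help a leak checker tell vector storage apart from memory held inside the library.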
Hi Velvia,
1) No, I haven't seen the memory usage increase.
I ran the small code (your original code, without mkl_free_buffers) on a Mac OS machine. I have attached two screenshots (I=400, I= ) for your reference.
1.1. Are you calling the code in a parallel region? In any case, you can add mkl_free_buffers() after the function call and see if there is any change.
2) I'm not sure, but Inspector XE shows a memory leak at line 33:
row.push_back(1);
for(std::size_t j = 0; j < n; ++j) {
  column.push_back(j + 1);
  values.push_back(1.0);  // line 33
  column.push_back(j + n + 1);
  values.push_back(0.1);
  row.push_back(column.size() + 1);
}
I have attached the screenshot for your reference.
Regards,
Ying