I'm developing an asynchronous Windows application and have noticed a strange loss of system memory. My application internally tracks memory usage, and when not using OpenCL at all, its figure matches what the system reports through taskmgr. Curiously, the size of the leak depends on which OpenCL version and device I use. Summarizing what taskmgr reports:
No OpenCL (vanilla C code): ~8 MB
OpenCL 2.0 Experimental CPU: ~1.2 GB
OpenCL 1.2 CPU: ~350 MB
OpenCL 1.2 GPU (HD 4600): ~40 MB
I've checked that events are being released in a timely manner (props to INDE), and seeing this much variability in the leak across OpenCL implementations and device types makes me think my application isn't solely behind the leak.
Any suggestions on what might be causing it aside from a bug in the OpenCL implementation?
Do you have a reproducer that I can submit to the development teams? I also need to know your OS, processor, RAM, driver version, INDE version and any system details that you can summon.
There are a lot of objects in OpenCL that need to be released at one point or another, including kernels, buffers, images, queues, programs, and contexts.
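One way to check this independently of a debugger is to keep a small ledger of create and release calls per object type. What follows is only a minimal sketch in plain C, assuming the application can wrap its own create/release call sites; every name in it (`obj_kind`, `ledger`, `ledger_created`, `ledger_released`, `ledger_report`) is hypothetical, and it does not call OpenCL itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical object categories; extend to match what the application creates. */
typedef enum {
    OBJ_CONTEXT, OBJ_QUEUE, OBJ_PROGRAM, OBJ_KERNEL,
    OBJ_BUFFER, OBJ_IMAGE, OBJ_EVENT, OBJ_KIND_COUNT
} obj_kind;

static const char *const obj_names[OBJ_KIND_COUNT] = {
    "context", "queue", "program", "kernel", "buffer", "image", "event"
};

/* Per-type counters of successful create and release calls. */
typedef struct {
    long created[OBJ_KIND_COUNT];
    long released[OBJ_KIND_COUNT];
} ledger;

void ledger_init(ledger *l)
{
    memset(l, 0, sizeof *l);
}

/* Call after a create call for this object type succeeds. */
void ledger_created(ledger *l, obj_kind k)
{
    assert(k >= 0 && k < OBJ_KIND_COUNT);
    l->created[k]++;
}

/* Call after the matching release call for this object type. */
void ledger_released(ledger *l, obj_kind k)
{
    assert(k >= 0 && k < OBJ_KIND_COUNT);
    l->released[k]++;
}

/* Objects of this type created but not yet released. */
long ledger_live(const ledger *l, obj_kind k)
{
    return l->created[k] - l->released[k];
}

/* Prints every type with a nonzero live count; returns the total live count. */
long ledger_report(const ledger *l, FILE *out)
{
    long total = 0;
    for (int k = 0; k < OBJ_KIND_COUNT; ++k) {
        long live = ledger_live(l, (obj_kind)k);
        if (live != 0)
            fprintf(out, "%-8s created=%ld released=%ld live=%ld\n",
                    obj_names[k], l->created[k], l->released[k], live);
        total += live;
    }
    return total;
}
```

Calling `ledger_report` periodically and again just before shutdown shows whether any object type's live count keeps growing. A flat ledger alongside rising process memory would suggest the growth is happening somewhere the application does not track.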
If you could provide a way for me to reproduce the issue you are seeing, I will go ahead and submit the bug report.
I'm working on putting one together. I was hoping for a magic fix, since a reproducer isn't exactly straightforward for this application...
The only objects not released at run time (they are released at shutdown) are, of course, the command queues, contexts, and kernel objects. Buffers and events are released on the fly as they complete or are no longer needed, and INDE shows this behavior working correctly via the object view in the API debugger. The number of buffers INDE shows as not released matches what my application's internal tracer reports, so both tools appear to agree about how much memory is in use. Task Manager says otherwise, and of course it's a much better indicator of what's actually going on.
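To put the two numbers side by side in the same log, the process can sample its own memory counters and subtract what the tracer accounts for. This is a rough sketch only: `os_private_bytes` and `untracked_bytes` are hypothetical helper names, the Windows branch uses `GetProcessMemoryInfo` from psapi, and the value it returns (private commit) will not necessarily equal the column Task Manager displays, so the trend over time matters more than the absolute figure.

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#include <psapi.h>   /* may require linking against psapi.lib */
#endif

/* Private bytes committed by this process, or 0 when the figure is
 * unavailable (for example, on a non-Windows build of this sketch). */
uint64_t os_private_bytes(void)
{
#ifdef _WIN32
    PROCESS_MEMORY_COUNTERS_EX pmc;
    if (GetProcessMemoryInfo(GetCurrentProcess(),
                             (PROCESS_MEMORY_COUNTERS *)&pmc, sizeof(pmc)))
        return (uint64_t)pmc.PrivateUsage;
#endif
    return 0;
}

/* Bytes the OS attributes to the process beyond what the application's
 * own tracer accounts for; clamped at 0 if the tracer reports more. */
uint64_t untracked_bytes(uint64_t os_bytes, uint64_t tracked_bytes)
{
    return os_bytes > tracked_bytes ? os_bytes - tracked_bytes : 0;
}
```

Logging `untracked_bytes(os_private_bytes(), tracer_total)` every few hundred iterations, where `tracer_total` stands in for whatever the application's tracer reports, would show whether the gap grows steadily with the number of enqueued commands or jumps once at context creation.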
Hopefully I'll have something for you within the next day or two.
I ran the reproducer on the latest version of the driver (May 22nd, 2015) with 17000 iterations instead of the original 1700, and the code runs stably. I don't see an increase in memory, so I don't think there is a memory leak anymore. Please let me know if you think otherwise.