Hello, I've been using the JIT API on Windows, and the result files generated (as stored on disk under the VTune project's directory) are about 300 MB in size. However, after data collection, analyzing these result files causes VTune to use in excess of 12 GB of memory. I eventually have to kill VTune because paging makes the machine unusable. As an experiment, I once let the machine go ahead and thrash the paging file. VTune still wasn't done analyzing the results even at 30 GB of memory usage.
This issue does not happen with every data collection run. In practice, what seems to work is to limit collection so the application only runs for a limited time. But even this approach is not guaranteed to produce results that can be opened with a reasonable amount of memory. The behavior is somewhat unpredictable. Sometimes runs on the order of 5 minutes can be examined OK, while at other times runs of about 30 seconds cannot be analyzed. On one occasion, VTune also complained that the system was out of virtual memory, but the task manager showed that was not even close to being the case.
Googling for answers to this issue is difficult because VTune is used to measure application memory usage, so search keywords clash badly. I did not find interesting hits when searching this forum for topics related to the JIT API. I also scanned recent forum topics and sticky announcements. I'd like to believe this behavior is somehow connected to the JIT API only (i.e., that the problem is specific to something I am doing).
I also examined the result files and saw that they are not compressed. Why would VTune need to allocate roughly two orders of magnitude more memory than the size of the uncompressed result files it's trying to load (~300 MB => 30 GB+)? Does this sound like a known issue? Am I just using the JIT API wrong? At one point I suspected this could be related to not invalidating JITted code; however, VTune's documentation says method invalidation happens automatically when notifying VTune of a new method that overlaps a previously JITted method. What else could I do to diagnose the problem?
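One way to check the overlap theory on the JIT side, without VTune in the loop: a minimal sketch, assuming the JIT can log the (start address, size) of every method it reports, in the order it reports them. The function name and the log format are hypothetical, and nothing here calls VTune. It only replays the log and counts how many reports overlap a range that was still live, which is the situation the documentation describes as automatic invalidation.

```python
def count_overlapping_loads(loads):
    """Replay JIT method-load reports and count implicit invalidations.

    loads: iterable of (start_address, size) pairs, in reporting order
           (hypothetical log format written by the JIT itself).
    Returns (total_reports, reports_that_overlapped_live_code, live_ranges_at_end).
    """
    live = []          # half-open [start, end) ranges not yet overlapped
    total = 0
    overlapping = 0
    for start, size in loads:
        end = start + size
        total += 1
        # Keep only ranges that do not intersect the new one.
        survivors = [(s, e) for (s, e) in live if e <= start or s >= end]
        if len(survivors) != len(live):
            overlapping += 1
        survivors.append((start, end))
        live = survivors
    return total, overlapping, len(live)


# Example: the third report reuses memory covered by the first two.
reports = [(0x1000, 0x100), (0x1100, 0x80), (0x1080, 0x100), (0x2000, 0x10)]
print(count_overlapping_loads(reports))
```

The scan is quadratic in the number of live ranges, which is fine for a one-off check but would need an interval structure for very large logs. If almost no reports overlap while the number of live ranges keeps growing, that at least suggests the volume of distinct code regions, rather than missed invalidation, is what to look at next.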
Yes, it could be directly related to *how* you are using the API. The best thing is for you to submit an issue on Intel® Premier Support and let us work with you via our free, secure support mechanism to analyze what you are doing with the API.
Thanks, after some experiments I am still seeing seemingly excessive memory usage and hangs. Unfortunately, Firefox reports https://premier.intel.com/ is using a certificate that expired on June 25. Is this actually the case?
Which version of VTune are you using? Some of the problems related to huge memory consumption during JIT result finalization have been fixed, so it makes sense to check with the latest available version of VTune. Alternatively, you can share your results with me so we can check them on our side.
Hello... at the time, the measurements and experiments were done with VTune 2016 update 4 build 470476. In which version were the memory usage issues with JIT result finalization fixed? Do the relevant release notes mention these improvements?
I recently tried the same test case with VTune 2017, and I am running into the same problems. I'd love any indication of what I might be doing wrong; it's really hard to figure out what to do without feedback.
I opened a ticket on this issue last year. The support engineer agreed there was something strange going on after examining the uploaded result files. I haven't heard anything further from support for over six months.
Hi Andres, did you open the ticket on Intel Premier Support? We have since moved to the Online Service Center (supporttickets.intel.com/). You may submit tickets there, and our engineers will support you in a timely manner regarding your issues. Thanks.
Yes, I opened the ticket on Intel Premier Support about a year ago. It seems the ticket from Premier Support was transferred over to the new system; however, I haven't seen any activity on it since February of this year.
By the way, the memory pressure and CPU analysis issue got some relief on my end because the generated code shape changed and now emits far fewer polymorphic inline caches. Note that I wasn't expecting this change to improve the VTune experience. If this could be improved further, that would be ideal. What do you suggest I do? Should I dump the old ticket and start a new one?
Sometimes I think having some feedback from VTune itself along the lines of "you know, the analysis is very slow because the number of code units is very large", or some way to check that my usage of the JIT profiling API is as intended, would be useful. Is there a debug log VTune can dump so I can see what's going on a bit better? What available information could I have missed?
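In the absence of that kind of feedback from VTune, a rough self-check is possible on the JIT side. The following is a minimal sketch, assuming the JIT logs a (name, size in bytes) record for every code unit it reports and that names carry a kind prefix before a colon, such as "pic:" for polymorphic inline caches. Both the log format and the naming scheme are hypothetical, not something VTune or the JIT API defines. It tallies how many units of each kind were reported, so a flood of tiny stubs stands out.

```python
from collections import Counter


def summarize_code_units(records, top=5):
    """Group reported code units by kind and count them.

    records: iterable of (name, size_in_bytes) pairs; the kind is taken to be
             the text before the first ':' in the name (hypothetical scheme).
    Returns up to `top` tuples of (kind, unit_count, total_bytes),
    most numerous kind first.
    """
    counts = Counter()
    bytes_by_kind = Counter()
    for name, size in records:
        kind = name.split(":", 1)[0]
        counts[kind] += 1
        bytes_by_kind[kind] += size
    return [(kind, n, bytes_by_kind[kind]) for kind, n in counts.most_common(top)]


# Example with made-up records.
log = [("pic:foo", 32), ("pic:bar", 48), ("method:main", 400)]
for kind, n, total in summarize_code_units(log):
    print(f"{kind}: {n} units, {total} bytes")
```

Comparing this summary between a run that opens fine and one that does not would show whether the number of reported code units, rather than the run length, tracks the analysis cost.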