Hello,
I'm working on a C project that generates code dynamically at startup. Environment: Windows 11, VTune 2024.2.
We use jitprofiling.h and jitprofiling.lib to report dynamic code information once JIT compilation has finished. This works well when VTune launches the application. However, when we launch the application separately and attach VTune by process name or PID, VTune does not display any dynamic code information (information for other code, such as the C code, is still shown).
Some troubleshooting guides mention INTEL_LIBITTNOTIFY64 and INTEL_JIT_PROFILER64 (note that we do not use libittnotify in the C code, and additionally linking against libittnotify.lib does not help either). Here is my setup:
INTEL_LIBITTNOTIFY32=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin32\runtime\ittnotify_collector.dll
INTEL_LIBITTNOTIFY64=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin64\runtime\ittnotify_collector.dll
INTEL_JIT_PROFILER32=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin32\runtime\ittnotify_collector.dll
INTEL_JIT_PROFILER64=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin64\runtime\ittnotify_collector.dll
With these environment variables set (especially INTEL_JIT_PROFILER64), iJIT_IsProfilingActive returns iJIT_SAMPLING_ON as expected after attaching to the process. However,
iJIT_NotifyEvent(iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED, (void*)&data);
does not return 0 (the value it returns in launch mode) but a value that looks like an address. VTune also shows no dynamic code information: for example, in Hotspots mode, the Bottom-up view lists no functions for the generated code, which probably ends up in [Outside any known module].
This is very confusing and greatly limits what we can do with VTune, as most of the runtime code is dynamically generated. Some team members say it used to work; I don't know when, but the VTune support code was added to the project in 2023.
Am I missing something, or is this a VTune bug? I can provide more information if needed.
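For readers following along, the notification call above usually sits in a small helper like the one below. This is only a minimal sketch, not the project's actual code: `fill_method_load` and `report_jit_function` are hypothetical helper names, and the stand-in declarations under `#ifndef HAVE_JITPROFILING` are placeholders that exist solely so the block compiles without the VTune SDK. A real build would include jitprofiling.h and link jitprofiling.lib.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Use the real header when the compiler can find it. */
#if defined(__has_include)
#  if __has_include("jitprofiling.h")
#    include "jitprofiling.h"
#    define HAVE_JITPROFILING 1
#  endif
#endif

#ifndef HAVE_JITPROFILING
/* ASSUMPTION: placeholder stand-ins that reuse the names from jitprofiling.h
 * but do nothing. Enum values and the reduced struct are illustrative only. */
typedef enum { iJIT_NOTHING_RUNNING, iJIT_SAMPLING_ON } iJIT_IsProfilingActiveFlags;
typedef enum { iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED } iJIT_JVM_EVENT;
typedef struct {
    unsigned int method_id;
    char *method_name;
    void *method_load_address;
    unsigned int method_size;
} iJIT_Method_Load;
static unsigned int iJIT_GetNewMethodID(void) { static unsigned int next = 1000; return next++; }
static iJIT_IsProfilingActiveFlags iJIT_IsProfilingActive(void) { return iJIT_NOTHING_RUNNING; }
static int iJIT_NotifyEvent(iJIT_JVM_EVENT ev, void *data) { (void)ev; (void)data; return 0; }
#endif

/* Hypothetical helper: describe one generated function.
 * The caller must keep `name` alive for as long as `m` is in use. */
void fill_method_load(iJIT_Method_Load *m, const char *name,
                      void *code, unsigned int size)
{
    assert(m != NULL && name != NULL && code != NULL);
    memset(m, 0, sizeof *m);             /* unused fields stay zero */
    m->method_id = iJIT_GetNewMethodID();
    m->method_name = (char *)name;       /* field is a non-const char * */
    m->method_load_address = code;
    m->method_size = size;
}

/* Hypothetical helper: report the function and log both API results,
 * so launch-mode and attach-mode runs can be compared side by side. */
int report_jit_function(const char *name, void *code, unsigned int size)
{
    iJIT_Method_Load m;
    fill_method_load(&m, name, code, size);
    int active = (int)iJIT_IsProfilingActive();
    int rc = iJIT_NotifyEvent(iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED, &m);
    fprintf(stderr, "jit: %s at %p (%u bytes): active=%d, notify=%d\n",
            name, code, size, active, rc);
    return rc;
}
```

Calling `report_jit_function("jit_add", code_ptr, code_size)` once per generated function, right after the code has been written, gives one log line per function that can be checked in both launch and attach runs.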
For me, iJIT_IsProfilingActive also returns the correct value when INTEL_JIT_PROFILER64 is set and VTune is not attached. However, if VTune is attached before iJIT_IsProfilingActive is called, it returns iJIT_NOTHING_RUNNING.
Attaching VTune after calling iJIT_IsProfilingActive does not work either (which is the original topic of this thread).
Yes, the JIT code executes only after step 2.5.
Did you set INTEL_JIT_PROFILER64 before running the application/process?
If I don't set INTEL_JIT_PROFILER64 before the process runs, iJIT_IsProfilingActive returns iJIT_NOTHING_RUNNING, even if VTune is launched.
If I set INTEL_JIT_PROFILER64 before the process runs, iJIT_IsProfilingActive returns iJIT_SAMPLING_ON regardless of whether VTune is launched.
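Because the result depends on the variable being present in the process environment at startup, it can help to check it from inside the application. The following is a minimal sketch using only the C standard library; `check_collector_env` and the `COLLECTOR_ENV_*` names are hypothetical and not part of any VTune API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
typedef enum {
    COLLECTOR_ENV_UNSET,     /* variable missing or empty            */
    COLLECTOR_ENV_BAD_PATH,  /* variable set, file cannot be opened  */
    COLLECTOR_ENV_OK         /* variable set, file can be opened     */
} collector_env_status;

/* Check that `var_name` (e.g. "INTEL_JIT_PROFILER64") is set in this
 * process and points to a file that can be opened for reading. */
collector_env_status check_collector_env(const char *var_name)
{
    assert(var_name != NULL);

    const char *path = getenv(var_name);
    if (path == NULL || path[0] == '\0')
        return COLLECTOR_ENV_UNSET;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return COLLECTOR_ENV_BAD_PATH;

    fclose(f);
    return COLLECTOR_ENV_OK;
}
```

Calling `check_collector_env("INTEL_JIT_PROFILER64")` at startup and logging the result makes it easy to tell a missing variable apart from a wrong collector path.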
Yes, I set INTEL_JIT_PROFILER64 (step 1) before running the process.
I see the same behavior as you if I don't set INTEL_JIT_PROFILER64 before the process runs.
If I set INTEL_JIT_PROFILER64 before the process runs (as in all my previous tests), and of course before running any generated code, there are different situations:
- if VTune is not attached: iJIT_IsProfilingActive returns iJIT_SAMPLING_ON, iJIT_NotifyEvent returns 1
- if VTune is attached: iJIT_IsProfilingActive returns iJIT_NOTHING_RUNNING, iJIT_NotifyEvent returns 0
- if I call iJIT_IsProfilingActive once (it returns iJIT_SAMPLING_ON), then attach VTune, then call iJIT_IsProfilingActive / iJIT_NotifyEvent again: iJIT_IsProfilingActive returns iJIT_SAMPLING_ON, iJIT_NotifyEvent returns 0
However, VTune does not detect the generated code in any of the above situations.
Thank you for the example. But in profile.sh I can only see that you launch ./memfunc with VTune rather than attaching to it (i.e. without -target-process / -target-pid)?
You can change the command line to the following, and it also works. I just verified it and observed the same result.
$ vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=true --target-pid xxxxx
Test steps:
1. Set INTEL_JIT_PROFILER64 and run the memfunc binary in one console window.
2. Run the VTune command line above in another window.
3. Press Enter in the first console window; the binary will continue.
4. The second window will show the result, and you can see the [Dynamic code] information below.
Top Hotspots
Function       Module          CPU Time  % of CPU Time(%)
-------------  --------------  --------  ----------------
rand           libc.so.6        86.458s             94.5%
main           memfunc           2.219s              2.4%
func@0x402070  memfunc           1.779s              1.9%
JITed          [Dynamic code]    0.516s              0.6%
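The test steps above rely on the binary pausing until Enter is pressed, so that VTune can attach before the generated code runs. A minimal sketch of such a pause is shown here; `wait_for_attach` is a hypothetical helper and is not taken from the memfunc example.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#  include <process.h>
#  define current_pid() ((long)_getpid())
#else
#  include <unistd.h>
#  define current_pid() ((long)getpid())
#endif

/* Print the PID to `out`, then block until a full line arrives on `in`.
 * Returns 1 once a newline was read, 0 on end of file or read error. */
int wait_for_attach(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    fprintf(out, "PID %ld: attach the profiler now, then press Enter\n",
            current_pid());
    fflush(out);  /* make sure the prompt is visible before blocking */

    int c;
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n')
            return 1;
    }
    return 0;
}
```

Calling `wait_for_attach(stdin, stdout)` before the first generated function is registered or executed gives time to start the `--target-pid` collection in a second console with the printed PID.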
Thank you very much for your help! I just found out that I was using user-mode sampling (which does not work); when I configure the collection as you did, i.e. hardware event-based sampling with stack collection, it works and I can now see my dynamic functions!
Since launching from VTune seems to work in user mode, I didn't realize that the sampling mode could make a difference (is that a bug, or is it intended?).
Both the sw and hw sampling modes should work for this case; as you can see in my test result below, I get dynamic code information with sw sampling as well.
Check whether your dynamic code runs long enough; otherwise, VTune may not collect any data for it.
Console window 1:
yuzhang3@yuzhang3-10710:~/workspace/isvc_jira_ips/jit$ ./memfunc
Any key press here
64bit linux
JIT profiling
Console window 2:
yuzhang3@yuzhang3-10710:~/workspace/isvc_jira_ips/jit$ vtune -collect hotspots -knob sampling-mode=sw -knob enable-stack-collection=true --target-pid 2105436
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/yuzhang3/workspace/isvc_jira_ips/jit/r006hs -command stop.
vtune: Collection detached.
vtune: Collection stopped.
vtune: Using result path `/home/yuzhang3/workspace/isvc_jira_ips/jit/r006hs'
vtune: Executing actions 75 % Generating a report
Elapsed Time: 96.031s
    CPU Time: 91.440s
        Effective Time: 91.440s
        Spin Time: 0s
        Overhead Time: 0s
    Total Thread Count: 1
    Paused Time: 0s
Top Hotspots
Function       Module          CPU Time  % of CPU Time(%)
-------------  --------------  --------  ----------------
rand           libc.so.6        86.352s             94.4%
main           memfunc           2.472s              2.7%
func@0x402070  memfunc           1.576s              1.7%
JITed          [Dynamic code]    0.680s              0.7%
A::A           memfunc           0.360s              0.4%
I'm on Windows and don't yet have an environment to test your example, but from your console output it seems to work fine. I don't know what the problem is on my side; I only know that my dynamic code takes enough time to be displayed with launch + user-mode sampling, or with attach + hardware-based sampling (the first dynamic functions appear around 20th-30th in the list by CPU time). Anyway, I think using hw sampling is a good enough solution for me for now.
Sure. I am glad this solution is helpful to you.