
VTune JIT profiling shows no information when attaching to a process, even with INTEL_LIBITTNOTIFY64 set

ymao
Novice

Hello,

I'm working on a C project that generates code dynamically at startup. Environment: Windows 11, VTune 2024.2.

We use jitprofiling.h and jitprofiling.lib to report dynamic code information once JIT compilation has finished. This works well when VTune launches the application. However, when we launch the application separately and attach VTune by process name or PID, VTune does not display any dynamic code information (information for other code, such as the C code, is still displayed correctly).

Some troubleshooting guides mention INTEL_LIBITTNOTIFY64 and INTEL_JIT_PROFILER64 (note that we do not use libittnotify in the C code, and additionally linking against libittnotify.lib does not help either). Here is my setup:

INTEL_LIBITTNOTIFY32=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin32\runtime\ittnotify_collector.dll
INTEL_LIBITTNOTIFY64=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin64\runtime\ittnotify_collector.dll
INTEL_JIT_PROFILER32=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin32\runtime\ittnotify_collector.dll
INTEL_JIT_PROFILER64=C:\Program Files (x86)\Intel\oneAPI\vtune\2024.2\bin64\runtime\ittnotify_collector.dll
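
A quick way to confirm that the running process actually sees these variables is to query them at startup. The following is a minimal sketch using only the C standard library. The helper names (classify_collector_path, report_collector_variable) are hypothetical and not part of jitprofiling.h, and the check only verifies that the path can be opened for reading, not that it is a valid collector library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not part of jitprofiling.h. */
typedef enum {
    JIT_ENV_UNSET,      /* variable is not defined in this process */
    JIT_ENV_EMPTY,      /* variable is defined but empty */
    JIT_ENV_UNREADABLE, /* variable is set, but the path cannot be opened */
    JIT_ENV_OK          /* variable is set and the path can be opened */
} jit_env_status;

/* Classify the value of a variable such as INTEL_JIT_PROFILER64. */
static jit_env_status classify_collector_path(const char *value)
{
    FILE *f;
    if (value == NULL) return JIT_ENV_UNSET;
    if (value[0] == '\0') return JIT_ENV_EMPTY;
    f = fopen(value, "rb");
    if (f == NULL) return JIT_ENV_UNREADABLE;
    fclose(f);
    return JIT_ENV_OK;
}

/* Human-readable text for a status code. */
static const char *jit_env_status_name(jit_env_status s)
{
    switch (s) {
    case JIT_ENV_UNSET:      return "not set";
    case JIT_ENV_EMPTY:      return "set but empty";
    case JIT_ENV_UNREADABLE: return "set, but path cannot be opened";
    case JIT_ENV_OK:         return "ok";
    }
    return "unknown";
}

/* Print how this process sees one variable and return its status. */
static jit_env_status report_collector_variable(const char *name, FILE *out)
{
    jit_env_status s;
    assert(name != NULL && out != NULL);
    s = classify_collector_path(getenv(name));
    fprintf(out, "%s: %s\n", name, jit_env_status_name(s));
    return s;
}
```

Calling report_collector_variable("INTEL_JIT_PROFILER64", stderr) early in main shows what the process itself inherited, which can differ from what is set in another console or set after the process started.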

 

 

With these environment variables set (especially INTEL_JIT_PROFILER64), iJIT_IsProfilingActive returns iJIT_SAMPLING_ON as expected when VTune attaches to the process. However,

iJIT_NotifyEvent(iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED, (void*)&data);

does not return 0 (the value it returns in launch mode) but a value that looks like an address. VTune also shows no dynamic code information (e.g., in the Hotspots analysis, Bottom-up view, no functions are listed for the generated code; the samples are probably attributed to [Outside any known module]).

This is very confusing and greatly limits what we can do with VTune, as most of the code executed at runtime is dynamically generated. Some team members say it worked before; I don't know when, but the VTune support code was added to the project in 2023.

Am I missing something, or is this a VTune bug? I can provide more information if needed.

30 Replies
ymao
Novice

For me, iJIT_IsProfilingActive also returns the correct value when INTEL_JIT_PROFILER64 is set and VTune is not attached. However, if VTune is attached before the call, iJIT_IsProfilingActive returns iJIT_NOTHING_RUNNING.

Attaching VTune after calling iJIT_IsProfilingActive does not work either (which is the original topic of this thread).

Yes, the JIT code executes only after step 2.5.

yuzhang3_intel
Moderator

Did you set INTEL_JIT_PROFILER64 before running the application/process?

If I don't set INTEL_JIT_PROFILER64 before the process starts, iJIT_IsProfilingActive returns iJIT_NOTHING_RUNNING, even if VTune is launched.

If I set INTEL_JIT_PROFILER64 before the process starts, iJIT_IsProfilingActive returns iJIT_SAMPLING_ON regardless of whether VTune is launched.

ymao
Novice

Yes, I set INTEL_JIT_PROFILER64 (step 1) before running the process.

I see the same behavior as you if I don't set INTEL_JIT_PROFILER64 before the process runs.

If I set INTEL_JIT_PROFILER64 before the process runs (as in all my previous tests), and of course before running any generated code, there are several different situations:

- if VTune is not attached: iJIT_IsProfilingActive returns iJIT_SAMPLING_ON, iJIT_NotifyEvent returns 1

- if VTune is attached: iJIT_IsProfilingActive returns iJIT_NOTHING_RUNNING, iJIT_NotifyEvent returns 0

- if I call iJIT_IsProfilingActive once (it returns iJIT_SAMPLING_ON), then attach VTune, then call iJIT_IsProfilingActive and iJIT_NotifyEvent again: iJIT_IsProfilingActive returns iJIT_SAMPLING_ON, iJIT_NotifyEvent returns 0

However, VTune does not detect the generated code in any of the above situations.

yuzhang3_intel
Moderator

I have provided a simple JIT profiling code sample. Please feel free to refer to it.

Just a reminder: your dynamic code needs to run long enough; otherwise, VTune can't collect data for it.
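
To get a feel for what "long enough" means, here is a rough back-of-the-envelope sketch. It assumes a sampling profiler takes roughly one sample per sampling interval of CPU time, so code that consumes much less CPU time than one interval is unlikely to be sampled at all. The function names are hypothetical, and the interval you pass in is an assumption you should replace with the interval configured for your own analysis.

```c
#include <assert.h>

/* Rough estimate of how many samples land in a piece of code.
 * Assumption for this sketch: about one sample per interval of CPU time. */
static long expected_samples(long cpu_time_us, long interval_us)
{
    assert(cpu_time_us >= 0 && interval_us > 0);
    return cpu_time_us / interval_us;
}

/* CPU time (in microseconds) the code must consume so that at least
 * min_samples samples are expected under the same assumption. */
static long min_cpu_time_us(long min_samples, long interval_us)
{
    assert(min_samples >= 0 && interval_us > 0);
    return min_samples * interval_us;
}
```

For example, with an assumed 10 ms (10000 us) interval, generated code that consumes 5 ms of CPU time in total gives expected_samples(5000, 10000) == 0, while code that consumes about half a second gives roughly 50 samples.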

 

 

ymao
Novice

Thank you for the example. But in profile.sh I can only see that you launch ./memfunc with VTune rather than attaching to it (i.e., not with -target-process / -target-pid)?

yuzhang3_intel
Moderator

You can change the command line to the following; it also works. I just verified that the same result is observed.

$ vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=true --target-pid xxxxx

 

 

Test steps:

1. Set INTEL_JIT_PROFILER64 and run the memfunc binary in one console window.

2. Run the VTune command line above in another window.

3. Press Enter in the first console window so that the binary continues.

4. The second window shows the result, and you can see the [Dynamic code] information below.
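
The "wait for Enter" pause in steps 1 and 3 can be sketched as follows. This is a minimal illustration, not the code of the memfunc sample: the helper names (current_pid, format_attach_command, wait_for_attach) are hypothetical, and the command string simply reuses the options shown above with the PID of the current process filled in.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
static unsigned long current_pid(void) { return (unsigned long)GetCurrentProcessId(); }
#else
#include <unistd.h>
static unsigned long current_pid(void) { return (unsigned long)getpid(); }
#endif

/* Write the attach command from this thread for the given PID into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_attach_command(char *buf, size_t size, unsigned long pid)
{
    int n;
    assert(buf != NULL && size > 0);
    n = snprintf(buf, size,
                 "vtune -collect hotspots -knob sampling-mode=hw "
                 "-knob enable-stack-collection=true --target-pid %lu", pid);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Print the attach command, then block until Enter is read from `in`.
 * Returns 0 once a newline was read, -1 on end of input or formatting error. */
static int wait_for_attach(FILE *in, FILE *out)
{
    char cmd[256];
    int c;
    if (format_attach_command(cmd, sizeof cmd, current_pid()) < 0) return -1;
    fprintf(out, "Attach with:\n  %s\nthen press Enter to continue.\n", cmd);
    fflush(out);
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n') return 0;
    }
    return -1;
}
```

Calling wait_for_attach(stdin, stdout) just before the first generated function runs leaves time to start the collection from a second console.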

 

Top Hotspots
Function       Module          CPU Time  % of CPU Time(%)
-------------  --------------  --------  ----------------
rand           libc.so.6        86.458s             94.5%
main           memfunc           2.219s              2.4%
func@0x402070  memfunc           1.779s              1.9%
JITed          [Dynamic code]    0.516s              0.6%

ymao
Novice

Thank you very much for your help! I just found out that I was using user-mode sampling (which does not work in my case). When I configure the analysis as you did, i.e., hardware event-based sampling with stack collection, it works and I can now see my dynamic functions!

Since launching from VTune works with user-mode sampling, I didn't realize that the sampling mode could make a difference (is that a bug, or is it intended?).

yuzhang3_intel
Moderator

Both sw and hw sampling modes should work in this case. As you can see from my test result below, I also get dynamic code information with sw sampling.

Check whether your dynamic code runs long enough; otherwise, VTune may not collect data for it.

 

Console window 1:

yuzhang3@yuzhang3-10710:~/workspace/isvc_jira_ips/jit$ ./memfunc
Any key press here

64bit linux
JIT profiling

 

Console window 2:

yuzhang3@yuzhang3-10710:~/workspace/isvc_jira_ips/jit$ vtune -collect hotspots -knob sampling-mode=sw -knob enable-stack-collection=true --target-pid 2105436
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/yuzhang3/workspace/isvc_jira_ips/jit/r006hs -command stop.
vtune: Collection detached.
vtune: Collection stopped.
vtune: Using result path `/home/yuzhang3/workspace/isvc_jira_ips/jit/r006hs'
vtune: Executing actions 75 % Generating a report
Elapsed Time: 96.031s
CPU Time: 91.440s
Effective Time: 91.440s
Spin Time: 0s
Overhead Time: 0s
Total Thread Count: 1
Paused Time: 0s

Top Hotspots
Function       Module          CPU Time  % of CPU Time(%)
-------------  --------------  --------  ----------------
rand           libc.so.6        86.352s             94.4%
main           memfunc           2.472s              2.7%
func@0x402070  memfunc           1.576s              1.7%
JITed          [Dynamic code]    0.680s              0.7%
A::A           memfunc           0.360s              0.4%

ymao
Novice

I'm on Windows and don't yet have an environment to test your example, but judging from your console output it works fine. I don't know what the problem is on my side. I only know that my dynamic code takes enough time to be displayed with launch + user-mode sampling, or with attach + hardware event-based sampling (the first dynamic functions appear around 20th-30th in the list sorted by CPU time). Anyway, I think using hw sampling is a good enough solution for me for now.

yuzhang3_intel
Moderator

Sure.  I am glad this solution is helpful to you.

 
