Dear Intel VTune Support Team,
I am learning to use vtune_profiler_2020.0.0.605129 on Arch Linux (kernel 5.3.13), and the CPU-based analyses work on my machine.
However, I have not managed to run per-program GPU analyses (system-wide GPU profiling seems to work).
For example, when I issue the following command to profile the program glxspheres64:
TPSS_DEBUG=1 /opt/intel/vtune_profiler_2020.0.0.605129/bin64/vtune -collect graphics-rendering -app-working-dir /usr/bin -- /usr/bin/env MESA_GLSL_CACHE_DISABLE=true /usr/bin/glxspheres64
I get the following output:
log4cplus:ERROR Unable to open file: ./tpss-2020.03.14-10h16m40s.405792.log
vtune: Warning: The option to analyze all processes running on the system is enabled for this analysis type by default.
vtune: Warning: Ftrace 'igfx-preempt' events cannot be collected on this platform.
vtune: Warning: To enable hardware event-base sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/christianl/intel/amplxe/projects/test/r006gr -command stop.
strace: Process 405798 attached
strace: Process 405798 detached
strace: Process 405798 attached
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
vcs/tpss2/tpss/src/tpss/runtime/linux/exe/tpss_deepbind.c:237 tpss_deepbind_notify_on_pthread_loaded: Assertion '((tpss_pthread_key_create_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_key_create)]))->trampoline)) != ((void *)0) && ((tpss_pthread_setspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_setspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_self_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_self)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getattr_np_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getattr_np)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_destroy_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_destroy)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_push_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_push)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_pop_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_pop)]))->trampoline)) != ((void *)0)' failed.
strace: Process 405798 detached
vtune: Collection stopped.
... (output continues) ...
I get similar tpss_deepbind_notify_on_pthread_loaded assertions for other rendering applications.
I am using the Mesa graphics driver:
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) UHD Graphics 620 (Kabylake GT2)
Do you have any advice on how to resolve this issue?
I have attached the resulting analysis file for you.
The same error results without TPSS_DEBUG=1 and without MESA_GLSL_CACHE_DISABLE=true.
regards,
Christian
Hello Jose,
were you able to resolve the problem?
I get the same error message using vtune_profiler_2020.0.0.605129 with CPU hotspots analysis.
Best,
Johannes
Hello Jose,
the same happens with amplxe-2018.0.2.525261 and hotspots analysis, but advanced hotspots works.
This is on: Linux rsys-pc 5.5.16-1-MANJARO #1 SMP PREEMPT Wed Apr 8 10:07:00 UTC 2020 x86_64 GNU/Linux
$ ./amplxe-gui
vcs/tpss2/tpss/src/tpss/runtime/linux/exe/tpss_deepbind.c:237 tpss_deepbind_notify_on_pthread_loaded: Assertion '((tpss_pthread_key_create_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_key_create)]))->trampoline)) != ((void *)0) && ((tpss_pthread_setspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_setspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_self_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_self)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getattr_np_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getattr_np)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_destroy_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_destroy)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_push_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_push)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_pop_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_pop)]))->trampoline)) != ((void *)0)' failed.
Hi all,
the issue occurs because the pthread_self and pthread_attr_destroy symbols have been removed from the libpthread library (moved entirely to libc). It will be fixed in a future release of VTune.
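For anyone who wants to check their own system, here is a minimal Python sketch that lists which of the two symbols named above a shared library does not define. It parses the output of `nm -D --defined-only` (from binutils) rather than using dlsym, since a dlsym lookup on libpthread can also resolve symbols through its libc dependency. The function names and the library paths in the usage line are my own assumptions and may differ on your distribution; the real assertion checks more symbols than these two.

```python
import subprocess
import sys

# The two symbols named in this thread. The assertion in tpss_deepbind.c
# checks more than these; extend the tuple if you want to test the others.
REQUIRED = ("pthread_self", "pthread_attr_destroy")


def exported_symbols(nm_output: str) -> set:
    """Parse `nm -D --defined-only` output into a set of bare symbol names."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank lines or undefined entries without an address
        # Drop a symbol-version suffix such as "@@GLIBC_2.2.5" if present.
        names.add(fields[-1].split("@", 1)[0])
    return names


def missing_symbols(nm_output: str, required=REQUIRED) -> list:
    """Return the required symbols that the library does not define."""
    exported = exported_symbols(nm_output)
    return [name for name in required if name not in exported]


def nm_dynamic(path: str) -> str:
    """Run nm on a shared library and return its text output."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    for lib in sys.argv[1:]:
        print(lib, "missing:", missing_symbols(nm_dynamic(lib)))
```

Usage (paths are examples only): `python check_symbols.py /usr/lib/libpthread.so.0 /usr/lib/libc.so.6`. If libpthread reports both symbols as missing while libc defines them, that matches the explanation above.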
Any updates on when this is going to be fixed? "Future releases" sounds pretty vague. Two releases? A dozen releases? Right now VTune is pretty useless. It just hangs after this assert.
My company is in the process of updating to Ubuntu 20.04 as the next LTS. We are encountering the same issue as listed above and can no longer use VTune to analyze our code.
Can we have an update on timelines? We are currently running vtune_profiler_2020_update1 under our paid license.
Any updates to timelines? Our company has transitioned to Ubuntu 20.04 as the next LTS and we can no longer use VTune to analyze our code using our paid subscription.
I also see this with Intel Advisor (advisor_2020.1.0.605410/) on Ubuntu 20.04.
Hi,
I found that the latest version of glibc containing the pthread_self and pthread_attr_destroy symbols is version 2.30-3 (at least under the arch linux versioning).
This version still builds on a current Arch Linux, and downgrading to it resolved the error, as far as I have tested (at least with the Intel Advisor software).
Maybe this workaround helps someone until it is fixed upstream.
regards,
Christian
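A quick way to see whether a machine is likely to hit the assertion is to compare its glibc version against 2.30. The sketch below assumes, based only on the report above, that 2.30 is the last version that works with the affected VTune/Advisor builds; the threshold and the function names are assumptions of mine, not something confirmed by Intel.

```python
import platform

# Assumption taken from this thread: glibc 2.30 is the last version that
# works with the affected VTune/Advisor builds; newer ones hit the assertion.
LAST_GOOD = (2, 30)


def parse_version(text: str) -> tuple:
    """Turn '2.31' or a packaged version like '2.30-3' into (major, minor)."""
    upstream = text.split("-", 1)[0]  # drop a distro package revision
    major, minor = upstream.split(".")[:2]
    return int(major), int(minor)


def likely_affected(glibc_version: str) -> bool:
    """True if the given glibc version is newer than the assumed last good one."""
    return parse_version(glibc_version) > LAST_GOOD


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version}: likely affected = {likely_affected(version)}")
    else:
        print("Could not determine a glibc version on this system.")
```

This only reports the version; it does not prove that the profiler will fail or work on a given system.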
Is there a timeline on this yet? I am also experiencing this bug on Ubuntu 20.04.
Hi all,
I am experiencing the same problem on Ubuntu 20 with advisor2020 XE.
$ advixe-gui
Gtk-Message: 11:32:35.800: Failed to load module "canberra-gtk-module"
vcs/tpss2/tpss/src/tpss/runtime/linux/exe/tpss_deepbind.c:237 tpss_deepbind_notify_on_pthread_loaded: Assertion '((tpss_pthread_key_create_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_key_create)]))->trampoline)) != ((void *)0) && ((tpss_pthread_setspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_setspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_self_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_self)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getattr_np_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getattr_np)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_destroy_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_destroy)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_push_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_push)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_pop_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_pop)]))->trampoline)) != ((void *)0)' failed.
Are there any updates on this from the Intel Team?
Thank you in advance,
George Bisbas
> I found that the latest version of glibc containing the pthread_self and pthread_attr_destroy symbols is version 2.30-3 (at least under the arch linux versioning).
Hmm... according to the Intel guy the problem is the opposite: pthread_self and pthread_attr_destroy were moved from pthread to libc. So I guess I have to update them as a pair? What versions should I use for both?
I tried to compile glibc 2.30 and run vtune with it, but it just crashed. Besides libc, vtune uses 95 other shared libraries. All of them of course depend on libc. In order to downgrade glibc, I would have to recompile all of them plus their dependencies. This is not feasible. For Intel, on the other hand, it's just a matter of rebuilding VTune, i.e. typing make on the command line and pressing enter.
*Any* ETA or workaround for this?
The fix was included in VTune 2021 Beta Update 7 (it is a part of oneAPI Beta package) and in VTune 2020 Update 2.
If you are unable to download it, you can downgrade your glibc or use HW-based analyses as workarounds.
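As an illustration of the HW-based workaround, here is a small Python sketch that assembles a hotspots command line using hardware event-based sampling via the `sampling-mode=hw` knob. The helper name is hypothetical, the target application is taken from the first post, and hardware sampling may need the sampling driver or suitable perf permissions on your system, so treat this as a starting point rather than a verified recipe.

```python
import shlex


def hw_hotspots_command(app, app_args=(), vtune="vtune"):
    """Build a vtune command line for hotspots with hardware sampling.

    `vtune` is assumed to be on PATH; pass a full path otherwise.
    """
    return [
        vtune,
        "-collect", "hotspots",
        "-knob", "sampling-mode=hw",
        "--", app, *app_args,
    ]


if __name__ == "__main__":
    cmd = hw_hotspots_command("/usr/bin/glxspheres64")
    # Print the command so it can be copied into a shell.
    print(shlex.join(cmd))
```

To launch it directly from Python instead of printing, the list can be passed to `subprocess.run(cmd)`.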
Dear Vladimir, dear Intel,
thanks a lot for that. I hope to check it soon.
Thanks again.
--George
Hi again, I could not find VTune 2020 Update 2 online. Is there a link? Has the release been published?
Best,
George
Hi,
you can download 2020 Update 2 from here: https://software.intel.com/content/www/us/en/develop/tools/vtune-profiler/choose-download.html
and the oneAPI toolkit from here: https://software.intel.com/content/www/us/en/develop/tools/oneapi.html#oneapi-toolkits
To answer your question: yes, Update 2 was recently published. oneAPI Beta Update 07 has been available since the middle of June.