Hey,
vtune just hangs without any output. It does not matter what I want to test.
for example:
vtune -collect hotspots -- hostname
and even
vtune-self-checker.sh
does not provide any output whatsoever. We have a cluster with one login node and 36 compute nodes. Everything works fine on the head node, but nothing works on the compute nodes. The complete oneAPI Base and HPC toolkits are installed on the head node, on a shared filesystem.
Any ideas what to do or how to get some output?
vtune Version: Intel(R) VTune(TM) Profiler 2023.0.0 (build 624757) Command Line Tool
The only way to quit this vtune process is with Ctrl+C. But even then, some processes still seem to be running in the background:
$ ps aux | grep /opt/intel
xy 609973 99.2 0.4 677552 396120 pts/0 Sl 10:50 1:52 /opt/intel/oneapi/vtune/2023.0.0/bin64/amplxe-runss --context-value-list
xy 609986 99.9 0.3 612024 324664 pts/0 Sl 10:50 1:29 /opt/intel/oneapi/vtune/2023.0.0/bin64/amplxe-runss --ui-output-format xml --ui-output-fd 5 --context-value-list
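Leftover collector processes like the two in the ps output above can be stopped by escalating from SIGTERM to SIGKILL. The following is a minimal sketch, assuming a POSIX shell; the helper name stop_pid and the grace period are hypothetical and not part of VTune.

```shell
#!/bin/sh
# stop_pid PID [GRACE_SECONDS]
# Hypothetical helper: ask a process to exit with SIGTERM, wait up to
# GRACE_SECONDS (default 5), then fall back to SIGKILL if it is still there.
stop_pid() {
    pid="$1"
    grace="${2:-5}"
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    i=0
    while [ "$i" -lt "$grace" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        i=$((i + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
    fi
    return 0
}
```

For example, `for pid in $(pgrep -x amplxe-runss); do stop_pid "$pid"; done` would apply it to every process named amplxe-runss, assuming pgrep from procps is installed. A process stuck in uninterruptible sleep may not react even to SIGKILL.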
Hi,
Thank you for posting in Intel Communities.
Could you please share the following details:
1. OS and hardware details
2. If your operating system is Linux, please share the kernel details. You can get them with the following command:
uname -a
3. The exact steps you followed and a sample reproducer, so that we can reproduce the issue on our end
Thanks
Hello,
Our Head Node:
Dell PowerEdge R740
Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
96 GB DDR4
Linux hydra-head 5.10.0-18-amd64 #1 SMP Debian 5.10.140-1 (2022-09-02) x86_64 GNU/Linux
And our compute nodes:
36x Dell PowerEdge C6420
2x Intel(R) Xeon(R) Gold 6130F CPU @ 2.10GHz (Hyper-Threading disabled)
96 GB DDR4
Linux hydra01 5.10.0-18-amd64 #1 SMP Debian 5.10.140-1 (2022-09-02) x86_64 GNU/Linux
Some more information:
We installed Intel oneAPI on our head node under /opt/intel/oneapi. This filesystem is shared with our compute nodes. They only have read-only access; I tried changing that to read-write, but it did not change anything. Do I perhaps have to run local installations on the compute nodes?
There are not many steps to follow:
1. Install Intel VTune with oneAPI (we have everything available installed: hpc_toolkit and base_toolkit)
2. Execute any vtune test on the head node (including vtune-self-checker.sh): works (for example, vtune -collect hotspots -- hostname)
3. Execute any vtune test on a compute node: does not work
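Because the failure mode on the compute nodes is a hang rather than an error, the reproduction steps above are easier to script with a time limit. A minimal sketch, assuming GNU coreutils timeout is available; the helper name probe and the time limits are hypothetical.

```shell
#!/bin/sh
# probe LIMIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run a command with a time limit and print
# OK, HANG, or FAIL(<exit code>) followed by the command line.
probe() {
    limit="$1"
    shift
    # -k 5: send SIGKILL if the command ignores SIGTERM for 5 more seconds.
    timeout -k 5 "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case "$rc" in
        0)       echo "OK: $*" ;;
        124|137) echo "HANG: $*" ;;   # timed out, or had to be killed
        *)       echo "FAIL($rc): $*" ;;
    esac
    return "$rc"
}
```

On a compute node this could be called as `probe 120 vtune -collect hotspots -- hostname` and `probe 300 vtune-self-checker.sh`; the same calls on the head node give a baseline for comparison.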
I found some extra information in the kernel logs:
vtune[78877]: segfault at 3 ip 0000000000000003 sp 00007efdace3e1d8 error 14 in vtune[55d496e00000+50000]
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd9.
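The "error 14" field in a segfault line like the one above can be decoded to see what kind of access faulted. The sketch below assumes that the kernel prints the code in hexadecimal and that the standard x86 page-fault bits apply (0x1 protection violation, 0x2 write, 0x4 user mode, 0x10 instruction fetch); the function names are hypothetical.

```shell
#!/bin/sh
# Filter kernel log lines (read from stdin) for VTune-related segfaults.
crash_lines() {
    grep -E '(vtune|amplxe-[a-z]+)\[[0-9]+\]: segfault'
}

# decode_pf_error HEXCODE
# Translate an x86 page-fault error code (hex, without 0x) into words.
decode_pf_error() {
    code=$((0x$1))
    if [ $((code & 1)) -ne 0 ]; then
        cause="protection violation"
    else
        cause="page not present"
    fi
    if [ $((code & 16)) -ne 0 ]; then
        access="instruction fetch"
    elif [ $((code & 2)) -ne 0 ]; then
        access="write"
    else
        access="read"
    fi
    if [ $((code & 4)) -ne 0 ]; then
        mode="user mode"
    else
        mode="kernel mode"
    fi
    echo "$cause, $access, $mode"
}
```

Under these assumptions, `dmesg | crash_lines` lists the relevant entries, and `decode_pf_error 14` reports a user-mode instruction fetch from a page that is not present. Together with the reported ip of 0x3, that would be consistent with a jump through an invalid pointer.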
Hi,
We are checking on this internally. We will get back to you with an update.
Thanks
Hi,
We are sorry for the delay. Could you please let us know whether the drivers are installed on all the compute nodes?
If not, please install and check whether the issue still persists.
Thanks
The drivers seem to be loaded:
root@hydra01:/opt/intel/oneapi/vtune/latest/sepdk/src# ./insmod-sep -q
pax driver is loaded and owned by group "vtune" with file permissions "660".
socperf3 driver is loaded and owned by group "vtune" with file permissions "660".
sep5 driver is loaded and owned by group "vtune" with file permissions "660".
socwatch driver is loaded and owned by group "vtune" with file permissions "660".
vtsspp driver is loaded and owned by group "vtune" with file permissions "660".
Hi,
Please do the following on the failing node:
rm -rf /tmp/amplxe*
<run the failing scenario>
tar zcvf logs.tgz /tmp/amplxe-log-*
Please share the logs.tgz with us.
If you face any issue, please let us know.
Thanks
Hello,
I have attached the requested logs.
Hi,
We have not heard back from you. Could you please share the logs.tgz which we mentioned in the last response.
Thanks
Hi,
Thank you for sharing the log file. Unfortunately, there is not much in the log files. It looks like amplxe-runss is having serious trouble on this system, but we will need extended logging and/or a crash dump to figure out where it is stuck. We can start by setting AMPLXE_LOG_LEVEL=trace in the environment and launching
amplxe-runss -help
on the node for a very basic smoke test. If this does not hang, please proceed with
amplxe-runss -context-value-list
and share the logs again.
Thanks.
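The smoke test described above can be wrapped so that trace logging is enabled for one command only and a hanging command is stopped after a time limit. A minimal sketch, assuming GNU coreutils timeout; the helper name smoke and the 60-second limit are hypothetical, while AMPLXE_LOG_LEVEL=trace and the two amplxe-runss invocations come from the reply above.

```shell
#!/bin/sh
# smoke LIMIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run one command with AMPLXE_LOG_LEVEL=trace set only
# for that command, and stop it if it exceeds the time limit.
smoke() {
    limit="$1"
    shift
    # -k 5: send SIGKILL if the command ignores SIGTERM for 5 more seconds.
    AMPLXE_LOG_LEVEL=trace timeout -k 5 "$limit" "$@"
}

# Suggested sequence on the failing node (left commented out so that
# sourcing this file has no side effects):
#   smoke 60 amplxe-runss -help && smoke 60 amplxe-runss -context-value-list
#   tar zcvf logs.tgz /tmp/amplxe-log-*
```

An exit status of 124 or 137 from smoke would indicate that the command had to be stopped by the time limit rather than finishing on its own.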
Hi,
We have not heard back from you. Could you please share the log file which we mentioned in the last response.
Thanks
Here are the new logfiles:
BTW: "amplxe-runss -context-value-list" hangs and cannot be canceled with Ctrl+C.
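When a process does not react to Ctrl+C, its scheduler state and signal masks can hint at why. The sketch below reads them from /proc on Linux; the function names are hypothetical, and this is only one possible way to narrow down such a hang, not a procedure suggested in this thread.

```shell
#!/bin/sh
# proc_state PID: print the one-letter state from /proc/PID/status.
proc_state() {
    awk '/^State:/ { print $2; exit }' "/proc/$1/status"
}

# describe_state LETTER: translate the common state letters into words.
describe_state() {
    case "$1" in
        R)   echo "running" ;;
        S)   echo "interruptible sleep" ;;
        D)   echo "uninterruptible sleep" ;;
        T|t) echo "stopped" ;;
        Z)   echo "zombie" ;;
        *)   echo "other ($1)" ;;
    esac
}
```

For a hung collector with process ID PID, `describe_state "$(proc_state PID)"` shows the state, and `grep -E '^Sig(Blk|Ign|Cgt):' /proc/PID/status` shows the blocked, ignored, and caught signal masks. State D means signals are not delivered until the pending kernel call returns; state R at high CPU usage points to a busy loop instead.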
Hi,
We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.
Thanks