hi guys,
I'm on CentOS 7.5 and everything seemingly works except for this system service:
$ sudo journalctl -lf -o cat -u hwloc-dump-hwdata.service
Starting Dump hardware topology and locality information to /var/run/hwloc...
Couldn't find any KNL information.
Dumping KNL SMBIOS Memory-Side Cache information:
hwloc-dump-hwdata.service: main process exited, code=exited, status=1/FAILURE
Failed to start Dump hardware topology and locality information to /var/run/hwloc.
Unit hwloc-dump-hwdata.service entered failed state.
hwloc-dump-hwdata.service failed.
Something broke a while ago, with updates I presume. Would you have any advice on how to rectify this?
many thanks, L.
That service "just works" on my KNL box.
What happens if you run 'hwloc-dump-hwdata' as root? Does it return any KNL info?
If not, then perhaps the DMI BIOS info is not decoded properly. Post the output of
sudo strace /usr/sbin/hwloc-dump-hwdata 2>&1 | grep ^open
which on my box returns:
[...]
open("/proc/self/status", O_RDONLY) = 3
openat(AT_FDCWD, "/sys/devices/system/node", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("/sys/devices/system/node/node0/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node1/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node2/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node3/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node4/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node5/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node6/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node7/meminfo", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/devices/system/cpu", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("/proc/self/status", O_RDONLY) = 3
openat(AT_FDCWD, "///sys/firmware/dmi/entries", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("///sys/firmware/dmi/entries/14-0/raw", O_RDONLY) = 4
open("///sys/firmware/dmi/entries/160-0/raw", O_RDONLY) = 4
open("///sys/firmware/dmi/entries/161-0/raw", O_RDONLY) = 4
open("/var/run/hwloc/knl_memoryside_cache", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 3
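A quicker check than a full strace might be to see whether the DMI entries opened in the trace above exist at all. The sketch below is only an illustration: the function name `check_knl_dmi` is made up here, and the entry names (14-0, 160-0, 161-0) are copied from this one KNL box, so other firmware may number them differently.

```shell
#!/bin/sh
# Sketch: report which of the DMI entries seen in the KNL strace are present.
# check_knl_dmi is a hypothetical helper, not part of hwloc.
# $1 optionally overrides the sysfs directory (default: /sys/firmware/dmi/entries).
check_knl_dmi() {
    dmi_dir="${1:-/sys/firmware/dmi/entries}"
    missing=0
    # Entry names taken from the trace on the working KNL box (an assumption
    # for any other machine).
    for entry in 14-0 160-0 161-0; do
        if [ -e "$dmi_dir/$entry/raw" ]; then
            echo "found: $entry"
        else
            echo "missing: $entry"
            missing=1
        fi
    done
    # Non-zero exit status if any entry is absent.
    return "$missing"
}
```

Calling `check_knl_dmi` with no argument inspects the live system; if entries are reported missing, the tool presumably has no KNL tables to decode on that machine.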
That might be the problem: the BIOS of the SYS-2027GR-TRT2.
Supermicro make really nice hardware, but for reasons I cannot grasp they neglect their older products in terms of security, which should be a MUST-NOT in the days of Spectre/Meltdown. I have been waiting a year for a BIOS with fixes for those problems, even after exchanging emails with their tech support, and I am still waiting.
I see:
open("/proc/self/status", O_RDONLY) = 3
openat(AT_FDCWD, "/sys/devices/system/node", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("/sys/devices/system/node/node0/meminfo", O_RDONLY) = 4
open("/sys/devices/system/node/node1/meminfo", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/devices/system/cpu", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("/proc/self/status", O_RDONLY) = 3
openat(AT_FDCWD, "///sys/firmware/dmi/entries", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
errrrrr, a quick search on SYS-2027GR-TRT2 gave me this link: https://www.supermicro.com/products/system/2u/2027/sys-2027gr-trt2.cfm
which lists the mobo as an LGA2011 mobo, suitable for Xeon E5's only. Thus, there is no Xeon Phi (KNL) on the mobo, so "hwloc-dump-hwdata" has nothing to report.
Seems like (for you) I have to quote myself: "A while ago something broke...", which means it was not a problem some time ago.
A quick (thorough) search shows that it's a dedicated GPU/Xeon Phi system. In mine I have Xeon Phis.
Please post the output of "lspci -v" - my suspicion is that your box has Xeon Phi coprocessors; the command you mention is used on KNL (Xeon Phi x200) processors only.
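If the full "lspci -v" listing is too long to post, a filtered view may be enough to tell coprocessor cards apart from a KNL host. This is a rough sketch; `list_phi_cards` is a hypothetical helper name, and the match strings assume lspci labels the cards as "Co-processor" and "Xeon Phi".

```shell
#!/bin/sh
# Sketch: keep only Xeon Phi coprocessor cards from lspci-style text on stdin.
# list_phi_cards is a hypothetical helper; the patterns are assumptions about
# how lspci names these devices.
list_phi_cards() {
    grep -i 'co-processor' | grep -i 'xeon phi'
}
```

Usage would be `lspci | list_phi_cards`. Any line it prints is a PCIe card, which suggests coprocessors rather than a self-hosted KNL processor.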
But I just said I have Xeon Phis, more specifically:
83:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 31S1 (rev 11)
84:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 31S1 (rev 11)
I do not particularly need that command, and I'm not sure I need it at all. It just got installed with the MPSS stack. I'm happy to remove it if that does no harm.
systemctl disable hwloc-dump-hwdata.service
HTH, JJK
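Disabling only affects future boots, so the unit may still show as failed until the next restart. A minimal sketch of a fuller cleanup follows; `retire_unit` and the `SYSTEMCTL` override are hypothetical names added here to allow a dry run, and the systemctl subcommands themselves need root.

```shell
#!/bin/sh
# Sketch: stop a unit, disable it at boot, and clear its failed state.
# retire_unit and the SYSTEMCTL variable are illustrative, not part of systemd.
retire_unit() {
    # Set SYSTEMCTL=echo to print the commands instead of running them.
    ctl="${SYSTEMCTL:-systemctl}"
    $ctl stop "$1"
    $ctl disable "$1"
    # reset-failed clears the "failed" marker left by the earlier start attempt;
    # it may complain if the unit is not currently marked failed.
    $ctl reset-failed "$1"
}
```

For example, `SYSTEMCTL=echo retire_unit hwloc-dump-hwdata.service` previews the three commands, and running `retire_unit hwloc-dump-hwdata.service` as root applies them.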