Hello
I am on node s017-n001. This is a quad-GPU node where I am using oneAPI L0 (Level Zero) for hwloc development. My code works fine on other nodes, but on this node it fails in the very first L0 API call: zeInit() returns ZE_RESULT_ERROR_UNINITIALIZED (0x78000001). The documentation says this means the driver is not initialized, but I am not sure whether that refers to the kernel driver (the i915 module is loaded) or to some user-space L0 driver.
$ dpkg -l | grep level-zero | cut -c5-70
intel-level-zero-gpu 1.3.24420+i504~u20.04
level-zero 1.8.5+i504~u20.04
On a working node, the packages are i815 instead of i504; I am not sure whether it matters.
$ dpkg -l | grep level-zero | cut -c5-70
intel-level-zero-gpu 1.3.24055+i815~u20.04
level-zero 1.8.5+i815~u20.04
level-zero-dev 1.8.5+i815~u20.04
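To make the difference between the two listings easier to see, here is a minimal sketch that parses the "name version" lines shown above and reports which packages differ. The helper names `build_tags` and `differences` are hypothetical, and splitting a version into an upstream part and a build tag at `+` and `~` is only an assumption based on the strings in this post.

```python
import re

# Package listings copied from the two dpkg outputs in this post.
FAILING_NODE = """\
intel-level-zero-gpu 1.3.24420+i504~u20.04
level-zero 1.8.5+i504~u20.04
"""

WORKING_NODE = """\
intel-level-zero-gpu 1.3.24055+i815~u20.04
level-zero 1.8.5+i815~u20.04
level-zero-dev 1.8.5+i815~u20.04
"""


def build_tags(listing):
    """Map package name -> (upstream version, build tag) for 'name version' lines."""
    tags = {}
    for line in listing.strip().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        name, version = parts[0], parts[1]
        # Assumed layout: <upstream>+<tag>~<distro>, e.g. 1.8.5+i504~u20.04
        match = re.match(r"(?P<upstream>[^+]+)\+(?P<tag>[^~]+)", version)
        if match:
            tags[name] = (match.group("upstream"), match.group("tag"))
    return tags


def differences(first, second):
    """Return {package: (entry in first, entry in second)} where the two differ."""
    result = {}
    for name in sorted(set(first) | set(second)):
        if first.get(name) != second.get(name):
            result[name] = (first.get(name), second.get(name))
    return result


if __name__ == "__main__":
    diff = differences(build_tags(FAILING_NODE), build_tags(WORKING_NODE))
    for name, (failing, working) in diff.items():
        print(f"{name}: failing node={failing}, working node={working}")
```

A `None` entry means the package is absent from that listing, which is how the missing level-zero-dev shows up.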
By the way, level-zero-dev is missing on s017-n001. I reported this in the past and was told that it is only available on GPU nodes, which should be the case here.
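For anyone who wants to reproduce the zeInit() failure without the level-zero-dev headers, here is a rough sketch that calls zeInit() through ctypes and also collects the kernel-side facts. It assumes the loader is installed as `libze_loader.so.1`, that a loaded i915 module shows up under `/sys/module/i915`, and that GPUs expose render nodes as `/dev/dri/renderD*`. The function names are hypothetical. It does not decide which layer is at fault; it only prints what each layer reports.

```python
import ctypes
import glob
import os

ZE_RESULT_SUCCESS = 0x0
ZE_RESULT_ERROR_UNINITIALIZED = 0x78000001  # value quoted in this post


def i915_loaded(sys_module_dir="/sys/module"):
    """True if the i915 kernel module directory exists (module loaded or built in)."""
    return os.path.isdir(os.path.join(sys_module_dir, "i915"))


def render_nodes(dri_dir="/dev/dri"):
    """List DRM render nodes; an empty list suggests no usable GPU device files."""
    return sorted(glob.glob(os.path.join(dri_dir, "renderD*")))


def call_ze_init(libname="libze_loader.so.1"):
    """Call zeInit(0) and return its result, or None if the loader cannot be loaded."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None
    # zeInit takes a 32-bit flags value; 0 requests all driver types.
    lib.zeInit.argtypes = [ctypes.c_uint32]
    lib.zeInit.restype = ctypes.c_uint32
    return lib.zeInit(0)


def describe(result):
    """Turn a zeInit result (or None) into a short human-readable string."""
    if result is None:
        return "Level Zero loader library not found"
    if result == ZE_RESULT_SUCCESS:
        return "zeInit succeeded"
    if result == ZE_RESULT_ERROR_UNINITIALIZED:
        return "ZE_RESULT_ERROR_UNINITIALIZED (0x78000001)"
    return "zeInit failed with 0x%08x" % result


if __name__ == "__main__":
    print("i915 module present:", i915_loaded())
    print("render nodes:", render_nodes())
    print("zeInit:", describe(call_ze_init()))
```

If the i915 module is present and render nodes exist but zeInit() still returns ZE_RESULT_ERROR_UNINITIALIZED, that would point toward the user-space side, though this is only a hint and not a diagnosis.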
Thanks
Hi,
Thank you for posting in Intel Communities.
We were able to reproduce the issue on node s017-n001 and have informed the concerned team.
Could you please let us know on which nodes you are able to run the same code successfully?
Thanks.
Hello
It works on all the non-NDA nodes with GPUs that I could test recently, including s019-n012, s001-n157, and s001-n141 this morning.
Brice
Hi,
We are sorry to inform you that node s017-n001 is a work in progress and will not be accessible.
Thanks
Hi,
We hope your query has been clarified. Could you please confirm whether we can close the case?
Thanks
Hello. I don't see where anything got clarified. s017-n001 and n002 have both been "state = down,offline" since I reported this issue. I know some oneAPI developers are looking at it, but I am not aware of any fix yet. This issue is certainly not resolved.
Hi,
We are working on this internally and will share updates with you.
Thanks
Hi.
Please be informed that the quad nodes will become available in early Q1 2023.
We hope this clarifies your query.
Please let us know if we can close this case.
Thanks
Ok, thanks.
Hi,
Thanks for confirming. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.
Thanks