<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Error starting containers using habana-container-runtime in Intel® Gaudi® AI Accelerator</title>
    <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676937#M64</link>
    <description>&lt;P&gt;Thanks. Same as &lt;A title="1663854" href="https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Insufficient-Slots-for-MPIJob-with-2-Worker-Pods-and-2-Gaudi/m-p/1663854/thread-id/28" target="_blank" rel="noopener"&gt;1663854&lt;/A&gt;,&amp;nbsp;&lt;SPAN&gt;I don't have access to Habana's Jira, so please keep me posted if there's any update on the issue.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 21 Mar 2025 21:58:25 GMT</pubDate>
    <dc:creator>Gera_Dmz</dc:creator>
    <dc:date>2025-03-21T21:58:25Z</dc:date>
    <item>
      <title>Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1663853#M27</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I am experiencing an issue where I am unable to access Gaudi accelerators when creating a Docker container using the Habana runtime.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Steps to reproduce:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;A href="https://docs.habana.ai/en/latest/Installation_Guide/Driver_Installation.html#install-driver-and-software" target="_blank" rel="nofollow noopener"&gt;Installed Gaudi drivers &amp;amp; Software&lt;/A&gt;.&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://github.com/HabanaAI/habana-container-runtime?tab=readme-ov-file#build-binaries" target="_blank" rel="noopener"&gt;Built binaries&lt;/A&gt;.&lt;/LI&gt;&lt;LI&gt;Configured both&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://github.com/HabanaAI/habana-container-runtime?tab=readme-ov-file#daemon-configuration-file" target="_blank" rel="noopener"&gt;/etc/docker/daemon.json&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&amp;amp;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://github.com/HabanaAI/habana-container-runtime?tab=readme-ov-file#containerd-configuration-file" target="_blank" rel="noopener"&gt;/etc/containerd/config.toml&lt;/A&gt;.&lt;/LI&gt;&lt;LI&gt;&lt;EM&gt;Ran&amp;nbsp;docker run --rm --runtime=habana -e HABANA_VISIBLE_DEVICES=all ubuntu:22.04 /bin/bash -c "ls /dev/accel/*"&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and got:&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: exposing interfaces: failed creating temporary link on host: invalid argument
exit status 1: unknown.&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Tried&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;but also got:&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: exposing interfaces: failed creating temporary link on host: invalid argument
exit status 1: unknown.&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;After removing&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;-e HABANA_VISIBLE_DEVICES=all&lt;/EM&gt;,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;I'm able to get into the container, but the accelerators are not visible inside it:&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;# hl-smi
habanalabs driver is not loaded or no AIPs available, aborting...
# ls /dev/accel
ls: cannot access '/dev/accel': No such file or directory&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;OS&lt;/STRONG&gt;&lt;BR /&gt;Ubuntu 22.04.4 LTS&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Kernel Version&lt;/STRONG&gt;&lt;BR /&gt;5.15.0-117-generic&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Container Runtime Type/Version&lt;/STRONG&gt;&lt;BR /&gt;1.19.1&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;K8s Flavor/Version (e.g. K8s, OCP, Rancher, GKE, EKS)&lt;/STRONG&gt;&lt;BR /&gt;Docker version 27.5.0&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Extra logs and files&lt;/STRONG&gt;&lt;BR /&gt;From the host machine:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ hl-smi
+-----------------------------------------------------------------------------+
| HL-SMI Version: hl-1.19.1-fw-57.2.2.0 |
| Driver Version: 1.19.1-6f47ddd |
|-------------------------------+----------------------+----------------------+
| AIP Name Persistence-M| Bus-Id Disp.A | Volatile Uncor-Events|
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | AIP-Util Compute M. |
|===============================+======================+======================|
| 0 HL-225 N/A | 0000:33:00.0 N/A | 0 |
| N/A 24C N/A 88W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 1 HL-225 N/A | 0000:9a:00.0 N/A | 0 |
| N/A 25C N/A 92W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 2 HL-225 N/A | 0000:34:00.0 N/A | 0 |
| N/A 26C N/A 76W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 3 HL-225 N/A | 0000:9b:00.0 N/A | 0 |
| N/A 27C N/A 102W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 4 HL-225 N/A | 0000:4d:00.0 N/A | 0 |
| N/A 27C N/A 90W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 5 HL-225 N/A | 0000:4e:00.0 N/A | 0 |
| N/A 25C N/A 82W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 6 HL-225 N/A | 0000:b4:00.0 N/A | 0 |
| N/A 25C N/A 65W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 7 HL-225 N/A | 0000:b3:00.0 N/A | 0 |
| N/A 27C N/A 84W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| Compute Processes: AIP Memory |
| AIP PID Type Process name Usage |
|=============================================================================|
| 0 N/A N/A N/A N/A |
| 1 N/A N/A N/A N/A |
| 2 N/A N/A N/A N/A |
| 3 N/A N/A N/A N/A |
| 4 N/A N/A N/A N/A |
| 5 N/A N/A N/A N/A |
| 6 N/A N/A N/A N/A |
| 7 N/A N/A N/A N/A |
+=============================================================================+

$ tail -n 1 /var/log/habana-container-runtime.log
{"time":"2025-01-22T22:10:20.545416796Z","level":"INFO","msg":"file does not exist on host: /etc/habanalabs/gaudinet.json"}

$ tail -n 1 /var/log/habana-container-hook.log
{"time":"2025-01-22T22:10:20.569909471Z","level":"ERROR","msg":"exposing interfaces: failed creating temporary link on host: invalid argument"}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2025 22:22:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1663853#M27</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-06T22:22:25Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664692#M29</link>
      <description>&lt;P&gt;This looks like an issue with your installation or configuration of the habana-container-runtime. What is the output of the following command on the host?&lt;/P&gt;&lt;P&gt;`dpkg -l | grep habanalabs-container-runtime`&lt;/P&gt;&lt;P&gt;I am particularly interested in the version.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Also, if you are using Docker, remove the containerd TOML file (/etc/containerd/config.toml) and check your Docker configuration file again. The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;/etc/docker/daemon.json&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;file should look similar to this:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;
   &lt;SPAN class=""&gt;"default-runtime"&lt;/SPAN&gt;&lt;SPAN class=""&gt;:&lt;/SPAN&gt; &lt;SPAN class=""&gt;"habana"&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt;
   &lt;SPAN class=""&gt;"runtimes"&lt;/SPAN&gt;&lt;SPAN class=""&gt;:&lt;/SPAN&gt; &lt;SPAN class=""&gt;{&lt;/SPAN&gt;
      &lt;SPAN class=""&gt;"habana"&lt;/SPAN&gt;&lt;SPAN class=""&gt;:&lt;/SPAN&gt; &lt;SPAN class=""&gt;{&lt;/SPAN&gt;
         &lt;SPAN class=""&gt;"path"&lt;/SPAN&gt;&lt;SPAN class=""&gt;:&lt;/SPAN&gt; &lt;SPAN class=""&gt;"/usr/bin/habana-container-runtime"&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt;
         &lt;SPAN class=""&gt;"runtimeArgs"&lt;/SPAN&gt;&lt;SPAN class=""&gt;:&lt;/SPAN&gt; &lt;SPAN class=""&gt;[]&lt;/SPAN&gt;
      &lt;SPAN class=""&gt;}&lt;/SPAN&gt;
   &lt;SPAN class=""&gt;}&lt;/SPAN&gt;
&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;Then restart the Docker service:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class=""&gt;sudo&lt;/SPAN&gt; &lt;SPAN class=""&gt;systemctl&lt;/SPAN&gt; &lt;SPAN class=""&gt;restart&lt;/SPAN&gt; &lt;SPAN class=""&gt;docker&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2025 18:39:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664692#M29</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-10T18:39:14Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664743#M31</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/167889"&gt;@James_Edwards&lt;/a&gt;&amp;nbsp;for replying. The installed version is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ dpkg -l | grep habanalabs-container-runtime
ii  habanalabs-container-runtime              1.19.1-26                                   amd64        HABANA container runtime&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/etc/docker/daemon.json is as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ cat /etc/docker/daemon.json
{"runtimes": {"habana": {"path": "/usr/bin/habana-container-runtime", "runtimeArgs": []}}, "default-runtime": "habana"}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Removing /etc/containerd/config.toml did help: I can now start the container and get a shell in it using&amp;nbsp;&lt;EM&gt;&lt;SPAN class=""&gt;docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest&lt;/SPAN&gt;&lt;/EM&gt;.&amp;nbsp;&lt;SPAN class=""&gt;But inside the container I'm still getting:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;# hl-smi
habanalabs driver is not loaded or no AIPs available, aborting...&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2025 21:59:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664743#M31</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-10T21:59:12Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664745#M32</link>
      <description>&lt;P&gt;1) Check the &lt;STRONG&gt;habana-container-hook.log&lt;/STRONG&gt; and the &lt;STRONG&gt;habana-container-runtime.log&lt;/STRONG&gt; again to see if there are any other errors associated with the Docker daemon. (By the way, the gaudinet.json file is &lt;STRONG&gt;not required&lt;/STRONG&gt; for single Gaudi nodes, so the INFO message about it in habana-container-runtime.log is not important. The ERROR message in habana-container-hook.log was a problem, however.)&lt;/P&gt;&lt;P&gt;2) Start the Docker container with the -d option and then run 'docker logs &amp;lt;container id&amp;gt;' to see if there are any errors or container issues on startup.&lt;/P&gt;&lt;P&gt;3) On the host, get the permissions on the Gaudi accelerator devices: 'ls -l /dev/accel'&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2025 22:32:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664745#M32</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-10T22:32:06Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664747#M34</link>
      <description>&lt;P&gt;1) Checking habana-container-hook.log,&amp;nbsp;I see this new message (logged at INFO level):&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ tail -n 1 /var/log/habana-container-hook.log
{"time":"2025-02-10T22:44:50.45307364Z","level":"INFO","msg":"device already exists in namespace. Host network used?"}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;2) I don't get any logs when using the detached option:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ docker run -d --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest sleep infinity
b8ca5e99989bef171fde1355f9e40c78f77914c96184cfa52e3f877c95990981
$ docker ps
CONTAINER ID   IMAGE                                                                                       COMMAND            CREATED         STATUS         PORTS     NAMES
b8ca5e99989b   vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest   "sleep infinity"   3 seconds ago   Up 3 seconds             romantic_williams
$ docker logs b8ca5e99989b&lt;/LI-CODE&gt;&lt;P&gt;3) The permissions are:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ ls -l /dev/accel/
total 0
crw-rw-rw- 1 root root 509,  0 Jan 14 23:23 accel0
crw-rw-rw- 1 root root 509,  2 Jan 14 23:23 accel1
crw-rw-rw- 1 root root 509,  4 Jan 14 23:23 accel2
crw-rw-rw- 1 root root 509,  6 Jan 14 23:23 accel3
crw-rw-rw- 1 root root 509,  8 Jan 14 23:23 accel4
crw-rw-rw- 1 root root 509, 10 Jan 14 23:23 accel5
crw-rw-rw- 1 root root 509, 12 Jan 14 23:23 accel6
crw-rw-rw- 1 root root 509, 14 Jan 14 23:23 accel7
crw-rw-rw- 1 root root 509,  1 Jan 14 23:23 accel_controlD0
crw-rw-rw- 1 root root 509,  3 Jan 14 23:23 accel_controlD1
crw-rw-rw- 1 root root 509,  5 Jan 14 23:23 accel_controlD2
crw-rw-rw- 1 root root 509,  7 Jan 14 23:23 accel_controlD3
crw-rw-rw- 1 root root 509,  9 Jan 14 23:23 accel_controlD4
crw-rw-rw- 1 root root 509, 11 Jan 14 23:23 accel_controlD5
crw-rw-rw- 1 root root 509, 13 Jan 14 23:23 accel_controlD6
crw-rw-rw- 1 root root 509, 15 Jan 14 23:23 accel_controlD7&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2025 22:49:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664747#M34</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-10T22:49:41Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664777#M36</link>
      <description>&lt;P&gt;The device permissions all seem good. I was also able to run your docker command and get all devices (I used the 1.19.0 version of the container runtime). This still seems like an issue with the configuration of the container runtime. What does this command print?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;docker inspect &amp;lt;container_id&amp;gt; | grep Runtime&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;If you don't get a line that says "Runtime": "habana", there is still a configuration error. Otherwise, post the entire output of docker inspect so I can look at it.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2025 01:21:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1664777#M36</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-11T01:21:22Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665036#M37</link>
      <description>&lt;P&gt;I do see&amp;nbsp;&lt;SPAN&gt;"Runtime": "habana":&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ docker ps
CONTAINER ID   IMAGE                                                                                       COMMAND            CREATED              STATUS              PORTS     NAMES
eb0bf40f6b1d   vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest   "sleep infinity"   About a minute ago   Up About a minute             boring_lehmann
$ docker inspect eb0bf40f6b1d | grep Runtime
            "Runtime": "habana",
            "CpuRealtimeRuntime": 0,&lt;/LI-CODE&gt;&lt;P&gt;Inspecting the container:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ docker inspect eb0bf40f6b1d
[
    {
        "Id": "eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce",
        "Created": "2025-02-11T16:06:30.690094085Z",
        "Path": "sleep",
        "Args": [
            "infinity"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 31171,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2025-02-11T16:06:33.601305388Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:1d0c9dbfacfdffb8ed752b9f56519c3290645bb4ec81542a7256feb18f65c9a3",
        "ResolvConfPath": "/var/lib/docker/containers/eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce/hostname",
        "HostsPath": "/var/lib/docker/containers/eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce/hosts",
        "LogPath": "/var/lib/docker/containers/eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce/eb0bf40f6b1d0392e38e9e1542650fab9df37e851d4d74388dfe2d904fcabdce-json.log",
        "Name": "/boring_lehmann",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "docker-default",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "host",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "ConsoleSize": [
                49,
                189
            ],
            "CapAdd": [
                "sys_nice"
            ],
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "host",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "label=disable"
            ],
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "habana",
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": [],
            "BlkioDeviceWriteBps": [],
            "BlkioDeviceReadIOps": [],
            "BlkioDeviceWriteIOps": [],
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": [],
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware",
                "/sys/devices/virtual/powercap"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/99f176c4457ecbc3eb35beb6ae2e7c8c98c5897c6a74ed95d4a617f239051f84-init/diff:/var/lib/docker/overlay2/84a1dd5c7734190f9d1d61036a16e11c84e61574e09e64eec206d0c32b48054a/diff:/var/lib/docker/overlay2/f927604ccd4f437148a7c90d3e4e6f42266d84bd4ef415e807e18ccaf67362d5/diff:/var/lib/docker/overlay2/4ef1a9882c291c23760de9b0a74e196e230f1f7c37f61ec38db07539f4bd7f78/diff:/var/lib/docker/overlay2/720dd7a239d971dfc9243be25f090392ab5d48825897aef448d1b3a867c397d5/diff:/var/lib/docker/overlay2/65ff9290ebd144007a656959a1bbf2aa2c147fef0bc0f2af441bcd7077c53a18/diff:/var/lib/docker/overlay2/9b6f21580d4adc9da99557c895d6f271116421007d4a7a1c23d7f238ce41dd57/diff:/var/lib/docker/overlay2/3436c6314fa980e97b2f1fad0df95cb41caf626a7f776852405964dbc76d8d6a/diff:/var/lib/docker/overlay2/e426528d3b3300f3087fb6d1a59f72a243ed57a722f576b4f94f8574ef6e9e28/diff:/var/lib/docker/overlay2/169b6c543d8a5bd1ca87e26c1c2050f8e95635eb8e26122c97a6811cefa434d5/diff:/var/lib/docker/overlay2/1333522faba816e0b61ac33b484328ad8d8422c797c48470b3adabba890660eb/diff:/var/lib/docker/overlay2/d646b22eeb7590a13ed22a6fbb909ff702580e6e728609a597e451af0d5e03cf/diff:/var/lib/docker/overlay2/1062b70b19dff8afe284d195299e81826f0ce5f20004b42406320a05e48d9ca4/diff:/var/lib/docker/overlay2/93a7eb0237d95369d64caa402907dc4b16f82ccced8423bafbc9a8aa26e5eeda/diff:/var/lib/docker/overlay2/32a07cfc772d82b895a6fcd4019d007501d51f3bbc91ede16f3a7592a207e63e/diff",
                "MergedDir": "/var/lib/docker/overlay2/99f176c4457ecbc3eb35beb6ae2e7c8c98c5897c6a74ed95d4a617f239051f84/merged",
                "UpperDir": "/var/lib/docker/overlay2/99f176c4457ecbc3eb35beb6ae2e7c8c98c5897c6a74ed95d4a617f239051f84/diff",
                "WorkDir": "/var/lib/docker/overlay2/99f176c4457ecbc3eb35beb6ae2e7c8c98c5897c6a74ed95d4a617f239051f84/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [],
        "Config": {
            "Hostname": "ng-nx6n7s4yyi-6f812",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "HABANA_VISIBLE_DEVICES=all",
                "OMPI_MCA_btl_vader_single_copy_mechanism=none",
                "PATH=/opt/habanalabs/libfabric-1.22.0/bin:/opt/amazon/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "DEBIAN_FRONTEND=noninteractive",
                "GC_KERNEL_PATH=/usr/lib/habanalabs/libtpc_kernels.so",
                "HABANA_LOGS=/var/log/habana_logs/",
                "OS_NUMBER=2204",
                "HABANA_SCAL_BIN_PATH=/opt/habanalabs/engines_fw",
                "HABANA_PLUGINS_LIB_PATH=/opt/habanalabs/habana_plugins",
                "PIP_NO_CACHE_DIR=on",
                "PIP_DEFAULT_TIMEOUT=1000",
                "PIP_DISABLE_PIP_VERSION_CHECK=1",
                "LIBFABRIC_VERSION=1.22.0",
                "LIBFABRIC_ROOT=/opt/habanalabs/libfabric-1.22.0",
                "MPI_ROOT=/opt/amazon/openmpi",
                "LD_LIBRARY_PATH=/opt/habanalabs/libfabric-1.22.0/lib:/opt/amazon/openmpi/lib:/usr/lib/habanalabs:",
                "OPAL_PREFIX=/opt/amazon/openmpi",
                "MPICC=/opt/amazon/openmpi/bin/mpicc",
                "RDMAV_FORK_SAFE=1",
                "FI_EFA_USE_DEVICE_RDMA=1",
                "RDMA_CORE_ROOT=/opt/habanalabs/rdma-core/src",
                "RDMA_CORE_LIB=/opt/habanalabs/rdma-core/src/build/lib",
                "PYTHONPATH=/root:/usr/lib/habanalabs/",
                "LD_PRELOAD=/lib/x86_64-linux-gnu/libtcmalloc.so.4",
                "TCMALLOC_LARGE_ALLOC_REPORT_THRESHOLD=7516192768"
            ],
            "Cmd": [
                "sleep",
                "infinity"
            ],
            "Image": "vault.habana.ai/gaudi-docker/1.19.1/ubuntu22.04/habanalabs/pytorch-installer-2.5.1:latest",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "org.opencontainers.image.ref.name": "ubuntu",
                "org.opencontainers.image.version": "22.04"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "2663a081212073036fdafcefd1d0d214b1f4d0023adfc9c20f49e9ff3f4dbec4",
            "SandboxKey": "/var/run/docker/netns/default",
            "Ports": {},
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "host": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "MacAddress": "",
                    "DriverOpts": null,
                    "NetworkID": "528b711e870bfc399f943e2db9499cd9993d5441e6ac2bec38e54a4862f39af1",
                    "EndpointID": "ff60fa70e73a35564bb4f73bd2bc34c86c7d0466f27bd90f346aa2d969604538",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "DNSNames": null
                }
            }
        }
    }
]&lt;/LI-CODE&gt;&lt;P&gt;Thanks again for the support.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2025 17:52:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665036#M37</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-11T17:52:58Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665051#M40</link>
      <description>&lt;P&gt;I have been unable to find anything wrong with the container configuration you sent me; it is nearly identical to that of a working container I have run. The next step on my side is to install the 1.19.1 habanalabs-container-runtime and see if I can reproduce your issue. In the meantime, the only possible solution I can suggest is to install the new 1.19.2 Gaudi software stack (released yesterday) and see if that resolves your issue.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2025 19:34:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665051#M40</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-11T19:34:00Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665489#M42</link>
      <description>&lt;P&gt;Okay, it would be great if you can reproduce the error. On our side, we can upgrade to 1.19.2.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Feb 2025 19:34:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1665489#M42</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-12T19:34:16Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1667090#M47</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/167889"&gt;@James_Edwards&lt;/a&gt;&amp;nbsp;, we've upgraded the Gaudi SW stack to 1.19.2. Unfortunately, the behavior is the same; we're still getting:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;# hl-smi
habanalabs driver is not loaded or no AIPs available, aborting...&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 18 Feb 2025 00:00:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1667090#M47</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-18T00:00:52Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1667926#M49</link>
      <description>&lt;P&gt;I updated a system to the 1.19.2 version of Intel Gaudi software and configured the&amp;nbsp;habanalabs-container-runtime (at version 1.19.2-32), as specified in the comments above. I started a docker container with the following command:&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.2/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;All devices were available in the container and hl-smi gave no errors. Basically, I was unable to reproduce the problem with the latest software.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;Try executing the docker command with the&amp;nbsp;&lt;SPAN&gt;--privileged flag and without the habana runtime:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;docker run -it&amp;nbsp; --privileged&amp;nbsp; --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.2/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I know this is brute force, but it will tell us whether the runtime is the issue.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Feb 2025 15:48:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1667926#M49</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-19T15:48:57Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1668063#M50</link>
      <description>&lt;P&gt;Interesting. Running:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;docker run -it  --privileged  --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.2/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest&lt;/LI-CODE&gt;&lt;P&gt;I'm actually able to run hl-smi inside the container:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;# hl-smi
+-----------------------------------------------------------------------------+
| HL-SMI Version:                              hl-1.19.2-fw-57.2.4.0          |
| Driver Version:                                     1.19.2-ff37fea          |
|-------------------------------+----------------------+----------------------+
| AIP  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncor-Events|
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | AIP-Util  Compute M. |
|===============================+======================+======================|
|   0  HL-225              N/A  | 0000:33:00.0     N/A |                   0  |
| N/A   20C   N/A  86W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   1  HL-225              N/A  | 0000:9a:00.0     N/A |                   0  |
| N/A   21C   N/A  92W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   2  HL-225              N/A  | 0000:9b:00.0     N/A |                   0  |
| N/A   24C   N/A 100W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   3  HL-225              N/A  | 0000:34:00.0     N/A |                   0  |
| N/A   23C   N/A  75W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   4  HL-225              N/A  | 0000:b3:00.0     N/A |                   0  |
| N/A   23C   N/A  82W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   5  HL-225              N/A  | 0000:4d:00.0     N/A |                   0  |
| N/A   23C   N/A  93W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   6  HL-225              N/A  | 0000:4e:00.0     N/A |                   0  |
| N/A   22C   N/A  80W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
|   7  HL-225              N/A  | 0000:b4:00.0     N/A |                   0  |
| N/A   21C   N/A  62W /  600W  |   768MiB /  98304MiB |     0%           N/A |
|-------------------------------+----------------------+----------------------+
| Compute Processes:                                               AIP Memory |
|  AIP       PID   Type   Process name                             Usage      |
|=============================================================================|
|   0        N/A   N/A    N/A                                      N/A        |
|   1        N/A   N/A    N/A                                      N/A        |
|   2        N/A   N/A    N/A                                      N/A        |
|   3        N/A   N/A    N/A                                      N/A        |
|   4        N/A   N/A    N/A                                      N/A        |
|   5        N/A   N/A    N/A                                      N/A        |
|   6        N/A   N/A    N/A                                      N/A        |
|   7        N/A   N/A    N/A                                      N/A        |
+=============================================================================+&lt;/LI-CODE&gt;&lt;P&gt;Could this be due to how the SW stack was installed? We didn't use sudo for this, though.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Feb 2025 23:50:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1668063#M50</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-02-19T23:50:09Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1668347#M51</link>
      <description>&lt;P&gt;&lt;STRONG&gt;This tells me that your Intel Gaudi software drivers are installed correctly, have the correct permissions and are available to docker.&lt;/STRONG&gt; It seems like your problem is with the habana container runtime and how it initializes the devices in the container's environment. It could be how the container runtime was installed and configured, but we reinstalled and checked the configuration of a container; nothing looked wrong. I will check with the ACE team for next steps.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Feb 2025 14:28:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1668347#M51</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-02-20T14:28:54Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676537#M57</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/167889"&gt;@James_Edwards&lt;/a&gt;&amp;nbsp;, do you have any update or new insight on the matter? Did the&amp;nbsp;&lt;SPAN&gt;ACE team give support?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 18:10:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676537#M57</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-03-20T18:10:37Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676566#M58</link>
      <description>&lt;P&gt;Sorry for not updating the post sooner. The ACE team had never encountered this issue before and was also unable to reproduce it. I will see if I can get any information from them on debugging.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 20:08:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676566#M58</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-03-20T20:08:03Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676603#M59</link>
      <description>&lt;P&gt;Can we confirm that your system is using cgroups version 2? Run:&lt;/P&gt;&lt;P&gt;`grep cgroup /proc/filesystems`&lt;/P&gt;&lt;P&gt;and paste the output.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 22:33:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676603#M59</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-03-20T22:33:25Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676606#M61</link>
      <description>&lt;P&gt;What I get is:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;$ grep cgroup /proc/filesystems
nodev   cgroup
nodev   cgroup2&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 20 Mar 2025 23:12:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676606#M61</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-03-20T23:12:31Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676911#M62</link>
      <description>&lt;P&gt;R&amp;amp;D had me create this issue:&amp;nbsp;&lt;A href="https://habana.atlassian.net/browse/HS-5598" target="_blank"&gt;[HS-5598] Gaudi accelerators not appearing in Docker containers - Habana container runtime is correctly configured - Habana Support&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Mar 2025 19:00:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676911#M62</guid>
      <dc:creator>James_Edwards</dc:creator>
      <dc:date>2025-03-21T19:00:40Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676937#M64</link>
      <description>&lt;P&gt;Thanks. As with &lt;A title="1663854" href="https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Insufficient-Slots-for-MPIJob-with-2-Worker-Pods-and-2-Gaudi/m-p/1663854/thread-id/28" target="_blank" rel="noopener"&gt;1663854&lt;/A&gt;,&amp;nbsp;&lt;SPAN&gt;I don't have access to Habana's Jira, so please keep me posted if there's any update on the issue.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Mar 2025 21:58:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1676937#M64</guid>
      <dc:creator>Gera_Dmz</dc:creator>
      <dc:date>2025-03-21T21:58:25Z</dc:date>
    </item>
    <item>
      <title>Re: Error starting containers using habana-container-runtime</title>
      <link>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1677790#M65</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/251729"&gt;@Gera_Dmz&lt;/a&gt;&amp;nbsp;,&lt;BR /&gt;&lt;BR /&gt;Please share the list of installed Habana packages:&lt;BR /&gt;sudo apt list --installed | grep habana&lt;BR /&gt;&lt;BR /&gt;Uninstall all Docker packages and any k8s packages, if installed.&lt;BR /&gt;&lt;BR /&gt;Reboot the system.&lt;BR /&gt;&lt;BR /&gt;Reinstall the Docker packages and the Habana packages, then try again.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Mar 2025 23:57:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Gaudi-AI-Accelerator/Error-starting-containers-using-habana-container-runtime/m-p/1677790#M65</guid>
      <dc:creator>AungSan</dc:creator>
      <dc:date>2025-03-25T23:57:50Z</dc:date>
    </item>
  </channel>
</rss>

