Failed to run ESQ for AI Edge System

SCHuang
Beginner

We are helping a customer run ESQ, but we hit errors during the run.

Here is the command history. We followed the guide on the website and do not seem to have missed any steps:

 

 1 sudo apt update
 2 sudo apt upgrade
 3 sudo apt install libnuma-dev python-dev jq
 4 sudo apt install libnuma-dev python3-dev jq
 5 export HF_TOKEN=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
 6 sudo apt-get install ca-certificates curl
 7 sudo install -m 0755 -d /etc/apt/keyrings
 8 sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
 9 sudo chmod a+r /etc/apt/keyrings/docker.asc
 10 echo  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
 11  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" |  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 12 sudo apt-get update
 13 docker -h
 14 sudo groupadd docker
 15 sudo usermod -aG docker $USER
 16 docker run hello-world
 17 cd edge_system_qualification/
 18 chmod +x edgesoftware
 19 ./edgesoftware install
 20 sudo apt install python3-venv
 21 python3 -m venv venv
 22 ./edgesoftware install 
 23 reboot
 24 cd ~/esq  
 25 esq --version
 26 ls
 27 esq --version
 28 esq module list
 29 esq run --verbose
 30 esq run
 31 esq --help
 32 export HF_TOKEN=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
 33 esq run --verbose
 34 esq --verbose run
 36 esq module list
 37 esq run
 38 history
 39 esq --verbose run
 40 history

 

 

We got the following error:

Traceback (most recent call last):
  File "/home/synnex-fae/esq/modules/ai-edge-system/genAI/run.py", line 175, in <module>
    main()
  File "/home/synnex-fae/esq/modules/ai-edge-system/genAI/run.py", line 113, in main
    build_runner_containers(dockerfile="Dockerfile.cpu", tag="ipex-cpu-only-runner", force_rebuild=True)
  File "/home/synnex-fae/esq/modules/ai-edge-system/genAI/run.py", line 56, in build_runner_containers
    client.images.build(path=script_dir, dockerfile=dockerfile, tag=tag, nocache=True)
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/models/images.py", line 304, in build
    raise BuildError(chunk['error'], result_stream)
docker.errors.BuildError: The command '/bin/sh -c bash src/installations/install_cpu_deps.sh' returned a non-zero code: 1

An error occurred while running command: ['python3', 'run.py']
ERROR: Fail to run module: Command '/bin/sh -c "python3 -m venv .venv && . .venv/bin/activate && python3 -m pip install -r requirements.txt && python3 run.py"' returned non-zero exit status 1.

(This error message is from history step 34.)

We tried multiple times, but the esq command still did not work. At history step 37, esq was stuck at "[SYS] - Running Intel Core IPEX GPU analysis" for more than 24 hours, so we terminated the process.

When we tried again at history step 39, the error changed to:

[SYS] - Running Intel Core IPEX GPU analysis
Traceback (most recent call last):
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/api/client.py", line 275, in _raise_for_status
    response.raise_for_status()
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.50/containers/17fde9458d558bc552deb4ac6bbc68f811d8357079e2886ae8d50219fa610fee/stop

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/synnex-fae/esq/modules/ai-edge-system/genAI/run.py", line 175, in <module>
    main()
  File "/home/synnex-fae/esq/modules/ai-edge-system/genAI/run.py", line 146, in main
    container.stop()
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/models/containers.py", line 452, in stop
    return self.client.api.stop(self.id, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/api/container.py", line 1212, in stop
    self._raise_for_status(res)
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/api/client.py", line 277, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/synnex-fae/esq/modules/ai-edge-system/.venv/lib/python3.12/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.50/containers/17fde9458d558bc552deb4ac6bbc68f811d8357079e2886ae8d50219fa610fee/stop: Not Found ("No such container: 17fde9458d558bc552deb4ac6bbc68f811d8357079e2886ae8d50219fa610fee")

An error occurred while running command: ['python3', 'run.py']
ERROR: Fail to run module: Command '/bin/sh -c "python3 -m venv .venv && . .venv/bin/activate && python3 -m pip install -r requirements.txt && python3 run.py"' returned non-zero exit status 1.
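
For reference, the traceback above shows container.stop() raising docker.errors.NotFound because the container no longer existed when the script tried to stop it. The following is only a minimal sketch of cleanup code that tolerates that case. It is not the ESQ implementation, and the helper name stop_if_present is hypothetical. The exception types are passed in as a parameter so the sketch runs without the Docker SDK installed.

```python
def stop_if_present(container, not_found_errors):
    """Stop a container; return False if it has already disappeared.

    `container` is any object with a stop() method, such as a Docker SDK
    Container. `not_found_errors` is the exception class (or tuple of
    classes) that signals "no such container".
    """
    try:
        container.stop()
    except not_found_errors:
        # The container was already removed, for example by a prune or by
        # an earlier failed step, so there is nothing left to stop.
        return False
    return True
```

With the Docker SDK for Python, this might be called as stop_if_present(container, (docker.errors.NotFound,)). It only hides the symptom at cleanup time; it does not explain why the container vanished in the first place.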

 

It seems that every time we run it, we get a different error. What causes these errors?

 

Also, the website lists the device requirements as "At least 2 GB RAM, At least 32 GB hard drive." Is that accurate? Running the command seems to use up all of our 256GB SSD.
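
A quick way to see how much space is actually free before and after a run is sketched below. It is a minimal check using only the Python standard library. The 32 default is the minimum quoted from the website, interpreted here as GiB (an assumption), and the function names are illustrative, not part of ESQ.

```python
import shutil


def free_gib(path="/"):
    """Return the free space, in GiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 2**30


def has_enough_disk(path="/", required_gib=32):
    """True if at least `required_gib` GiB are free at `path`.

    The default of 32 mirrors the documented minimum; the space the
    workflow really needs may be much larger.
    """
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print(f"Free space: {free_gib('/'):.1f} GiB")
    print("Meets documented minimum:", has_enough_disk("/"))
```

Running it once before "esq run" and once after gives a rough measure of how much disk the run consumed.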

 

> Ultra 5 135U, 8GB RAM, 256GB SSD, Ubuntu 24.04

 

// Update

We replaced the 256GB SSD with a 512GB one and reinstalled the OS and ESQ. The same error as in history step 39 still occurs.

Peh_Intel
Moderator

Hi SCHuang,


Thanks for sharing the detailed steps. We are aware of this issue and are still investigating it. We will update you as soon as possible.



Regards,

Peh


JesusE_Intel
Moderator

The issue is caused by conflicts between Python packages following recent changes in third-party dependency versions. To resolve it, please apply the following patch.

 

Steps: 

1. Uninstall the ESQ package completely first:

 # Inside the software package location

  • ./edgesoftware uninstall -a
  •  cd ~/
  • sudo rm -rf ~/esq

2. Reinstall the ESQ as usual following the official guide.

3. Ensure that the environment is thoroughly cleaned by running the commands below:

  • docker container prune -f && docker images -q | xargs docker rmi -f
  • docker system prune -f 
  • cd ~/
  • sudo rm -rf .cache

4. Apply the patch file ESQ 11.2.1-Intel AI Edge System that is attached in this comment.

  • Copy the patch file to ~/esq
  • Run "cd ~/esq"
  • Run "patch -p1 < final.patch"
  • When prompted for the file to patch, enter "modules/ai-edge-system/genAI/Dockerfile".
  • When prompted a second time, enter "modules/ai-edge-system/genAI/Dockerfile.vllm.ov".

5. Proceed to the normal ESQ run command. 

 

 

SCHuang
Beginner

Although I was able to successfully generate a result using the patch, the result still indicates a failure.

Gen AI | Gen AI - TinyLlama-1.1B-Chat-v1.0 (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 38 | -1.0 | | FAILED
Gen AI | Gen AI - Llama-3.1-8B-Instruct (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 9 | -1.0 | | FAILED
Gen AI | Gen AI - Phi-3-mini-4k-instruct (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 15 | -1.0 | | FAILED
Vision AI | EfficientNet-b0-INT8 YOLOv5s-INT8 (CPU) | number | gt | 4.0 | 2.0 | number of stream | FAILED
Vision AI | EfficientNet-b0-INT8 YOLOv5s-INT8 (GPU) | number | gt | 6.0 | 5.0 | number of stream | FAILED

 

At first, I assumed it might be a performance issue, but it now seems there may be additional factors involved.

The GenAI test results consistently returned "-1", which could be due to a failed model download. I noticed the data directory was owned by root and empty.

 

After changing the directory permissions to user and re-running the test, the "Phi-3-mini-4k-instruct" model did appear in the data directory.
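
The ownership problem described above can be checked with a few lines of standard-library Python. This is only a sketch for POSIX systems such as Ubuntu. The function name dir_access is hypothetical, and the path of the ESQ data directory is not given in this thread, so it has to be supplied by the caller.

```python
import os


def dir_access(path):
    """Report who owns `path` and whether the current user can write to it."""
    st = os.stat(path)
    return {
        "owner_uid": st.st_uid,                      # 0 means root
        "owned_by_me": st.st_uid == os.getuid(),     # POSIX only
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # "." is a placeholder; pass the actual data directory instead.
    print(dir_access("."))
```

An owner_uid of 0 combined with writable being False would match the situation described here, where the data directory belonged to root and stayed empty.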

 

However, the test got stuck at "Running Intel Core IPEX GPU analysis" with no progress info or error messages, and neither CPU nor network activity was observed.

It appears that, despite the patch, there are still unresolved issues within the ESQ for AI Edge System workflow.

JesusE_Intel
Moderator

Hi SCHuang,


I haven't been able to reproduce the issue you're experiencing, so I'll need to consult with the engineering team. In the meantime, could you try installing Ubuntu 22.04 and running the tests again?


From the documentation:

Note: Ubuntu 22.04.3 LTS is preferred for Intel® Core™ and Intel® Core™ Ultra processors. Ubuntu 22.04.4 LTS Server is preferred for Intel® Xeon® processors.


Regards,

Jesus


SCHuang
Beginner

Before trying Ubuntu 22.04, I manually downloaded TinyLlama-1.1B-Chat-v1.0 and Llama-3.1-8B-Instruct and copied them to the "data" directory. After that, I could generate a report with actual values, so I think there may be a bug in the LLM model download step.

Gen AI | Gen AI - TinyLlama-1.1B-Chat-v1.0 (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 38 | 44.5 | | PASSED
Gen AI | Gen AI - Llama-3.1-8B-Instruct (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 9 | 7.49 | | FAILED
Gen AI | Gen AI - Phi-3-mini-4k-instruct (iGPU) Throughput Benchmark (tokens/sec) | number | gt | 15 | 14.94 | | FAILED
Vision AI | EfficientNet-b0-INT8 YOLOv5s-INT8 (CPU) | number | gt | 4.0 | 2.0 | number of stream | FAILED
Vision AI | EfficientNet-b0-INT8 YOLOv5s-INT8 (GPU) | number | gt | 6.0 | 5.0 | number of stream | FAILED

 

I'm not sure why the results failed, especially the Vision AI part. What does "number of stream: 2" mean, and why is 4 expected in the report?

JesusE_Intel
Moderator

Hi SCHuang,


Thank you for reporting the download issues. The team will investigate and address this in the next software release.


Regarding the test case, it failed because it was designed to run 4 streams at 14.95 FPS, but the device could only handle 2 streams on the CPU. The same applies to the GPU.
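
The pass/fail logic in the report can be sketched as follows. This is an illustration, not the ESQ source. It assumes that "gt" means the measured value must be strictly greater than the expected value, and that the first number in each report row is the expected value and the second is the measured one, which is consistent with the statuses posted in this thread. The function name status is hypothetical.

```python
def status(expected, actual):
    """Apply a 'gt' check: pass only if actual is strictly greater than expected."""
    return "PASSED" if actual > expected else "FAILED"


# (test name, expected, measured) taken from the report rows in this thread
rows = [
    ("TinyLlama-1.1B-Chat-v1.0 (iGPU), tokens/sec", 38, 44.5),
    ("Llama-3.1-8B-Instruct (iGPU), tokens/sec", 9, 7.49),
    ("Phi-3-mini-4k-instruct (iGPU), tokens/sec", 15, 14.94),
    ("EfficientNet-b0-INT8 YOLOv5s-INT8 (CPU), streams", 4.0, 2.0),
    ("EfficientNet-b0-INT8 YOLOv5s-INT8 (GPU), streams", 6.0, 5.0),
]

for name, expected, actual in rows:
    print(f"{name}: expected > {expected}, got {actual} -> {status(expected, actual)}")
```

Under the same assumption, a measured value of -1.0, as in the earlier report, fails every check regardless of the threshold.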


Regards,

Jesus


JesusE_Intel
Moderator

Hi SCHuang,


Intel® ESQ for Intel® AI Edge Systems version 11.2.2 has been published, resolving the previous issues. Please install this version on a fresh OS installation or clean the environment using the steps I provided earlier.


Regards,

Jesus


JesusE_Intel
Moderator

If you need further assistance, please start a new discussion as this one will no longer be monitored.

