Hi,
I am trying to measure the performance of an Arc A350M on MLPerf v3.1, 3d-unet-kits19.
I am using Ubuntu 22.04 with Re-Size BAR Support enabled, running without Docker.
Following the steps below, I can complete the run on an A770. On the A350M, however, the results are never produced, and dmesg prints:
Fence expiration time out i915-0000:03:00.0:python3[5119]:318!
Fence expiration time out i915-0000:03:00.0:python3[5119]:31a!
...
Fence expiration time out i915-0000:03:00.0:python3[5119]:a6!
I then press Ctrl-C to interrupt the process. The logs are attached as interruptedMlperfLogs.txt.
Here is how I set up and run the benchmark:
1. Set up the Intel APT repositories:
1.1 Add the graphics repository signing key and package source:
$ sudo -v && \
wget -qO - https://repositories.intel.com/graphics/intel-graphics.key | \
sudo gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg && \
echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | \
sudo tee /etc/apt/sources.list.d/intel-gpu-jammy.list && \
sudo apt-get update
1.2 Install the compute runtime and media libraries:
$ sudo apt-get install -y \
intel-opencl-icd intel-level-zero-gpu level-zero \
intel-media-va-driver-non-free libmfx1 libmfxgen1 libvpl2 \
libegl-mesa0 libegl1-mesa libegl1-mesa-dev libgbm1 libgl1-mesa-dev libgl1-mesa-dri \
libglapi-mesa libgles2-mesa-dev libglx-mesa0 libigdgmm12 libxatracker2 mesa-va-drivers \
mesa-vdpau-drivers mesa-vulkan-drivers va-driver-all vainfo hwinfo clinfo mesa-utils
1.3 Install the oneAPI toolkit:
$ sudo -v &&
wget -O- "https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB" \
| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
$ echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
$ sudo apt update && sudo apt install intel-basekit intel-gpu-tools
1.4 Add the following environment variables to ~/.bashrc:
export ONEAPI_ROOT=/opt/intel/oneapi
export DPCPPROOT=${ONEAPI_ROOT}/compiler/latest
export MKLROOT=${ONEAPI_ROOT}/mkl/latest
export IPEX_XPU_ONEDNN_LAYOUT=1
source ${ONEAPI_ROOT}/setvars.sh > /dev/null
$ source ~/.bashrc
$ wget "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
$ bash Mambaforge-$(uname)-$(uname -m).sh -b
$ ~/mambaforge/bin/mamba init
$ bash
$ mamba create --name pytorch-arc python=3.11 -y
$ mamba activate pytorch-arc
$ python -m pip install torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
$ pip install datasets jupyter matplotlib pandas pillow timm torcheval torchtnt tqdm cjm_pandas_utils cjm_pil_utils cjm_pytorch_utils pybind11 scipy
3. Verify that PyTorch sees the XPU device:
$ python
Python 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:43:09) [GCC 12.3.0] on linux
>>> import torch
>>> import intel_extension_for_pytorch
>>> print(intel_extension_for_pytorch.__version__)
2.1.10+xpu
>>> torch.xpu.get_device_properties(0)
_DeviceProperties(name='Intel(R) Arc(TM) A350M Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu', support_fp64=0, total_memory=3845MB, max_compute_units=96, gpu_eu_count=96)
>>> exit()
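As an extra sanity check beyond get_device_properties, a short script along these lines can confirm that work actually executes on the device (a minimal sketch, assuming the same pytorch-arc environment; the matrix size is arbitrary):
import torch
import intel_extension_for_pytorch  # noqa: F401  (registers the "xpu" device)

# Run a small matmul on the XPU and compare it against the CPU result.
a = torch.randn(512, 512)
b = torch.randn(512, 512)
ref = a @ b                                   # CPU reference
out = (a.to("xpu") @ b.to("xpu")).to("cpu")   # same computation on the A350M
print("xpu matches cpu:", torch.allclose(ref, out, rtol=1e-3, atol=1e-4))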
4. Set up the MLCommons inference repository, LoadGen, and the KiTS19 data:
$ sudo apt update && sudo apt install git build-essential libglib2.0-dev -y
$ git clone https://github.com/mlcommons/inference.git
$ cd inference
$ git fetch
$ git checkout v3.1
$ mamba activate pytorch-arc
$ pip install absl-py numpy nibabel imageio
$ cd loadgen
$ CFLAGS="-std=c++14 -O3" python -m pip install .
$ cd ../vision/medical_imaging/3d-unet-kits19/
$ make setup
$ make preprocess_data
5. Patch pytorch_SUT.py to run on the XPU:
$ nano pytorch_SUT.py
Add the import near the top of the file:
import intel_extension_for_pytorch
Then, around line 73, replace the CUDA/CPU device selection:
...
# self.device = torch.device(
#     "cuda:0" if torch.cuda.is_available() else "cpu")
self.device = torch.device("xpu")
self.model = torch.jit.load(model_path, map_location=self.device)
...
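Before launching the full harness, a standalone check along these lines can confirm that the TorchScript checkpoint runs a single sliding-window inference on the XPU outside of LoadGen (a sketch: the checkpoint path below is a placeholder for wherever make setup put the model, and the 1x1x128x128x128 input matches the 128x128x128 ROI shape the benchmark slides over each case):
import torch
import intel_extension_for_pytorch  # noqa: F401  (enables the "xpu" device)

device = torch.device("xpu")
# Placeholder path: point this at the TorchScript checkpoint fetched by `make setup`.
model = torch.jit.load("build/model/3dunet_kits19.ptc", map_location=device)
model.eval()

# One dummy sub-volume with the benchmark's 128x128x128 ROI shape.
x = torch.randn(1, 1, 128, 128, 128, device=device)
with torch.no_grad():
    y = model(x)
print("output shape:", tuple(y.to("cpu").shape))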
6. Lower the minimum offline query count in build/mlperf.conf:
$ nano build/mlperf.conf
Around line 64, set:
*.Offline.min_query_count = 30
7. Run the benchmark:
$ mamba activate pytorch-arc
$ make run_pytorch_performance
In a second terminal, monitor GPU utilization with:
$ sudo intel_gpu_top
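To help narrow down whether the fence timeouts only show up under sustained back-to-back work (which is what the Offline scenario issues), the same dummy inference can be looped while dmesg is watched in another terminal (a sketch with the same placeholder checkpoint path and ROI-shape assumption as above):
import time
import torch
import intel_extension_for_pytorch  # noqa: F401

device = torch.device("xpu")
model = torch.jit.load("build/model/3dunet_kits19.ptc",  # placeholder path, as above
                       map_location=device)
model.eval()

x = torch.randn(1, 1, 128, 128, 128, device=device)
with torch.no_grad():
    for i in range(30):                      # roughly the lowered min_query_count
        t0 = time.perf_counter()
        y = model(x).to("cpu")               # copying back forces the GPU work to finish
        print(f"inference {i}: {time.perf_counter() - t0:.2f} s")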
Hello JamesKuo,
I appreciate your engagement with our community.
To delve into the issues you're encountering with the Intel Arc A350M Graphics, could you please specify your computer's make and model? Additionally, it would be immensely helpful if you could share the Intel® System Support Utility Logs from your system. These logs are crucial for us to thoroughly assess your system's setup. If you're comfortable doing so, please attach the logs to your response in this thread.
Best regards,
Norman S.
Intel Customer Support Engineer
Hello JamesKuo,
I wanted to check if you had the chance to review the questions I posted. Please let me know at your earliest convenience so that we can determine the best course of action to resolve this matter.
Best regards,
Norman S.
Intel Customer Support Engineer
Hello JamesKuo,
I have not heard back from you, so I will close this inquiry now. If you need further assistance, please submit a new question, as this thread will no longer be monitored.
Best regards,
Norman S.
Intel Customer Support Engineer