Intel® Tiber Developer Cloud
Help connecting to or getting started on Intel® Tiber Developer Cloud

Multi-XPU support?

aza
Novice
1,302 Views

Hello, 

Is there any guide online on how to use multiple XPUs for model training/fine-tuning?

I tried Accelerate and PyTorch DataParallel, but neither worked.

 

1 Solution
Wan_Intel
Moderator
1,081 Views

Hi Aza,

Thanks for your patience. We've received feedback from the relevant team.

 

Unfortunately, using all of the XPUs from Jupyter Lab is not possible, because each user is only given access to one shared GPU.

 

Sorry for the inconvenience and thank you for your support.

 

 

Regards,

Wan


7 Replies
Wan_Intel
Moderator
1,264 Views

Hi Aza,

Thanks for reaching out to us.

 

You may refer to the guide "PyTorch 2.4 on Intel® GPUs" to learn the basics of using PyTorch 2.4 with Intel GPUs. It is designed for developers, data scientists, and AI enthusiasts who want to leverage the capabilities of PyTorch on Intel GPUs (XPUs).
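
As a rough illustration of the single-device basics, the sketch below checks how many XPUs PyTorch can see in the current process and picks a device name accordingly. It is a minimal sketch, not taken from the guide: the helper name `visible_xpus` is hypothetical, it assumes a PyTorch build that exposes the `torch.xpu` module (2.4 or later), and it falls back to the CPU when PyTorch or XPU support is missing.

```python
def visible_xpus():
    """Return (device_name, xpu_count) as seen by PyTorch in this process."""
    try:
        import torch
    except ImportError:
        return "cpu", 0  # PyTorch is not installed in this environment
    # Assumption: torch.xpu is only present in builds with Intel GPU support.
    xpu = getattr(torch, "xpu", None)
    count = xpu.device_count() if xpu is not None else 0
    return ("xpu" if count > 0 else "cpu"), count


device, count = visible_xpus()
print(f"device={device}, visible XPUs={count}")
```

The returned name can be passed to `model.to(device)` and `tensor.to(device)` in the usual way. The count reflects what this process is allowed to use, which may be lower than the number of cards installed in the machine.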

 

 

Regards,

Wan


aza
Novice
1,220 Views

That guide only covers single-XPU usage.

I want one for multiple XPUs.

As you can see, there are 8 XPUs available in the environment.

@idc-training-gpu-compute-22:~$ xpu-smi discovery
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information |
+-----------+--------------------------------------------------------------------------------------+
| 0 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-000f-0000-002f0bda8086 |
| | PCI BDF Address: 0000:0f:00.0 |
| | DRM Device: /dev/dri/card1 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 1 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-0016-0000-002f0bda8086 |
| | PCI BDF Address: 0000:16:00.0 |
| | DRM Device: /dev/dri/card2 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 2 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-001a-0000-002f0bda8086 |
| | PCI BDF Address: 0000:1a:00.0 |
| | DRM Device: /dev/dri/card3 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 3 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-001e-0000-002f0bda8086 |
| | PCI BDF Address: 0000:1e:00.0 |
| | DRM Device: /dev/dri/card4 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 4 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-008a-0000-002f0bda8086 |
| | PCI BDF Address: 0000:8a:00.0 |
| | DRM Device: /dev/dri/card5 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 5 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-008e-0000-002f0bda8086 |
| | PCI BDF Address: 0000:8e:00.0 |
| | DRM Device: /dev/dri/card6 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 6 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-00c0-0000-002f0bda8086 |
| | PCI BDF Address: 0000:c0:00.0 |
| | DRM Device: /dev/dri/card7 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
| 7 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | Vendor Name: Intel(R) Corporation |
| | SOC UUID: 00000000-0000-00c4-0000-002f0bda8086 |
| | PCI BDF Address: 0000:c4:00.0 |
| | DRM Device: /dev/dri/card8 |
| | Function Type: physical |
+-----------+--------------------------------------------------------------------------------------+
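
For anyone who wants to work with this listing from a script, here is a small sketch that parses `xpu-smi discovery` text output into a dictionary keyed by device ID. The function name `parse_discovery` and the shortened `SAMPLE` are hypothetical and only for illustration; the parser assumes the two-column table layout shown above.

```python
SAMPLE = """\
+-----------+--------------------------------------------------+
| Device ID | Device Information |
+-----------+--------------------------------------------------+
| 0 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | PCI BDF Address: 0000:0f:00.0 |
| | DRM Device: /dev/dri/card1 |
+-----------+--------------------------------------------------+
| 1 | Device Name: Intel(R) Data Center GPU Max 1100 |
| | PCI BDF Address: 0000:16:00.0 |
| | DRM Device: /dev/dri/card2 |
+-----------+--------------------------------------------------+
"""


def parse_discovery(text):
    """Map each device ID to its "key: value" fields."""
    devices = {}
    current = None
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # border rows and blank lines have no two-cell layout
        dev_id, info = cells
        if dev_id.isdigit():
            # A numeric first cell starts a new device entry.
            current = int(dev_id)
            devices[current] = {}
        if current is None or ": " not in info:
            continue  # header row, or a cell without a field
        key, value = info.split(": ", 1)
        devices[current][key] = value
    return devices


devices = parse_discovery(SAMPLE)
print(len(devices), "devices listed")
```

On a real node the text could come from `subprocess.run(["xpu-smi", "discovery"], capture_output=True, text=True).stdout`. Note that a card appearing in this listing does not by itself mean a given user session is allowed to use it.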

Wan_Intel
Moderator
1,184 Views

Hi Aza,

Thanks for the information.

 

Let me check with the relevant team, and I'll update you as soon as possible.

 

 

Regards,

Wan


aza
Novice
1,046 Views

By the way, is there any guide on using multiple XPUs, just for learning purposes?

Wan_Intel
Moderator
855 Views

Hi Aza,

Thanks for your question.


If you need additional information from Intel, please submit a new question as this thread will no longer be monitored.



Regards,

Wan

