Hello,
Is there a guide online on how to use multiple XPUs for model training/fine-tuning?
I tried Accelerate and PyTorch DataParallel, but neither worked.
- Tags:
- Multi-XPU support
Hi Aza,
Thanks for reaching out to us.
You may refer to PyTorch 2.4 on Intel® GPUs to learn the basics of using PyTorch 2.4 with Intel GPUs. The guide is aimed at developers, data scientists, and AI enthusiasts who want to leverage the capabilities of PyTorch on Intel GPUs (XPUs).
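For reference, single-device use mostly comes down to selecting the `xpu` device when PyTorch can see one. Here is a minimal sketch, assuming PyTorch 2.4 or later built with XPU support; the helper names `pick_device` and `demo` are illustrative and not taken from the guide.

```python
def pick_device():
    """Return "xpu" if PyTorch can see an Intel GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"


def demo():
    """Run one forward pass of a toy layer on the selected device."""
    import torch

    device = pick_device()
    model = torch.nn.Linear(8, 2).to(device)  # toy model, for illustration only
    x = torch.randn(4, 8, device=device)      # random stand-in input
    return model(x).shape                     # torch.Size([4, 2])
```

Calling `demo()` on a machine without an XPU simply runs on the CPU, which makes it easy to check the code path before moving to the GPU.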
Regards,
Wan
That guide only covers single-XPU usage.
I want one for multiple XPUs.
As you can see, there are 8 XPUs available in the environment:
@idc-training-gpu-compute-22:~$ xpu-smi discovery
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-000f-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:0f:00.0                                                        |
|           | DRM Device: /dev/dri/card1                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 1         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0016-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:16:00.0                                                        |
|           | DRM Device: /dev/dri/card2                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 2         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-001a-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:1a:00.0                                                        |
|           | DRM Device: /dev/dri/card3                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 3         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-001e-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:1e:00.0                                                        |
|           | DRM Device: /dev/dri/card4                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 4         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-008a-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:8a:00.0                                                        |
|           | DRM Device: /dev/dri/card5                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 5         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-008e-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:8e:00.0                                                        |
|           | DRM Device: /dev/dri/card6                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 6         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-00c0-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:c0:00.0                                                        |
|           | DRM Device: /dev/dri/card7                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
| 7         | Device Name: Intel(R) Data Center GPU Max 1100                                       |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-00c4-0000-002f0bda8086                                       |
|           | PCI BDF Address: 0000:c4:00.0                                                        |
|           | DRM Device: /dev/dri/card8                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
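If it helps to count the devices from a script rather than by eye, the table above can be parsed with a few lines of Python. This is only a sketch: `parse_discovery` and `SAMPLE` are made-up names, the sample is a two-device excerpt of the output above, and the parser assumes the two-column layout shown here.

```python
SAMPLE = """\
+-----------+-------------------------------------------------+
| Device ID | Device Information                              |
+-----------+-------------------------------------------------+
| 0         | Device Name: Intel(R) Data Center GPU Max 1100  |
|           | PCI BDF Address: 0000:0f:00.0                   |
+-----------+-------------------------------------------------+
| 1         | Device Name: Intel(R) Data Center GPU Max 1100  |
|           | PCI BDF Address: 0000:16:00.0                   |
+-----------+-------------------------------------------------+
"""


def parse_discovery(text):
    """Turn `xpu-smi discovery` table text into a list of per-device dicts."""
    devices = []
    for line in text.splitlines():
        # Split a row such as "| 0 | Device Name: ... |" into its two cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or ":" not in cells[1]:
            continue  # border lines and the header row carry no "key: value"
        dev_id, info = cells
        key, value = info.split(":", 1)
        if dev_id.isdigit():
            devices.append({"id": int(dev_id)})  # a numeric ID starts a new device
        if devices:
            devices[-1][key.strip()] = value.strip()
    return devices


devices = parse_discovery(SAMPLE)
print(len(devices), "device(s) listed")
```

Comparing this count with what `torch.xpu.device_count()` reports inside the notebook would show whether PyTorch can actually reach every device that `xpu-smi` lists.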
Hi Aza,
Thanks for the information.
Let me check with the relevant team, and I'll update you as soon as possible.
Regards,
Wan
Hi Aza,
Thanks for your patience. We've received feedback from the relevant team.
Unfortunately, using all of the XPUs in JupyterLab is not possible, because each user is given access to only one shared GPU.
Sorry for the inconvenience and thank you for your support.
Regards,
Wan
By the way, is there any guide to using multiple XPUs, just for learning purposes?
Hi Aza,
You may use DataParallel with Intel® Extension for PyTorch.
For more information, please refer to Multi-GPU AI Training (Data-Parallel) with Intel® Extension for PyTorch* | Intel Software.
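As a rough illustration of what data-parallel training on several XPUs can look like, here is a minimal sketch. It assumes one process per XPU, PyTorch's DistributedDataParallel, and the oneCCL bindings package (`oneccl_bindings_for_pytorch`) providing the `ccl` backend alongside Intel® Extension for PyTorch. Whether the linked article uses exactly this setup is an assumption on my part, and `launcher_env` and `train` are hypothetical names.

```python
import os


def launcher_env(env=None):
    """Read the rank variables that a launcher such as torchrun exports."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


def train(steps=5):
    # Heavy imports live inside the function so this file loads without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    import intel_extension_for_pytorch  # noqa: F401  (assumed to be installed)
    import oneccl_bindings_for_pytorch  # noqa: F401  (assumed; registers "ccl")

    info = launcher_env()
    # Assumes the launcher also sets MASTER_ADDR and MASTER_PORT.
    dist.init_process_group(
        backend="ccl", rank=info["rank"], world_size=info["world_size"]
    )

    device = torch.device(f"xpu:{info['local_rank']}")  # one process per XPU
    torch.xpu.set_device(device)

    model = DDP(torch.nn.Linear(16, 1).to(device))  # toy model, illustration only
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(steps):
        x = torch.randn(32, 16, device=device)  # random stand-in data
        y = torch.randn(32, 1, device=device)
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()  # DDP averages gradients across processes here
        optimizer.step()

    dist.destroy_process_group()
```

With a `train()` call added at the bottom of a script, a launcher that sets `RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR` and `MASTER_PORT` (torchrun does; with mpirun you would need to map its variables yourself) would start one such process per device. This only helps on a machine where your account can actually see more than one XPU.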
Regards,
Wan
Hi Aza,
Thanks for your question.
If you need additional information from Intel, please submit a new question as this thread will no longer be monitored.
Regards,
Wan