Hi,
I am developing a multi-device code with DPC++ and MPI, and I use USM with shared memory. In my scaling tests, the multi-device performance is worse than the single-device performance. I believe the problem size is large enough that multiple devices should perform better. Does anyone have any advice?
By the way, is there a way to confirm that I am actually using 16 GPUs when I launch the program with "mpirun -np 16 ./main"? I print the device name from each rank, but all the names are identical, presumably because the GPUs are the same model.
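A note on the device question: identical GPUs report the same name, so the name alone cannot show whether 16 distinct devices are in use. One common approach is to have each rank pick a device by its node-local rank and print the host name together with the chosen index. The sketch below shows only the index arithmetic, in plain C++. The helper names `device_index_for_rank` and `max_ranks_per_device` are hypothetical, and it assumes the local rank comes from a communicator created with `MPI_Comm_split_type` using `MPI_COMM_TYPE_SHARED`, and the device count from the size of `sycl::device::get_devices(sycl::info::device_type::gpu)`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): choose a GPU index for a rank.
// local_rank  : rank within the node (assumed to come from a shared-memory communicator)
// num_devices : number of GPUs the runtime reports on this node (assumed > 0)
// Ranks are assigned round-robin, so rank r uses device r % num_devices.
std::size_t device_index_for_rank(int local_rank, std::size_t num_devices) {
    assert(local_rank >= 0 && num_devices > 0);
    return static_cast<std::size_t>(local_rank) % num_devices;
}

// Hypothetical helper: the largest number of ranks that share one GPU under the
// round-robin mapping above. A result greater than 1 means the node is oversubscribed.
std::size_t max_ranks_per_device(int ranks_on_node, std::size_t num_devices) {
    assert(ranks_on_node >= 0 && num_devices > 0);
    const std::size_t ranks = static_cast<std::size_t>(ranks_on_node);
    return (ranks + num_devices - 1) / num_devices;  // ceiling division
}
```

For example, if all 16 ranks land on a node that exposes 8 GPUs, `max_ranks_per_device(16, 8)` gives 2, meaning two ranks share each device. If every rank instead constructs its queue with the default selector, all ranks on a node may end up on the same GPU, which could by itself explain poor multi-device scaling.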
Thanks,
Chunheng.
Hi,
Thank you for posting in Intel Communities.
Could you please provide us with the following details?
- The operating system you are using.
- Intel MPI Library & DPC++ versions you are using.
- A sample reproducer code and steps to reproduce your issue from our end. (commands to compile & run the code on multi-devices)
- Name of GPU you are using & Environment details of your cluster.
>>"I find that the multi-device performance is worse than the single device performance."
Could you please let us know how you are measuring the performance?
Thanks & Regards,
Santosh
I have also attached my makefile here.
Chunheng.
I attach my code below.
I run my code on ThetaGPU.
The system information is: #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021
The GPU I use is: Selected device: NVIDIA A100-SXM4-40GB
I am not sure which DPC++ or oneAPI version is installed, but it is a build for Ubuntu 18.04.
I measure performance in mega lattice updates per second (MLUPS): I run the solver 100 times and take the average time.
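For reference, the metric described above can be written out as follows. This is a minimal sketch in plain C++, not the code from the attachment: the function names `mlups` and `parallel_efficiency` and all parameters are hypothetical. It assumes the elapsed time is taken only after the device work has finished (for example after `queue.wait()`), and that in a multi-rank run the slowest rank's time is used.

```cpp
#include <cassert>
#include <cstdint>

// Mega lattice updates per second.
// lattice_sites   : total number of lattice sites in the global domain
// updates         : number of solver iterations that were timed (e.g. 100)
// elapsed_seconds : wall-clock time for all timed iterations (assumed > 0)
double mlups(std::uint64_t lattice_sites, std::uint64_t updates, double elapsed_seconds) {
    assert(elapsed_seconds > 0.0);
    return static_cast<double>(lattice_sites) * static_cast<double>(updates)
           / (elapsed_seconds * 1.0e6);
}

// Strong-scaling efficiency: measured multi-device throughput divided by the
// ideal throughput (single-device throughput times the number of devices).
double parallel_efficiency(double mlups_single, double mlups_multi, int num_devices) {
    assert(mlups_single > 0.0 && num_devices > 0);
    return mlups_multi / (mlups_single * static_cast<double>(num_devices));
}
```

As an illustration with made-up numbers, one million sites updated 100 times in 2 seconds gives 50 MLUPS. An efficiency well below 1 for the same global problem size suggests that communication, host-device transfers, or device oversubscription dominates the run time.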
Chunheng.
Hi Chunheng,
Thank you for your inquiry. We offer support for hardware platforms that the Intel® oneAPI product supports. These platforms include those that are part of the Intel® Core™ processor family or higher, the Intel® Xeon® processor family, the Intel® Xeon® Scalable processor family, and others, which are listed in the Intel® oneAPI Base Toolkit System Requirements, the Intel® oneAPI HPC Toolkit System Requirements, and the Intel® oneAPI IoT Toolkit System Requirements.
If you wish to use oneAPI on hardware that is not listed at one of the sites above, we encourage you to visit and contribute to the open oneAPI specification - https://www.oneapi.io/spec/
Best regards,
Jyotsna
Hi,
We are closing this issue. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.
Thanks,
Santosh