Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Can GPUs be used to speed up MPI computations?

Anders_S_1
New Contributor III

Hi,

I am running FP-heavy computations using Fortran and MPI under Windows. It is not possible to use OpenMP. Is it possible to offload FP calculations to a GPU to get a speedup? Do GPUs offer double-precision calculations?

Best regards

Anders S

5 Replies
Barbara_P_Intel
Moderator

For a Fortran application to offload to an Intel GPU, OpenMP directives are required. MPI optimizations around offload concern managing which device a given rank offloads to.

The availability of double-precision FP calculations varies from one Intel GPU to another. For example, the GPUs targeted at gaming have single-precision FP only.
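
To illustrate what "managing where a given rank offloads" can look like, here is a minimal sketch that assigns ranks to GPUs round-robin. Round-robin is only one common choice and an assumption here, not something the reply prescribes. The helper `pick_device` and the fixed `nranks = 4` are hypothetical; a real MPI code would take its rank from `MPI_Comm_rank` (ideally a node-local rank, e.g. from a communicator created with `MPI_Comm_split_type`) and then call `omp_set_default_device(dev)`. The `!$` lines are only compiled when OpenMP is enabled, so the sketch also builds without it.

```fortran
! Sketch: choose an offload device for each MPI rank (round-robin).
! MPI itself is left out so the example stays self-contained; the loop
! over "rank" stands in for the ranks of a real MPI job.
program rank_to_device
  !$ use omp_lib
  implicit none
  integer :: rank, nranks, ndev, dev

  nranks = 4          ! hypothetical rank count (MPI_Comm_size in real code)
  ndev = 0            ! stays 0 when OpenMP is disabled
  !$ ndev = omp_get_num_devices()

  do rank = 0, nranks - 1
     dev = pick_device(rank, ndev)
     ! In a real code, each rank would do this once at startup:
     !   if (dev >= 0) call omp_set_default_device(dev)
     print '(a,i0,a,i0)', "rank ", rank, " -> device ", dev
  end do

contains

  ! Returns a device number in 0..ndev-1, or -1 if no device is available.
  pure integer function pick_device(rank, ndev) result(dev)
    integer, intent(in) :: rank, ndev
    if (ndev > 0) then
       dev = mod(rank, ndev)
    else
       dev = -1
    end if
  end function pick_device

end program rank_to_device
```

When several ranks share one GPU they compete for it, so the number of ranks per device is worth tuning for the actual workload.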

 

Anders_S_1
New Contributor III

Hi Barbara,

Thanks for your swift reply!

Which Intel GPUs offer double precision?

If I understood you right, work can be offloaded from each MPI rank using OpenMP directives. Is there a rule of thumb for when offloading to a GPU will result in a speedup, as a function of the size of the MPI task?

Is there any sample code or example of offloading a double-precision workload to a GPU?

Is it possible to evaluate a simple example in a "sandbox"?

Best regards

Anders S

Barbara_P_Intel
Moderator

Here's some info to help you get started.

Three Quick, Practical Examples of OpenMP Offload to GPUs (video)

https://www.intel.com/content/www/us/en/developer/videos/three-quick-practical-examples-openmp-offload-gpus.html

Run HPC Applications on CPUs & GPUs with Xe Architecture Using Intel® C++ & Intel® Fortran Compilers with OpenMP* (video)


Basic understanding of GPU architecture

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top/xe-arch.html

oneAPI GPU Optimization Guide

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top.html

 

As for a "sandbox", you can use Intel DevCloud for oneAPI.
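
For the question about a double-precision example, a minimal sketch of the kind of loop the resources above cover might look like the following. It is an illustration under stated assumptions, not code taken from the linked material: the program name, array size, and values are made up, and it presumes a compiler with OpenMP offload support and a GPU that provides double precision. Without OpenMP enabled, the directives are treated as comments and the loop runs on the host, which makes it easy to compare results.

```fortran
! Sketch: offload a double-precision "y = a*x + y" loop with OpenMP.
program offload_daxpy
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 1000000      ! illustrative problem size
  real(real64), allocatable :: x(:), y(:)
  real(real64) :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0_real64
  x = 1.0_real64
  y = 3.0_real64

  ! map(to:) copies x to the device; map(tofrom:) copies y to the device
  ! and back to the host when the region ends.
  !$omp target teams distribute parallel do map(to: x) map(tofrom: y)
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do
  !$omp end target teams distribute parallel do

  ! Every element should now equal 2*1 + 3 = 5.
  if (any(abs(y - 5.0_real64) > 1.0e-12_real64)) then
     error stop "unexpected result"
  end if
  print *, "max error:", maxval(abs(y - 5.0_real64))
end program offload_daxpy
```

The options that enable offload are compiler-specific, so check the compiler documentation for the exact flags. Because `x` and `y` are copied between host and device on every pass through the region, a loop with this little arithmetic per element is unlikely to show a speedup by itself; offload tends to pay off when much more work is done per byte transferred.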

 

Anders_S_1
New Contributor III

Hi Barbara,

As for the sandbox, will it be possible to evaluate offload to the MAX 1100 GPU (or similar) in the near future?

Best regards

Anders S

Barbara_P_Intel
Moderator

I don't know the roadmap for DevCloud. There is a Community Forum for DevCloud questions.
