oneMKL performance vs CUDA LAPACK

Intel® oneAPI Math Kernel Library




Hi all,

We have a simulation code written in C++ that solves the Maxwell integral equations. MKL has served us well for several years, and we now want to compare its performance against offloading the calculations to GPUs. Our machines have NVIDIA GPUs, and we have implemented the matrix factorization and solve steps there, i.e. zgetrf + zgetrs, since our data are double-precision complex. Our measurements show that MKL on 32 CPU cores is much faster than 2 NVIDIA Tesla T4 cards, each of which has 2560 GPU cores. This is surprising, so first I wanted to ask whether anyone else has seen the same thing. Second, would we see the same result with Intel GPUs?

On the GPU side, we use cuSOLVER, NVIDIA's native CUDA implementation of LAPACK routines.
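To make the workload concrete, here is a minimal sketch of what the zgetrf + zgetrs pair computes: an LU factorization with partial pivoting, followed by a solve that reuses the factors. It is plain, unblocked C++ meant only for illustration, not the MKL or cuSOLVER implementation and not how our production code is written. The names `lu_factor` and `lu_solve` are hypothetical helpers; the column-major layout and the sequential pivot convention are assumed to mirror LAPACK.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: in-place LU factorization with partial pivoting of an
// n x n column-major matrix, the job zgetrf does (P*A = L*U).
// Returns false if an exactly zero pivot is found (singular matrix).
bool lu_factor(std::vector<cplx>& a, std::size_t n, std::vector<std::size_t>& piv) {
    assert(a.size() == n * n);
    piv.assign(n, 0);
    for (std::size_t k = 0; k < n; ++k) {
        // Pick the row with the largest magnitude in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::abs(a[i + k * n]) > std::abs(a[p + k * n])) p = i;
        piv[k] = p;
        if (a[p + k * n] == cplx(0.0, 0.0)) return false;
        if (p != k)
            for (std::size_t j = 0; j < n; ++j) std::swap(a[k + j * n], a[p + j * n]);
        // Store the multipliers (L) below the diagonal, then update the trailing block.
        for (std::size_t i = k + 1; i < n; ++i) a[i + k * n] /= a[k + k * n];
        for (std::size_t j = k + 1; j < n; ++j)
            for (std::size_t i = k + 1; i < n; ++i)
                a[i + j * n] -= a[i + k * n] * a[k + j * n];
    }
    return true;
}

// Hypothetical helper: solve A*x = b from the stored factors, the job zgetrs
// does for a single right-hand side. b is overwritten with x.
void lu_solve(const std::vector<cplx>& lu, std::size_t n,
              const std::vector<std::size_t>& piv, std::vector<cplx>& b) {
    assert(lu.size() == n * n && piv.size() == n && b.size() == n);
    // Apply the row swaps recorded during factorization (P*b).
    for (std::size_t k = 0; k < n; ++k) std::swap(b[k], b[piv[k]]);
    // Forward substitution with the unit lower triangle: L*y = P*b.
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = j + 1; i < n; ++i) b[i] -= lu[i + j * n] * b[j];
    // Back substitution with the upper triangle: U*x = y.
    for (std::size_t j = n; j-- > 0;) {
        b[j] /= lu[j + j * n];
        for (std::size_t i = 0; i < j; ++i) b[i] -= lu[i + j * n] * b[j];
    }
}
```

The factorization is the cubic-cost step and the solve is quadratic per right-hand side, so the timing comparison is dominated by the zgetrf part, all of it in double-precision complex arithmetic.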

Regards,

Dan


