Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Performance regressions ICX vs ICL (Classic)

AndrewC
New Contributor III
1,146 Views

We are seeing a 20-25% drop in runtime performance on our benchmarks with ICX (latest 2022 release) versus ICL (Classic 19.2). This is a heavily floating-point-intensive CAE code base running on Intel workstations. We use MKL extensively, so the performance regression is surprising. We will not be moving to ICX until we can get at least comparable runtimes.

It seems that the optimizations ICL makes at /O2 are still ahead of those in the Clang-based compiler. Are there any tips for getting better runtime performance out of Clang?

6 Replies
VarshaS_Intel
Moderator
1,126 Views

Hi,


Thanks for posting in Intel Communities.


Could you please provide us with the OS details and sample reproducer code, along with the steps to reproduce your issue?


Also, could you please confirm whether you are using the latest oneAPI Toolkit (2022.2)?


Thanks & Regards,

Varsha


AndrewC
New Contributor III
1,093 Views

I am using the latest kit (2022.2). I can't provide a simple benchmark; these results come from rebuilding a large C++ code base. It is floating-point intensive, multi-threaded (using OpenMP), and uses MKL for vector and matrix operations. With the identical code base compiled with the latest ICL and the latest ICX, we see performance regressions of 0-20% with ICX.
This is not surprising to me. ICL was developed over many, many years with one goal in mind (I have been using it since version 6.0): high-performance computing. Clang was developed as an open-source, flexible, extensible, 'standards-supporting' C++ compiler framework; HPC was not its focus.

Mentzer__Stuart
1,108 Views

I can echo Andrew's observation: I see a similar performance drop in an OpenMP modeling application in the ICX build on Windows.

I second the vote for not reducing ICC/ICL support until ICX reaches performance parity!

AndrewC
New Contributor III
1,055 Views

It seems the problem could be the handling of MKL_DIRECT_CALL. We use small matrices in some places and MKL_DIRECT_CALL is a big win. It appears MKL_DIRECT_CALL is disabled for ICX as __INTEL_COMPILER is not defined.

AndrewC
New Contributor III
1,043 Views

I found this page very helpful

https://www.intel.com/content/www/us/en/developer/articles/guide/porting-guide-for-icc-users-to-dpcpp-or-icx.html

In particular, the use of -fiopenmp and setting a processor target (e.g. AVX) to get optimizations.
With these changes:

  • Hack mkl_direct.h to allow use of the ICX compiler
  • Use -fiopenmp
  • Set AVX instruction set optimizations

I was able to get performance comparable to or better than ICL.
VarshaS_Intel
Moderator
1,025 Views

Hi,


Glad to know that your issue is resolved. Thanks for sharing the solution with us. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Thanks & Regards,

Varsha

