Eugene_G_
Beginner
201 Views

KNC to KNL - 2x Slower Performance - Same Code

We have an application that's currently running great in native mode on the KNC platform.

We now have a KNL system for R&D and have recompiled our native KNC application for the KNL platform. When testing this unmodified codebase, we're seeing a 2x performance degradation on KNL. The KNL is set up in Quadrant cluster mode and Cache memory mode.

Our application is not memory-bandwidth hungry, and we've tested many different OMP_NUM_THREADS configurations to no avail.
The main loop of the application uses OpenMP with a single critical section at the end. However, this runs very fast natively on the KNC platform.
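
For readers following along, here is a minimal sketch of the loop shape described above: an OpenMP parallel loop where each thread merges its result inside one critical section at the end. This is not our actual code. The function names and the `work` body are hypothetical placeholders. A `reduction` variant is included for comparison, since it avoids the explicit critical section.

```cpp
#include <cstddef>

// Hypothetical stand-in for the per-iteration work; not the real application code.
static double work(long i) {
    return static_cast<double>(i) * 0.5;
}

// Each thread accumulates into a private variable, then merges
// into the shared total inside a single critical section at the end.
double sum_with_critical(long n) {
    double total = 0.0;
    #pragma omp parallel
    {
        double local = 0.0;
        #pragma omp for nowait
        for (long i = 0; i < n; ++i) {
            local += work(i);
        }
        #pragma omp critical
        total += local;
    }
    return total;
}

// Same computation expressed with an OpenMP reduction clause,
// which lets the runtime combine the per-thread partial sums.
double sum_with_reduction(long n) {
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += work(i);
    }
    return total;
}
```

With the critical section entered only once per thread, its cost should stay small relative to the loop body, though that is an assumption worth confirming with a profiler on each platform.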

(Intel Compiler - icpc)
KNC Compiler flags = -O3 -std=c++11 -openmp -mmic
KNL Compiler flags = -O3 -std=c++11 -qopenmp -xMIC-AVX512 -fma -align -finline-functions

We've run the standard tests and we know we can do a better job vectorizing loops, but we were expecting better performance out of the box from an application that already runs great on KNC.
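
As an illustration of the kind of loop where vectorization matters, here is a minimal sketch with an explicit `#pragma omp simd` hint. The function `scaled_add` is a hypothetical example and not taken from our codebase. Whether the compiler actually vectorizes a given loop can be checked with the Intel compiler's optimization report (for example `-qopt-report=5 -qopt-report-phase=vec`).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example loop: y[i] += a * x[i].
// The simd pragma asks the compiler to vectorize the loop; raw pointers
// are used so the loop body is a plain array access.
void scaled_add(std::vector<float>& y, const std::vector<float>& x, float a) {
    const std::size_t n = y.size() < x.size() ? y.size() : x.size();
    float* yp = y.data();
    const float* xp = x.data();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] += a * xp[i];
    }
}
```

The pragma is only a hint under stated assumptions (no overlap between `x` and `y`). It does not by itself explain or fix the slowdown.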

What could be causing this? Is it a pure vectorization issue?

21 Replies
Kris__Kris
Beginner
18 Views

It may be a vectorization issue.
