KNC to KNL - 2x Slower Performance - Same Code

Eugene_G_

We have an application that's currently running great in native mode on the KNC platform.

We now have a KNL system for R&D and have recompiled our native KNC application for the KNL platform. When testing this unmodified codebase, we see a 2x performance degradation on KNL. The KNL is set up in Quadrant cluster mode, with memory in Cache mode.

Our application is not memory-bandwidth hungry, and we've tested many different OMP_NUM_THREADS settings to no avail.
The main loop of the application uses OpenMP with a single critical section at the end. However, this runs very fast natively on the KNC platform.
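
For readers trying to picture the pattern, here is a minimal sketch of one common shape for "an OpenMP loop with a single critical section at the end". It is not the application's actual code, which is not shown in this thread. The `histogram` function, its workload, and its parameter names are all hypothetical. Each thread accumulates into a private buffer and enters the critical section only once, so the serialized part grows with the number of threads rather than with the number of loop iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical workload (an assumption, not the poster's code):
// count how many input values fall into each of `bins` buckets.
std::vector<long> histogram(const std::vector<int>& data, int bins) {
    assert(bins > 0);
    std::vector<long> total(bins, 0);

    #pragma omp parallel
    {
        // Private per-thread buffer, so the hot loop needs no locking.
        std::vector<long> local(bins, 0);

        #pragma omp for nowait
        for (long i = 0; i < static_cast<long>(data.size()); ++i) {
            int v = data[i] % bins;
            if (v < 0) v += bins;  // keep the bucket index non-negative
            ++local[v];
        }

        // The single critical section at the end: each thread merges once.
        #pragma omp critical
        {
            for (int b = 0; b < bins; ++b) total[b] += local[b];
        }
    }
    return total;
}
```

If the real critical section sits inside the loop body instead, or does more work than a short merge, its cost could look very different on the two platforms, so timing that region separately may be worth a try.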

(Intel Compiler - icpc)
KNC Compiler flags = -O3 -std=c++11 -openmp -mmic
KNL Compiler flags = -O3 -std=c++11 -qopenmp -xMIC-AVX512 -fma -align -finline-functions

We've run the standard tests, and we know we can do a better job vectorizing loops, but we were expecting better performance out of the box from an application that already runs great on KNC.

What could be causing this? Is it a pure vectorization issue?

21 Replies
Kris__Kris

It may be a vectorization issue.
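
One way to test that guess is to isolate a hot loop and check whether the compiler vectorizes it. The sketch below is only an illustration: `dot` is a hypothetical stand-in kernel, not code from the application. It marks a reduction loop with `#pragma omp simd`, which is honored when OpenMP is enabled (for example with `-qopenmp`) and ignored otherwise. The Intel compiler's `-qopt-report` option can then be used to see whether the loop was actually vectorized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a hot inner loop: a dot product.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const double* pa = a.data();
    const double* pb = b.data();
    const std::size_t n = a.size();

    double sum = 0.0;
    // Ask the compiler to vectorize the loop and treat `sum` as a reduction.
    #pragma omp simd reduction(+:sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += pa[i] * pb[i];
    }
    return sum;
}
```

Timing a kernel like this with and without the pragma, on both machines, would give a rough idea of how much of the gap comes from vectorization rather than from threading.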
