We have an application that's currently running great in native mode on the KNC platform.
We now have a KNL system for R&D and have recompiled our native KNC application for the KNL platform. When testing this unmodified codebase, we're noticing a 2x performance degradation on KNL. KNL is setup in Quadrant cluster mode and Cache mode for memory.
Our application is not memory bandwidth hungry and we've tested many different OMP_NUM_THREAD configurations to no avail.
The main loop of the application is using OMP with a single critical section at the end. However, this runs very fast natively on the KNC platform.