Do you have any advice for users who want to write and compile their own specialized C/C++ extensions? There is a class of user interested in taking advantage of KNL's threading and vectorization features in such extensions. Will this be relatively straightforward with the Intel Distribution for Python? I anticipate that VTune's support for Python will be a big help here too.
We don't have any specific advice. Our experience is that everything works out of the box with KNL. You just want to pay attention to the compiler switches so that you generate code that takes advantage of AVX-512. If you run into problems, please let us know.
Thanks for the response. The compiler switches are exactly what we're worried about. If the Python interpreter were compiled up front using icc with all the right flags in place, my impression is that compiling C extensions later on would be much easier. But if that is not the plan, it would be most helpful to have some documentation on getting the compiler switches right. I assume the IDP team knows what those should be; if they could document them, it would be of great benefit to us.
Thanks again, really looking forward to the final release.
Yes, we need a simple document for people building native extensions and will work on one.
For the latest release, we used icc with -xSSE4.2 -axCORE-AVX2. We need to support all architectures and did not have time to measure the effect of adding an AVX-512 tuning as recommended here:
It's something we need to do for the next release.
Dense linear algebra, FFT, and other routines rely on MKL, which has built-in tunings for all architectures and will be optimized for KNL. We have been doing benchmarking on KNL and the results look good.
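For anyone following this thread, here is a rough sketch of what "paying attention to the compiler switches" might look like in a setup.py for a native extension. This is not the official IDP build recipe. The helper names knl_flags and make_extension are hypothetical, and the flag choices are assumptions to verify against your own compiler's documentation: -xMIC-AVX512 and -qopenmp for icc, -march=knl and -fopenmp for a GCC version that supports that target. The non-KNL-only branch simply extends the -xSSE4.2 -axCORE-AVX2 baseline mentioned above with an extra MIC-AVX512 code path.

```python
# Hypothetical setup.py helpers; the names and flag choices are illustrative
# assumptions, not part of the Intel Distribution for Python.

def knl_flags(compiler="intel", knl_only=True):
    """Return (compile_args, link_args) for building a KNL-aware extension."""
    if compiler == "intel":
        if knl_only:
            # Binary targets KNL only and will not run on older CPUs.
            arch = ["-xMIC-AVX512"]
        else:
            # SSE4.2 baseline plus alternate AVX2 and KNL AVX-512 code paths.
            arch = ["-xSSE4.2", "-axCORE-AVX2,MIC-AVX512"]
        openmp = "-qopenmp"
    elif compiler == "gnu":
        # Assumes a GCC release that accepts -march=knl.
        arch = ["-march=knl"] if knl_only else []
        openmp = "-fopenmp"
    else:
        raise ValueError("unknown compiler family: %r" % (compiler,))
    # The OpenMP switch is needed at both compile time and link time.
    return ["-O3"] + arch + [openmp], [openmp]


def make_extension(name, sources, compiler="intel", knl_only=True):
    """Build a setuptools Extension carrying the flags from knl_flags()."""
    from setuptools import Extension  # imported lazily; needs setuptools

    compile_args, link_args = knl_flags(compiler, knl_only)
    return Extension(
        name,
        sources=sources,
        extra_compile_args=compile_args,
        extra_link_args=link_args,
    )
```

In a setup.py you would pass something like ext_modules=[make_extension("mymod", ["mymod.c"])] to setup(), where mymod is a placeholder name. Which compiler actually gets invoked is decided by your build environment, so check that it is the one the flags were written for.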