Actually, you can try requesting this version from this page: https://software.intel.com/en-us/performance-libraries, or submit a ticket through the Intel Online Service Center if you have a valid license.
I see MKL-2017 is available in Conda, but there are complications.
I'm building a complex set of applications whose limiting factors range from memory bandwidth to raw compute (cores * clock) to floating-point math (neural). The goal is to prove scale and limiting factors vs. cost, i.e. should we invest in one big CPU, a cluster, a GPU, or a mix?
I was pretty successful running Linux on the Phi, and learned a lot.
This was definitely not an intended use for a 1xx Phi. The test is scaled down, but it still runs out of memory quickly.
At about 47 threads it was only just catching up to one current-gen core (or 4 threads on a 2-core Atom with only 2 GB of RAM), due to the old in-order cores and slow clock. [At around 32 threads the app started needing virtual memory to complete; beyond 47 threads, overall performance decreased.]
htop screenshot: mid-run of an app on 57 threads
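For anyone wanting to repeat this kind of thread sweep, here is a rough Python sketch of how it could be scripted. It is not what I actually ran: `run` is a hypothetical callable that launches the app on `n` threads and blocks until it finishes, and the helper names are made up for illustration.

```python
import time
from typing import Callable, Dict, Iterable


def sweep(run: Callable[[int], None], thread_counts: Iterable[int]) -> Dict[int, float]:
    """Time run(n) once for each thread count and return {n: wall seconds}."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        run(n)  # hypothetical: starts the workload on n threads and waits for it
        timings[n] = time.perf_counter() - start
    return timings


def best_thread_count(timings: Dict[int, float]) -> int:
    """Return the thread count with the shortest wall time."""
    return min(timings, key=timings.get)


def speedups(timings: Dict[int, float]) -> Dict[int, float]:
    """Speedup of each run relative to the smallest thread count measured."""
    base = timings[min(timings)]
    return {n: base / t for n, t in sorted(timings.items())}
```

Calling something like `sweep(run_app, range(1, 58))` and then `best_thread_count(...)` would show where adding threads stops helping, which is where memory pressure or swapping starts to dominate. A single timing per thread count is noisy, so repeating each run a few times would be safer.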
I see this promising benchmark showing that a Phi 7250 can beat a dual 32-core monster, or a GTX 1080, at inference on AlexNet and GoogLeNet,
but a 1080 is much more affordable and available than the others...
I'm just disappointed I won't see my Phi running at full steam after all the work I put into it...