I have been using the Intel Python distribution for a couple of months. The first tests I ran with Python 3 were impressive: just by swapping the Python distribution I was getting speedups of 4x or more with Theano/TensorFlow.
Today I installed the distribution on a Xeon virtual machine with 32 cores and got the opposite surprise: the Intel distribution is 2x slower than CPython with Theano/TensorFlow. If I use TBB explicitly it comes closer to CPython, but it is still a bit slower. Given this, I went back to test on my laptop (a Core i7, where I have not changed the Intel distribution but have in the meantime updated the system CPython), and there too the Intel distribution is now slower than CPython.
I can only guess at what happened, and I cannot exactly reproduce the environment I had before on my laptop. Maybe some bad update? How can I verify what may be going wrong?
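One way to start narrowing this down would be to record, for each run, exactly which interpreter and which package builds are being picked up, so that the "fast" and "slow" environments can be compared side by side. This is only a minimal sketch using the standard library; the function name `describe_environment` is made up for illustration, and it needs Python 3.8 or later for `importlib.metadata`.

```python
import sys
from importlib import metadata, util


def describe_environment(packages=("numpy", "theano", "tensorflow")):
    """Collect the interpreter path/version and, per package, its version and location.

    Packages are located without importing them, so this stays cheap
    even when TensorFlow is installed.
    """
    info = {
        "executable": sys.executable,
        "version": sys.version.replace("\n", " "),
        "packages": {},
    }
    for name in packages:
        spec = util.find_spec(name)
        if spec is None:
            # Not importable from this interpreter at all.
            info["packages"][name] = None
            continue
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no installed distribution metadata under this name.
            version = "unknown"
        info["packages"][name] = {"version": version, "location": spec.origin}
    return info


if __name__ == "__main__":
    env = describe_environment()
    print("interpreter:", env["executable"])
    print("version:    ", env["version"])
    for name, details in env["packages"].items():
        print(f"{name}: {details}")
```

Running it once with each interpreter and diffing the output should show whether the two distributions are really importing different builds of Theano/TensorFlow, or accidentally sharing the same site-packages.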
On the laptop (Arch Linux) I managed to get it working "properly" again. The odd thing is that I updated the system packages for CPython and then also ran 'pip install --upgrade theano', which just printed messages saying that everything was up to date and did nothing. After that I again get a 4x speedup with the Intel distribution, and using TBB explicitly makes no difference.
I reinstalled all Intel packages (IPP, TBB, MKL, DAAL) on the Xeon (Debian) virtual machine and did the same for the Python distributions. Now the Intel distribution is on par with CPython, and I get a 2x speedup if I use the TBB module. Can anyone guess what is happening with my installations/machines?
Oddly enough, the performance is about the same on the virtual machine as on the laptop, but I guess that is a matter for a different topic.
For the sake of completeness, this post, referenced from the Python forum, explains how to check Theano's configuration for correctness (using MKL and pthread).
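The referenced post is not reproduced here, so the following is only a rough sketch of the kind of check it describes, under the assumption that "using MKL and pthread" means those names show up in Theano's BLAS linker flags (`theano.config.blas.ldflags`). The helper names `check_blas_flags` and `theano_blas_flags` are made up for illustration.

```python
def check_blas_flags(ldflags):
    """Report whether a linker-flag string mentions MKL and pthread."""
    flags = ldflags.lower()
    return {"mkl": "mkl" in flags, "pthread": "pthread" in flags}


def theano_blas_flags():
    """Return Theano's blas.ldflags setting, or None if Theano cannot be imported."""
    try:
        import theano
    except ImportError:
        return None
    return theano.config.blas.ldflags


if __name__ == "__main__":
    flags = theano_blas_flags()
    if flags is None:
        print("Theano is not installed for this interpreter")
    else:
        print("blas.ldflags:", flags)
        print(check_blas_flags(flags))
```

A substring match like this is deliberately crude: it only tells you whether the flags mention MKL and pthread, not whether the linked libraries are actually the ones being loaded at run time.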
With this I am now sure that I have the correct installation. I still have performance issues on the Xeon virtual machine, but I now have consistent behavior between the two machines: the Intel distribution is faster than CPython, without the use of TBB.