The vendor has composer_xe_2011_sp1.11.339, which I used for the tests. The MKL from composerxe-2011.3.174 (which I had access to) is slightly better, but not by much. According to /proc/cpuinfo these are Intel Xeon CPU E5-2660 0 @ 2.20GHz machines, 16 cores per node with IB, openmpi-1.4.5.
I have posted information on a listserver for the specific DFT code (Wien2k), since others may want to start using Intel MPI as the current version seems to be rather good.
The "HAMILT" and "HNS" parts of the code are mainly simple MPI, i.e. splitting of the effort over different machines. Both scale well with both Intel MPI and openmpi, with openmpi being perhaps slightly faster, although the difference was small enough to be noise.
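
For readers unfamiliar with what "simple MPI" splitting means here, the following is a minimal sketch of a block partition of independent work items over ranks. It is only an illustration of the general idea, not the actual distribution scheme used in Wien2k; the function name `block_range` and the item counts are hypothetical, and no MPI library is called (in a real program `rank` and `n_ranks` would come from the MPI communicator).

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open [start, stop) range of items owned by `rank`.

    Items are split as evenly as possible; the first `extra` ranks
    each take one additional item when n_items is not divisible.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Hypothetical example: 10 independent work items over 4 ranks.
ranges = [block_range(10, 4, r) for r in range(4)]
for r, (start, stop) in enumerate(ranges):
    print(f"rank {r}: items {start}..{stop - 1}")
```

Because each rank works on its own range with little communication, this kind of section would be expected to scale similarly under different MPI implementations, which is consistent with the small difference seen above.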