We had a problem with inconsistent results across some of our grid nodes, which I thought was worth sharing. After investigation we pinned this down to two different OS configurations returning different results:
- Bare-metal Windows 2008
- Virtual Windows 2008 running in KVM on RHEL
Both machines have identical hardware (Xeon E7-4870), which supports SSE4.1 and SSE4.2. At the time we were using MKL v11.1.2.
We use MKL’s CNR mode to restrict SIMD code paths to SSE3 instructions, which gives us numerical consistency across a range of hardware. What we discovered was that on the VM, the call to MKL_CBWR_Get_Auto_Branch was returning SSE3. Because our code only called ::MKL_CBWR_Set(SSE3) when the reported branch was something other than SSE3, we never made that call on the VM. As a result, calculations on that machine were actually using SSE4 instructions, and this turned out to be the source of the numerical differences we were seeing.
The only numerical differences we saw between SSE3/SSE4 emanated from BLAS, although this may be circumstantial.
Although this was easily fixed (by always calling ::MKL_CBWR_Set(SSE3) regardless of what MKL_CBWR_Get_Auto_Branch returns) it took a great deal of investigation to pinpoint the problem.
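To make the difference between the two approaches concrete, here is a minimal sketch of the conditional pattern that bit us and the unconditional fix. It is not our production code: `CnrApi`, `FakeMkl`, the function names and the `k...` constants are all hypothetical stand-ins so the logic can be run without MKL installed, and the fake simply models a machine where the reported branch and the branch actually in use disagree.

```cpp
#include <functional>

// Stand-in constants for this sketch only. With real MKL these would be
// MKL_CBWR_SSE3, MKL_CBWR_SSE4_2 and MKL_CBWR_SUCCESS from <mkl.h>.
constexpr int kBranchSSE3 = 1;
constexpr int kBranchSSE4_2 = 2;
constexpr int kSuccess = 0;

// Hypothetical wrapper around the two MKL service calls discussed above.
struct CnrApi {
    std::function<int()> get_auto_branch;  // stands in for mkl_cbwr_get_auto_branch()
    std::function<int(int)> set_branch;    // stands in for mkl_cbwr_set(int)
};

// Fragile pattern: trust the reported branch and skip the set call
// when it already says SSE3.
bool configure_if_needed(const CnrApi& api) {
    if (api.get_auto_branch() == kBranchSSE3) {
        return true;  // assumes SSE3 is in effect; nothing is actually set
    }
    return api.set_branch(kBranchSSE3) == kSuccess;
}

// Robust pattern: always request SSE3, whatever the reported branch is.
bool configure_always(const CnrApi& api) {
    return api.set_branch(kBranchSSE3) == kSuccess;
}

// Test double modelling the VM: detection reports SSE3, but SSE4.2 code
// paths stay active until a branch is explicitly set.
struct FakeMkl {
    int reported_branch = kBranchSSE3;
    int active_branch = kBranchSSE4_2;

    CnrApi api() {
        return CnrApi{
            [this] { return reported_branch; },
            [this](int branch) { active_branch = branch; return kSuccess; }
        };
    }
};
```

With real MKL, the robust variant should reduce to a single `mkl_cbwr_set(MKL_CBWR_SSE3)` call at startup with its result checked against `MKL_CBWR_SUCCESS`. As far as I understand the documentation, that call has to happen before any other MKL function is used.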
I simply do not know whether this issue stems from KVM or from MKL itself, but I thought it was worth sharing.
It was the dsyev routine (eigenvalues of a symmetric matrix), but as I said this may be circumstantial. The point is that we weren't setting CNR mode at all because of the incorrect result returned by MKL_CBWR_Get_Auto_Branch under KVM, so I'd expect that the whole of MKL was using SSE4 instructions.
But obviously you are in a better position to judge this.