Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

CNR mode reporting incorrect SIMD version via KVM

Andrew_Smith1
Beginner

Hi

We had a problem with inconsistent results across some of our grid nodes, which I thought was worth sharing. After investigation we pinned this down to two different OS configurations returning different results:

  • Bare-metal Windows 2008
  • Virtual Windows 2008 running in KVM on RHEL

The two machines have identical hardware (Xeon E7-4870), which supports SSE4.1 and SSE4.2. At the time we were using MKL v11.1.2.

We use MKL’s CNR (Conditional Numerical Reproducibility) mode to restrict MKL to SSE3 code paths, thus achieving numerical consistency across a range of hardware. What we discovered was that on the VM, the call to MKL_CBWR_Get_Auto_Branch was returning SSE3, and as a result we were not calling ::MKL_CBWR_Set(SSE3). Calculations on that machine were subsequently using SSE4 instructions, and this turned out to be the source of the numerical differences we were seeing.

The only numerical differences we saw between SSE3 and SSE4 came from BLAS, although this may be coincidental.

Although this was easily fixed (by always calling ::MKL_CBWR_Set(SSE3) regardless of what MKL_CBWR_Get_Auto_Branch returns) it took a great deal of investigation to pinpoint the problem.
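To make the difference between the two call patterns concrete, here is a minimal sketch. It is not our production code: `CnrApi`, `pin_if_needed`, `pin_always` and the `Branch` values are hypothetical names introduced for illustration, and the two MKL calls are passed in as callables so the logic can be exercised without MKL installed. With real MKL one would presumably wire in `mkl_cbwr_get_auto_branch` and `mkl_cbwr_set` and use `MKL_CBWR_SSE3` as the target.

```cpp
#include <cassert>
#include <functional>

// Stand-in branch codes for this sketch only (hypothetical values).
// With real MKL, use the MKL_CBWR_* constants from mkl.h instead.
enum Branch { kSSE3 = 1, kSSE4_2 = 2 };

// Hypothetical wrapper around the two CNR calls discussed in this post.
struct CnrApi {
    std::function<int()> get_auto_branch;  // e.g. mkl_cbwr_get_auto_branch
    std::function<int(int)> set_branch;    // e.g. mkl_cbwr_set
};

// Fragile pattern: skip the explicit set when the auto branch already
// reports the target. If that report is wrong, nothing pins the branch
// and the library keeps whatever code path it dispatches to by default.
// Returns true if the set call was issued.
bool pin_if_needed(const CnrApi& api, int target) {
    if (api.get_auto_branch() == target) {
        return false;  // set skipped, trusting the reported branch
    }
    api.set_branch(target);
    return true;
}

// Robust pattern: always request the target branch explicitly,
// regardless of what the auto-branch query reports.
bool pin_always(const CnrApi& api, int target) {
    api.set_branch(target);
    return true;
}
```

Under the assumption that the query misreports the branch (as it appeared to on our VM), `pin_if_needed` never issues the set call, whereas `pin_always` does. The cost of the unconditional call is one redundant set on machines where the report is correct.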

Whether this issue stems from KVM rather than MKL itself I simply do not know, but thought it was worth sharing.

Thanks,

 

2 Replies
Gennady_F_Intel
Moderator

Thanks for the report, Andrew. We have never checked how MKL behaves in such environments. Nevertheless, which BLAS functions do you use?

Andrew_Smith1
Beginner

Gennady,

It was the dsyev routine (eigenvalues of a symmetric matrix), but as I said this may be coincidental. The point is that we weren't setting CNR mode at all because of the incorrect result returned by MKL_CBWR_Get_Auto_Branch under KVM, so I'd expect that all of MKL was using SSE4 instructions.

But obviously you are in a better position to judge this.

Thanks,
