Should I expect this difference from DGETRS?

Hi, I've got a program that uses MKL and the Intel Composer 2015 compiler on Mac OS X that is giving me different results on an Ivy Bridge (a MacMini6,1 system with an i5-3210M CPU) and a Haswell (MacMini7,1 with an i5-4278U CPU) for the same binary.  Both systems are running Yosemite (10.10.5).  I've boiled it down to the following test case:

#include <mkl.h>
int main(int argc, char **argv) {
  int m=3, n=3, lda=5, ldb=3, nrhs=1, info;
  double a[] = {
    1., 2., 3., 0., 0.,
    4., 5., 6., 0., 0.,
    7., 8., 0., 0., 0.,
    0., 0., 0., 0., 0.
  };
  double b[] = { 6., 15., 15. };
  int ipiv[3];
  int i;

  dgetrf(&m, &n, a, &lda, ipiv, &info);
  for(i = 0; i < 3, i++) printf("ipiv[%d] = %d (expected 3)\n", i, ipiv[i]);
  dgetrs("T", &n, &nrhs, a, &lda, ipiv, b, &ldb, &info);
  for(i = 0; i < 3; i++) printf("b[%d] = %.17f (expected 1.00000000000000000)\n", i, b[i]);
  return 0;
}

Since our application uses OpenMP and statically links MKL, the following Makefile should build the test case in the same way:

repro: repro.o
        icc -qopenmp -o repro repro.o ${MKLROOT}/lib/libmkl_intel_lp64.a ${MKLROOT}/lib/libmkl_core.a ${MKLROOT}/lib/libmkl_intel_thread.a -lpthread -lm -ldl
repro.o: repro.c
        icc -qopenmp -I${MKLROOT}/include -o repro.o -c repro.c

On the Ivy Bridge system, I get exactly the expected values.  My problem is on the Haswell system, where I get the following results:

ipiv[0] = 3 (expected 3)
ipiv[1] = 3 (expected 3)
ipiv[2] = 3 (expected 3)
b[0] = 0.99999999999999978 (expected 1.00000000000000000)
b[1] = 1.00000000000000022 (expected 1.00000000000000000)
b[2] = 1.00000000000000000 (expected 1.00000000000000000)

I realize that this difference is extremely insignificant, but it's the difference between the two systems (running the same version of the same OS, differing only based on the CPU) that has us concerned.  Is this something we should expect from MKL, or is it a bug?

2 Replies

Similar behavior occurs on Windows with the 16.0.2 compiler, in both 32-bit and 64-bit builds, on a Sandy Bridge i7 versus a Broadwell i5. (The program has a couple of bugs: you need "#include <stdio.h>", and on line 15 you need a semicolon instead of the comma after "i < 3".)
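
One plausible reason the same binary behaves differently across CPU generations, offered here as an assumption rather than something confirmed in this thread, is that MKL selects CPU-specific code paths at run time, and those paths may combine the same numbers in a different order. Floating-point addition is not associative, so a reordering alone can change the last bit. A minimal sketch in plain C (the function names are made up for illustration, and no MKL is needed):

```c
#include <stdio.h>

/* Hypothetical helpers: the same three addends, grouped two ways. */
double sum_left_to_right(double x, double y, double z) {
    return (x + y) + z;
}

double sum_right_to_left(double x, double y, double z) {
    return x + (y + z);
}

/* Prints both groupings with 17 significant digits so the
   trailing digits can be compared directly. */
void show_grouping_effect(double x, double y, double z) {
    printf("(x + y) + z = %.17g\n", sum_left_to_right(x, y, z));
    printf("x + (y + z) = %.17g\n", sum_right_to_left(x, y, z));
}
```

Calling show_grouping_effect(0.1, 0.2, 0.3) from main on a build that uses IEEE double arithmetic prints two values that differ only in the last digit, which is the same size of discrepancy as in the question.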

The differences are equal to machine epsilon for b[0] and b[1], so there is not much reason to worry. If, however, you want to enforce reproducibility, set an environment variable (or, equivalently, call a subroutine in your code to produce the same effect). Please read the article.
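
To check the machine-epsilon claim against the numbers in the question, the difference from the expected value can be expressed in multiples of DBL_EPSILON. The sketch below is plain C with hypothetical helper names. The tolerance-based comparison is an illustration of how a test could ignore last-bit differences, not something prescribed in this thread.

```c
#include <float.h>

/* Hypothetical helper: |x - ref| in multiples of DBL_EPSILON
   (the gap between 1.0 and the next larger double). */
double diff_in_epsilons(double x, double ref) {
    double d = x - ref;
    if (d < 0.0) d = -d;
    return d / DBL_EPSILON;
}

/* Hypothetical helper: returns 1 if x is within max_eps epsilons
   of ref, relative to |ref|. With ref == 0 only an exact match
   passes, so use an absolute tolerance in that case instead. */
int close_enough(double x, double ref, double max_eps) {
    double scale = ref < 0.0 ? -ref : ref;
    double d = x - ref;
    if (d < 0.0) d = -d;
    return d <= max_eps * DBL_EPSILON * scale;
}
```

With the Haswell values from the question, diff_in_epsilons(0.99999999999999978, 1.0) and diff_in_epsilons(1.00000000000000022, 1.0) both evaluate to 1. A check such as close_enough(b[i], 1.0, 4.0) would then accept the output from both machines; the factor 4.0 is an arbitrary example and should be chosen to suit the application.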


It would also be useful to be familiar with the conditions under which this functionality applies. For these details, you may follow the link.