Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

integer overflow in dcopy

Steven_V_
Beginner
943 Views

Hi,

I ran into a problem using the dcopy subroutine of MKL. When I compile and run the included small test using gfortran and Intel MKL version 11, it works fine on my Xeon machines but fails on the Opteron machines. Both the integer*4 and integer*8 versions of the 64-bit MKL libraries seem to have this problem. ACML has the same issue, but only in its integer*4 64-bit library. The test program works fine with the older MKL version 10.1.3.027.

greetings, Steven

0 Kudos
15 Replies
Gennady_F_Intel
Moderator

Hi Steven,

Which version of MKL do you use? I guess this is on Linux? How did you link the example? I noticed the task size is very big:

    PROGRAM TEST_DCOPY
    IMPLICIT REAL*8 (A-H,O-Z)
    ALLOCATABLE :: X(:), Y(:)
    DIMENSION N(2)
    N(1)=1220480320
    .....
    ALLOCATE(X(N(1)))
    ...
    ALLOCATE(Y(N(1)))
    .....

Do you have enough RAM on the systems where you see the problem?

--Gennady

Steven_V_
Beginner

Hi Gennady,

This is on GNU/Linux 3.2.0-23 x86_64 SMP, GNU C Library (Ubuntu EGLIBC 2.15-0ubuntu10), with 512 GB of RAM. The test fails with MKL from composer_xe_2011_sp1.9.293 and composer_xe_2013.0.079, and only on Opteron processors.

I know the test is quite large; the actual dcopy is in a quantum chemistry code. However, even a dcopy with 32-bit integers should handle 1.2 billion elements, because that count is below the signed 32-bit limit of 2147483647. And since it's a 64-bit library, size_t can hold the number of bytes without a problem, as it did in the older version. Since it only fails on Opteron, I guess the problem is somewhere in an architecture-specific routine.

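The distinction between the element count and the byte count can be checked with a few lines of standard Fortran. This is only a sketch of the arithmetic, not a statement about how MKL computes sizes internally; the program and variable names are made up for illustration, and the 8-byte element size assumes REAL*8 as in the test.

```fortran
program dcopy_count_vs_bytes
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int64), parameter :: n = 1220480320_int64   ! element count used in the test
  integer(int64), parameter :: elem_bytes = 8_int64   ! assumed size of one REAL*8 element
  integer(int64), parameter :: max32 = int(huge(1_int32), int64)  ! largest signed 32-bit value
  integer(int64) :: nbytes

  nbytes = n * elem_bytes   ! computed in 64-bit arithmetic, so no overflow here

  print *, 'largest 32-bit integer :', max32
  print *, 'element count          :', n
  print *, 'count fits in 32 bits  :', n <= max32
  print *, 'byte count             :', nbytes
  print *, 'bytes fit in 32 bits   :', nbytes <= max32
end program dcopy_count_vs_bytes
```

The count fits in a 32-bit integer, but the byte count (9763842560) does not, so any byte-based size or offset inside a copy routine has to be kept in a 64-bit type.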
Gennady_F_Intel
Moderator

Thanks Steven. I am asking just to know what we need to check on our side. Yes, 1.2 × 10^9 should be handled by a 32-bit integer. One more question: did you link with libmkl_sequential.a or libmkl_gnu_thread.a?

Steven_V_
Beginner

I linked with libmkl_sequential.a. I just tried libmkl_gnu_thread.a as well, and it fails too.

Gennady_F_Intel
Moderator

Just for info: I still couldn't find an Opteron with that much memory. The test passed on a Xeon with 32 GB of RAM:

    ./test_dcopy
    1097.52614198074
    1097.52614198074

Steven_V_
Beginner

Thanks for the info. This is exactly what I get on the Xeon E5630 too:

    ./test_dcopy
    1097.52614198074
    1097.52614198074

and this is what I get on the Opteron 6276:

    ./test_dcopy
    1097.52614198074
    wrong, I = 146738497 X(I) = 3.141589835286140E-002 Y(I) = 1.12300002574921

Could you run the Opteron-specific code on a Xeon, or is that impossible?

FYI, this is the discussion about the same issue, but with ACML: http://devgurus.amd.com/thread/159788

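One observation about the failing index, offered as arithmetic rather than as a diagnosis: 1220480320 REAL*8 elements are 9763842560 bytes, and that value reduced modulo 2**32 corresponds to 146738496 whole elements. The first wrong element reported above is I = 146738497, which is what one would expect to see if a byte count had been truncated to 32 bits somewhere. Nothing in this thread confirms that as the cause. The sketch below just reproduces the numbers; the program and variable names are illustrative.

```fortran
program dcopy_wrap_check
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer(int64), parameter :: n = 1220480320_int64   ! element count used in the test
  integer(int64), parameter :: elem_bytes = 8_int64   ! assumed size of one REAL*8 element
  integer(int64), parameter :: two32 = 2_int64**32    ! number of values a 32-bit counter can take
  integer(int64) :: nbytes, wrapped_bytes, copied

  nbytes        = n * elem_bytes              ! true size of the copy in bytes
  wrapped_bytes = modulo(nbytes, two32)       ! what would survive truncation to 32 bits
  copied        = wrapped_bytes / elem_bytes  ! whole elements covered by that many bytes

  print *, 'true byte count       :', nbytes
  print *, 'byte count mod 2**32  :', wrapped_bytes
  print *, 'elements covered      :', copied
  print *, 'first uncovered index :', copied + 1   ! Fortran arrays are 1-based
end program dcopy_wrap_check
```

The last line printed is 146738497, the same index as in the Opteron output.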
Steven_V_
Beginner

If necessary, I could try to find out how to give you access to one of our machines.

Gennady_F_Intel
Moderator

Thanks for the suggestion, but it's not necessary: I have already reproduced the same results on an AMD Opteron(tm) Processor 6282 SE with 32 GB of RAM. We will check what's wrong.

barragan_villanueva_
Valued Contributor I

Steven,

You wrote that this problem occurs on AMD machines with both MKL and ACML. Could you check this test with the Netlib BLAS?

Steven_V_
Beginner

Hi Victor,

With my system's BLAS (Ubuntu 12.04) it works fine, and I think that one is based on the Netlib implementation. Our own program's BLAS is also based on Netlib, and that works fine too. (The older MKL 10.1 and ACML 4.2.0 work fine as well.) But if necessary, I can compile the Netlib dcopy and test it.

Gennady_F_Intel
Moderator

Steven,

The problem reproduces even when only DCOPY is used. I commented out SQRT and DDOT and the problem still exists:

    wrong, I = 146738497 X(I) = 3.141589835286140E-002 Y(I) = 1.12300002574921

The problem has been escalated. We will let you know as soon as there is an update.

Gennady_F_Intel
Moderator

Hello Steven,

Would you please check the latest version, 11.0 update 2? The problem has been fixed there.

--Gennady

Steven_V_
Beginner

Hi,

I've installed version 11.0 update 2 of MKL and compiled my test program, linking with either -lmkl_sequential or -lmkl_gnu_thread, but in both cases the program segfaults. The output from gdb is attached.

Vamsi_S_Intel
Employee

Hi Steve,

Looking at the gdb backtrace log you attached, I notice that you are using the MKL ILP64 interface library (libmkl_gf_ilp64.so). When using the ILP64 interface library, the integers declared in the source program should be of 64-bit integer type. With gfortran, the compiler flag that makes default integers 64 bits long is -fdefault-integer-8. For the Intel Fortran compiler, the corresponding option is -i8.

Can you confirm that you are using -fdefault-integer-8 when compiling your program with gfortran and linking against the MKL gfortran ilp64 interface library?
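A quick way to see which integer width a given build actually uses is to print the size of a default INTEGER. The sketch below relies only on the standard bit_size intrinsic; the program name is made up for illustration. Built with plain gfortran it should report 32 bits, and built with -fdefault-integer-8 it should report 64, which is what the ILP64 interface expects for integer arguments such as the element count and strides passed to dcopy.

```fortran
program check_default_integer
  implicit none
  integer :: n   ! default INTEGER: the kind an undeclared-kind BLAS argument would have

  ! bit_size is an inquiry function: it reports the width of n's type
  ! and does not read the (undefined) value of n.
  if (bit_size(n) == 64) then
    print *, 'default INTEGER is 64 bits: consistent with an ILP64 interface library'
  else
    print *, 'default INTEGER is', bit_size(n), 'bits: not what an ILP64 interface expects'
  end if
end program check_default_integer
```

If this reports 32 bits while the link line names libmkl_gf_ilp64, the library will read 8 bytes for each integer argument where the caller supplied only 4, which could plausibly lead to a crash like the one described above.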

--Vamsi.

Steven_V_
Beginner

Hi Vamsi,

Thanks for catching that. I was too quick to test and indeed forgot that flag. Everything seems to work fine now, and in production the results also match those from the Xeon machines. Thanks to everyone for your help.

greetings,

Steven
