Steven_V_
Beginner
151 Views

integer overflow in dcopy

Hi,

I ran into a problem using the dcopy subroutine of MKL. When I compile and run the included small test with gfortran and Intel MKL version 11, it works fine on my Xeon machines but fails on the Opteron machines. Both the integer*4 and integer*8 versions of the 64-bit MKL libraries seem to have this problem. ACML has the same issue, but only in its integer*4 64-bit library. The test program works fine with the older MKL version 10.1.3.027.

greetings, Steven

15 Replies
Gennady_F_Intel
Moderator

Hi Steven, which version of MKL do you use? I guess this is on Linux? How did you link the example? I noticed that the problem size is very large:

    PROGRAM TEST_DCOPY
    IMPLICIT REAL*8 (A-H,O-Z)
    ALLOCATABLE :: X(:), Y(:)
    DIMENSION N(2)
    N(1)=1220480320
    .....
    ALLOCATE(X(N(1)))
    ...
    ALLOCATE(Y(N(1)))
    .....

Do you have enough RAM on the systems where you see the problem?

--Gennady
Steven_V_
Beginner

Hi Gennady, this is on GNU/Linux 3.2.0-23 x86_64 SMP, GNU C Library (Ubuntu EGLIBC 2.15-0ubuntu10) with 512 GB of RAM. The test fails with MKL from composer_xe_2011_sp1.9.293 and composer_xe_2013.0.079, and only on Opteron processors. I know the test is quite large; the actual dcopy call is in a quantum chemistry code. However, even the 32-bit integer dcopy should handle 1.2 billion elements: it is a 64-bit library, so size_t can hold the number of bytes without a problem, as it did in the older version. Since it only fails on Opteron, I guess the problem is somewhere in an architecture-specific routine.
Gennady_F_Intel
Moderator

Thanks Steven. I am asking just to know what we need to check on our side. Yes, 1.2 × 10^9 should be handled by a 32-bit integer. One more question: did you link with libmkl_sequential.a or libmkl_gnu_thread.a?
Steven_V_
Beginner

I linked with libmkl_sequential.a. I just tried the libmkl_gnu_thread.a now and it fails too.
Gennady_F_Intel
Moderator

Just for info: I still couldn't find an Opteron machine with that much memory. The test passed on a Xeon with 32 GB of RAM:

    ./test_dcopy
    1097.52614198074 1097.52614198074
Steven_V_
Beginner

Thanks for the info, this is also exactly what I get on a Xeon E5630:

    ./test_dcopy
    1097.52614198074 1097.52614198074

and this is what I get on an Opteron 6276:

    ./test_dcopy
    1097.52614198074 wrong, I = 146738497 X(I) = 3.141589835286140E-002 Y(I) = 1.12300002574921

Could you run the Opteron-specific code on a Xeon, or is that impossible? FYI, this is the discussion about the same issue, but with ACML: http://devgurus.amd.com/thread/159788
Steven_V_
Beginner

if necessary, I could try to find out how to give you access to one of our machines.
Gennady_F_Intel
Moderator

Thanks for the suggestion, but it's not necessary: I have already reproduced the same results on an AMD Opteron(tm) Processor 6282 SE with 32 GB of RAM. We will check what's wrong.
barragan_villanueva_
Valued Contributor I

Steven, You wrote that this problem is on AMD machines with MKL and ACML. Could you check this test on Netlib?
Steven_V_
Beginner

Hi Victor, with my system's BLAS (Ubuntu 12.04) it works fine, and I think that this is based on the Netlib implementation. Our own program's BLAS is also based on Netlib, and that works fine too, as do the older MKL 10.1 and ACML 4.2.0. But if necessary, I can compile the Netlib dcopy and test it.
Gennady_F_Intel
Moderator

Steven, the problem is reproduced even when only DCOPY is used. I commented out SQRT and DDOT and the problem still exists:

    wrong, I = 146738497 X(I) = 3.141589835286140E-002 Y(I) = 1.12300002574921

The problem has been escalated. We will let you know as soon as there is any update.
Gennady_F_Intel
Moderator

Hello Steven,

Would you please check the latest release, 11.0 update 2? The problem has been fixed there.

--Gennady

Steven_V_
Beginner

Hi,

I've installed version 11.0 update 2 of MKL and compiled my test program, linking with either -lmkl_sequential or -lmkl_gnu_thread, but in both cases the program segfaults, output from gdb attached.

Vamsi_S_Intel
Employee

Hi Steve,

Looking at the gdb backtrace log you attached, I notice that you are using the MKL ILP64 interface library (libmkl_gf_ilp64.so). When using the ILP64 interface library, the integers declared in the source program must be of 64-bit integer type. With gfortran, the compiler flag that makes default integers 64 bits long is -fdefault-integer-8. For the Intel Fortran compiler, the corresponding option is -i8.

Can you confirm that you are using -fdefault-integer-8 when compiling your program with gfortran and linking against the MKL gfortran ilp64 interface library?

--Vamsi.

Steven_V_
Beginner

Hi Vamsi,

thanks for catching that, I was too quick to test and indeed forgot that flag. Everything seems to work fine now, and in production the results also match those from the Xeon machines. Thanks to everyone for your help.

greetings,

Steven
