- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Using code that worked with previous versions (Intel 12.1 OpenMPI 1.4.4 MKL 10.2.5.035 on CentOS 6) of MKL, zgetrf/zgetri cannot invert a matrix that it previously inverted.
Could the newer intel modules be changing how the arguments are accepted for lapack functions?
The pivot index is determined by the output of zgetrf( .. , ipvt , ..) which is then fed into zgetri as input similarly. I believe this is the way to do it.
I have just tested a sample program that only does the zgetrf and zgetri portions and I believe now that this is where the problem lies. I have tested two sample programs, one with a relevant 8x8 matrix that I calculate in my program, and one made-up example 8x8 matrix that I checked has an inverse.
Both programs run fine using the old modules, compiling with mpif90 etc as I do now when I am taking measurements. I get no runtime errors or error outputs from the zgetrf and zgetri subroutines. However, when using the new modules (intel2018), I do get an error output from the zgetri and zgetrf subroutines for both sample matrices. This error output, labeled "INFO" by default, rerturn a value of 1. To save you some searching, INFO should return 0 if successful (zgetrf factorizes the matrix to be inverted, zgetri actually inverts it). However if INFO = i > 0, then U(i,i) = 0 and the matrix is singular. This means that something that I'm doing with the new modules is causing a verifiable invertible matrix to be considered non-invertible, or singular.
Now, how this causes a segfault in the program at large I'm not 100% sure, it doesn't do so in my small sample program. However it seems very likely that since I do not suppress the output of the subroutines when it returns something other than a successful exit value for INFO, that the output of zgetri in this case is garbage that eventually causes some segfault error when it eventually gets multiplied by other matrices and is solved by zgetrs subroutine later on. I can verify that the output of zgetri when INFO=1 is nonsense from my small tests.
So to recap, I have found that using the new modules causes zgetri and zgetrf subroutines to return an error for the same, invertible matrices. This problem may also occur for the subroutine zgetrs, but I haven't tested it. In any case, the return value after the error then probably causes the segfault. The question remains, why is zgetrf/zgetri not working with the new modules?
Here is sample program (pgrmcheck_cond_zgetri.f90) which reproduces the problem:
! placeholder
!mpiifort -o pgrmcheck.x -O3 pgrmcheck_cond_zgetrf.f90 -i8
!-I${MKLROOT}/include/intel64/ilp64 -I${MKLROOT}/include
!${MKLROOT}/lib/intel64/libmkl_lapack95_ilp64.a -L${MKLROOT}/lib/intel64
!-lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core
!-lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl
PROGRAM dyncond_check
IMPLICIT NONE
include 'mpif.h'
complex,DIMENSION(8) :: rightvec,work
integer,dimension(8) ::ipvt
complex,DIMENSION(8,8) :: tmatrixN1
integer:: error1,error2,i,j
tmatrixN1=0
do i=1,8
do j=1,8
if (j<(i+1)) tmatrixN1(i,j)=i+(j-1)*8
enddo
enddo
call mkl_set_num_threads(1)
call zgetrf(8,8,tmatrixN1,8,ipvt,error1)
print *,error1
call zgetri(8,tmatrixN1,8,ipvt,work,8,error2)
print *,error2
do i=1,8
do j=1,8
! print *,tmatrixN1(i,j)
enddo
enddo
end program
Compiled with:
### Command to compile for intel 2018 modules (Not functioning) ###
module load intel/cluster/2018
mpiifort -o pgrmcheck2.x -O3 pgrmcheck_cond_zgetri.f90 -i8 -I${MKLROOT}/include/intel64/ilp64 -I${MKLROOT}/include ${MKLROOT}/lib/intel64/libmkl_lapack95_ilp64.a -L${MKLROOT}/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl
./pgrmcheck2.x
###
### Command to compile for (old) intel 12.1 modules (functioning) ###
module load intel/12.1 ompi/1.4.4/intel mkl/10.2.5.035
mpif90 -o pgrmcheck2.x -O3 -r8 pgrmcheck_cond_zgetri.f90 -L/soft/intel/mkl/10.2.1.017/lib/em64t -lmkl_lapack -lmkl_intel_thread -lmkl_core -lguide -lmkl_intel_lp64
./pgrmcheck2.x
###
Thanks!
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
wrt MKL version 2018: do you see segfault with lp64 API too? Could you please check.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
When I replace all instances of 'ilp64' to plainly 'lp64', It did compile and there were no segfault errors, but all my results return NaN.
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Do you need more info?
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I checked the latest MKL 2018 u3 with lp64 and ilp64 aPI. I only added to this code mkl_version subroutine call.
here are output I see on my side:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Just to verify, the info returned (the 1's) indicate failure, since 0 should be returned if it was successful, correct?
If so are you looking into this problem?
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
yes, this looks like a bug.could you please submit the request to the Inlel online service center -https://supporttickets.intel.com/servicecenter

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page