nooj
Beginner
79 Views

dsyevr crashes when given non-identity 3x3 matrix

Hi,

I'm getting a crash in DSYEVR when I pass a non-identity 3x3 matrix (see below). The crash goes away if I change the compiler flags from '-O3 -static-intel -unroll -ftrapuv -automatic' to '-O0 -g' (while retaining '-r8 -i4 -implicitnone -C -check nooutput_conversion -gen-interfaces -warn interfaces -traceback -fpscomp logicals -align'). If I pass the 3x3 identity matrix with the 'bad' flags, I get no crash.

forrtl: error (65): floating invalid
Image PC Routine Line Source
libmkl_lapack.dyl 000000010A8905A4 Unknown Unknown Unknown
libmkl_lapack.dyl 000000010A456804 Unknown Unknown Unknown
libmkl_lapack.dyl 000000010A4B0D99 Unknown Unknown Unknown
libmkl_lapack.dyl 000000010A4BB714 Unknown Unknown Unknown
libmkl_intel_thre 0000000100864647 Unknown Unknown Unknown
libmkl_intel_lp64 00000001005462F7 Unknown Unknown Unknown
a.opt 000000010000F044 _fib_dir_module_m 58 current_fib_dirs.f
a.opt 00000001000AC34F _intelmass_3d_ 447 IntElmAss_3D.f
a.opt 00000001000DFDE6 _solflowgmres_ 95 solflow.f
a.opt 000000010001BD13 _MAIN__ 515 driver.f
a.opt 0000000100000B6C Unknown Unknown Unknown
a.opt 0000000100000B04 Unknown Unknown Unknown

ifort --version
ifort (IFORT) 11.1 20100401
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.

echo $MKLROOT
/opt/intel/Compiler/11.1/088/Frameworks/mkl

about this mac:
OSX version 10.5.8
2.93 GHz Quad-Core Intel Xeon

the call in question is this:

      ! SYEVR variables:
      real*8 :: A(NSD,NSD)                      ! temp copy of B
      real*8 :: Z(NSD,NSD)                      ! matrix of eigenvectors
      character(len=1), parameter :: jobz = 'V' ! 'V' => compute evects
      character(len=1), parameter :: range = 'A'! 'A' => compute all evals
      character(len=1), parameter :: uplo = 'U' ! see usage below
      integer, parameter :: lwork = 26*NSD      ! size of work array
      integer, parameter :: liwork = 10*NSD     ! size of iwork array
      real*8 work(lwork)                        ! workspace array
      integer iwork(liwork)                     ! workspace array
      integer info                              ! error code
      integer m, isuppz(2*NSD)                  ! see usage below
      real*8 vl, vu
      integer il, iu

      call DSYEVR(
     &  jobz,   ! intent(in): jobz; V => compute eigenvectors
     &  range,  ! intent(in): range; A => compute all eigs
     &  uplo,   ! intent(in): uplo; which triangle holds elems of A
     &  NSD,    ! intent(in): n; size of A
     &  A,   ! intent(inout): input array, non-uplo triangle overwritten
     &  NSD,    ! intent(in): lda
     &  vl,vu,  ! intent(in): vl,vu; not referenced (since range='A')
     &  il,iu,  ! intent(in): il,iu; not referenced (since range='A')
     &  0d+0,   ! intent(in): abstol; 0 => use default min tolerance
     &  m,      ! intent(out): num eigenvalues found
     &  evalues,! intent(out): eigenvalues are put here
     &  Z,      ! intent(out): eigenvectors are put here
     &  NSD,    ! intent(in): ldz; first dimen of z
     &  isuppz, ! intent(out): indices of support of eigenvects in z
     &  work,   ! intent(out): workspace array
     &  lwork,  ! intent(in): dimension of work array, >= 26*n
     &  iwork,  ! intent(out): workspace array
     &  liwork, ! intent(in): dimension of iwork array, >= 10*n
     &  info    ! intent(out): error code
     &)

FAILING compile options are these:

mac_desktop_opt_FCFLAGS =\
-r8 -i4 -implicitnone \
-C -check nooutput_conversion \
-gen-interfaces -warn interfaces \
-traceback \
-fpscomp logicals \
-align \
-O3 -static-intel \
-unroll \
-ftrapuv \
-automatic

FAILING link options:
(${NOOJ_MKL_xxx} points into the 11.1.088 mkl framework.)
($(OBJ_DIR) is the local object directory.)
-B${NOOJ_MKL_DIR} \
-I${NOOJ_MKL_INC} -I$(OBJ_DIR) \
-L${NOOJ_MKL_LIB} -L$(OBJ_DIR) -L/lib \
-lguide \
-lpthread \
-lmkl_solver_lp64 \
-lmkl_intel_lp64 \
-lmkl_intel_thread \
-lmkl_sequential \
-lmkl_core \
$(OBJ_DIR)/LIB_VTK_IO.a \

WORKING compile options:

mac_desktop_valgrind_FCFLAGS =\
-r8 -i4 -implicitnone \
-C -check nooutput_conversion \
-gen-interfaces -warn interfaces \
-traceback \
-fpscomp logicals \
-align \
-O0 -g

WORKING link options:
(same as FAILING link options above, with -g added)

When I tested with ifort 11.1.058, the crash would occur in seemingly random places. With 11.1.088 it seems to crash consistently at the spot quoted above.

Any ideas?

I have also tested with ifort 11.1.069 on Linux under valgrind.
I can post those errors. It reports
'invalid read of size 8' within /lib/ld-2.7.so
by ...
by dlopen (in /lib/libdl-2.7.so)
by mkl_serv_load_lapack_dll (in /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_core.so),
and then
'Warning: client switching stacks?'
'Can't extend stack to 0x7FD614F40 during signal delivery for thread 1: no stack segment'
and then my process terminates with SIGSEGV.
6 Replies
nooj

BTW, the culprit flag is -ftrapuv.
mecej4
Black Belt

You are calling a routine with the Fortran-77 calling conventions for passing 2-D arrays.

Specifically, you declare A(NSD,NSD), with NSD=26, but only use the leading 3 X 3 submatrix.

The fourth argument in the call to DSYEVR should be the actual size of A, not the declared size of A; the latter is already being passed as 'lda', the sixth argument. Since only the 3 X 3 part (9 elements) has been assigned values, the remaining (26 X 26 - 9) elements are undefined and, with -ftrapuv selected, you will get an abort for using undefined variables. That makes -ftrapuv actually a "good" option, unless you want to hide errors!

Not realizing the real error, you have been trying out various ways to cure the problem. The simple fix is to pass the correct fourth argument as follows:

      integer :: n
      ..
      n = 3
      ..
      call DSYEVR(
     &  jobz,   ! intent(in): jobz; V => compute eigenvectors
     &  range,  ! intent(in): range; A => compute all eigs
     &  uplo,   ! intent(in): uplo; which triangle holds elems of A
     &  n,      ! intent(in): n; size of A <<<< This is where you made the mistake of using NSD in place of n >>>>
     &  A,   ! intent(inout): input array, non-uplo triangle overwritten
     &  NSD,    ! intent(in): lda <<<< This is correct; lda stands for "leading declared dimension of A" >>>>>
      ..
      ..
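
To illustrate the n versus lda distinction described above, here is a minimal sketch that does not call LAPACK at all. The routine block_trace, the program name, and the sizes lda = 5 and n = 3 are made up for this illustration; only the Fortran-77 argument convention (order n, then the array, then its leading dimension lda) mirrors what DSYEVR expects.

```fortran
      program lda_demo
      implicit none
      integer, parameter :: lda = 5  ! declared leading dimension (assumed)
      integer, parameter :: n = 3    ! order of the block actually used
      real*8 :: A(lda,lda)
      real*8 :: t
      integer :: i, j

      ! Fill only the leading n-by-n block; the rest stays undefined.
      do j = 1, n
        do i = 1, n
          A(i,j) = dble(i + 10*j)
        end do
      end do

      ! Pass the order n and the declared leading dimension lda
      ! as two separate arguments, F77 style.
      call block_trace(n, A, lda, t)
      write(*,*) 'trace of leading block = ', t
      end program lda_demo

      ! Hypothetical helper: sums the diagonal of the leading n-by-n
      ! block of an array whose declared first dimension is lda.
      subroutine block_trace(n, A, lda, t)
      implicit none
      integer n, lda
      real*8 A(lda,*), t
      integer i
      t = 0d0
      do i = 1, n
        t = t + A(i,i)
      end do
      end subroutine block_trace
```

Only A(1:n,1:n) is ever read, so the undefined elements outside the leading block do no harm, and the trace should come out as 66 (11 + 22 + 33). Passing lda where n is expected would make the routine read A(4,4) and A(5,5), which were never assigned; that is the situation the reply above warns about. When the declared size and the order are the same, as with NSD = 3 for both, the two arguments coincide.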
nooj

mecej4: nice to hear from you.
> Specifically, you declare A(NSD,NSD), with NSD=26,
> but only use the leading 3 X 3 submatrix.

Sorry, I forgot to mention that NSD ("number of spatial dimensions") is 3.
Given that, I think your arguments are the same as mine, and we are both correct.
-f
mecej4

My oversight. With NSD=3, with the following test program, I get correct results on Linux-x64 with ifort -mkl:

I tried a few combinations of other compiler switches, but with no change in the results. Please show an example where DSYEVR gives incorrect results, or report a combination of compiler switches for which my example fails. We can take it from there.

[fxfortran]      program nooj
      implicit none
      integer, parameter :: NSD=3
            ! SYEVR variables: 
      real*8 :: A(NSD,NSD)                      ! temp copy of B
      real*8 :: Z(NSD,NSD)                      ! matrix of eigenvectors
      character(len=1), parameter :: jobz = 'V' ! 'V' => compute evects
      character(len=1), parameter :: range = 'A'! 'A' => compute all evals
      character(len=1), parameter :: uplo = 'U' ! see usage below
      integer, parameter :: lwork = 26*NSD      ! size of work array
      integer, parameter :: liwork = 10*NSD     ! size of iwork array
      real*8 work(lwork)                        ! workspace array
      integer iwork(liwork)                     ! workspace array
      integer info                              ! error code
      integer m, isuppz(2*NSD)                  ! see usage below
      real*8 vl, vu, evalues(NSD)
      integer il, iu, i,j
      data A/4d0,3*-1d0,4d0,3*-1d0,4d0/
 
      call DSYEVR(
     &  jobz,   ! intent(in): jobz;  V => compute eigenvectors
     &  range,  ! intent(in): range; A => compute all eigs
     &  uplo,   ! intent(in): uplo;  which triangle holds elems of A
     &  NSD,    ! intent(in): n; size of A
     &  A,   ! intent(inout): input array, non-uplo triangle overwritten
     &  NSD,    ! intent(in): lda
     &  vl,vu,  ! intent(in): vl,vu; not referenced (since range='A')
     &  il,iu,  ! intent(in): il,iu; not referenced (since range='A')
     &  0d+0,   ! intent(in):  abstol; 0 => use default min tolerance
     &  m,      ! intent(out): num eigenvalues found
     &  evalues,! intent(out): eigenvalues are put here
     &  Z,      ! intent(out): eigenvectors are put here
     &  NSD,    ! intent(in):  ldz; first dimen of z
     &  isuppz, ! intent(out): indices of support of eigenvects in z
     &  work,   ! intent(out): workspace array 
     &  lwork,  ! intent(in):  dimension of work array, >= 26*n
     &  iwork,  ! intent(out): workspace array
     &  liwork, ! intent(in):  dimension of iwork array, >= 10*n
     &  info    ! intent(out): error code
     &)
      if(info.eq.0)then
        write(*,*)' Eigenvalues  : '
        write(*,10)evalues(1:3)
        write(*,*)' Eigenvectors : '
        write(*,20)((Z(i,j),j=1,3),i=1,3)
      else
        write(*,*)' Info = ',info
      endif
   10 format(1x,3G14.4)
   20 format(1x,3G14.4)
      end program nooj[/fxfortran]
RESULTS:

[bash]~/LANG> ifort -mkl -r8 -i4 -implicitnone -C -check nooutput_conversion -traceback -fpscomp logicals -align -O3 -static-intel -unroll -ftrapuv -automatic nooj.f
~/LANG> ./a.out
  Eigenvalues  :
      2.000         5.000         5.000
  Eigenvectors :
    -0.5774       -0.4082        0.7071
    -0.5774       -0.4082       -0.7071
    -0.5774        0.8165         0.000
[/bash]

Some of the switches that you have chosen are contradictory -- for example, -C and -O3. There has to be some reason why you chose them, but I cannot guess it.
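
The eigenpairs printed in the results above can be sanity-checked without MKL. The sketch below is an illustration, not part of the original test: the program name check_eigs and the variable names are made up, and the eigenvalues and eigenvectors are typed in from the four-digit output, so the residuals can only be as small as that rounding allows.

```fortran
      program check_eigs
      implicit none
      integer, parameter :: n = 3
      real*8 :: A(n,n), Z(n,n), w(n), r(n), resid
      integer :: i, j, k
      ! Same matrix as the test program: 4 on the diagonal, -1 elsewhere
      data A /4d0,3*-1d0,4d0,3*-1d0,4d0/
      ! Eigenvalues as printed (rounded to four significant digits)
      data w /2d0, 5d0, 5d0/
      ! Eigenvectors as printed, stored column by column
      data Z /-0.5774d0,-0.5774d0,-0.5774d0,
     &        -0.4082d0,-0.4082d0, 0.8165d0,
     &         0.7071d0,-0.7071d0, 0.0d0/

      do j = 1, n
        ! r = A*z_j - w_j*z_j for the j-th eigenpair
        do i = 1, n
          r(i) = -w(j)*Z(i,j)
          do k = 1, n
            r(i) = r(i) + A(i,k)*Z(k,j)
          end do
        end do
        resid = maxval(abs(r))
        write(*,10) j, resid
      end do
   10 format(1x,'pair ',i1,': max residual =',es10.2)
      end program check_eigs
```

Each residual should be small, roughly the size of the rounding in the printed digits, which is consistent with the posted output being a correct eigendecomposition of the test matrix.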
nooj

-O3 is there to give the compiler a fighting chance at optimization where it can, and
-C is there because this is research code and I don't trust myself any further than I can throw myself.
I leave -C off when making small changes to input data for well-tested scenarios.
Thanks, I will look at your program this week!
- Nooj
nooj

I have not been able to find a situation in which your program crashes. As I am able, I will continue to investigate why the same code snippet, used in my main code, causes a crash inside the MKL library.

Thanks for looking into this!
- Nooj