Intel® Fortran Compiler
ifx: ICE and SEGFAULT

Diehl__Martin
Novice

Dear Intel Team,

 

I've found a few glitches in the new ifx compiler. The code compiles fine with gfortran and ifort. I have not yet managed to reduce the problems to an MWE, but they can be reproduced by compiling DAMASK (https://github.com/eisenforschung/damask). If anyone is interested, I can share detailed build instructions.

 

1) Segmentation fault when finalizing a nested linked list. The error messages are as follows.

With optimization:

0x000015554fca8020 in process_allocation_records_deallocate () from /home/m/intel/oneapi/compiler/2023.0.0/linux/compiler/lib/intel64_lin/libifcoremt.so.5

Without optimization:

0x000015554fca3fe9 in for.calc_num_elts () from /home/m/intel/oneapi/compiler/2023.0.0/linux/compiler/lib/intel64_lin/libifcoremt.so.5

The traceback does not give any extra information, probably because the segmentation fault happens inside libifcoremt. Note: the error only occurs when OpenMP is enabled, even though the executed code is not in a parallel region.

 

2) Internal compiler error during the debug build. I don't think it is related to the segmentation fault above, but it makes it impossible to compile with the full set of debug options (-stand f18 -assume nostd_mod_proc_name -O0 -fpp -no-ftz -diag-disable 5268,7624 -warn declarations,general,usage,interfaces,ignore_loc,alignments,unused -DDEBUG -g -traceback -gen-interfaces -fp-stack-check -fp-model strict -check bounds,format,output_conversion,pointers,uninit -ftrapuv -fpe-all=0 -ftz -debug-parameters all -debug all).

 

 


#0 0x0000000001c55e52 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1c55e52)
#1 0x0000000001c55f80 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1c55f80)
#2 0x000015389d967f50 (/usr/lib/libc.so.6+0x38f50)
#3 0x000000000288f3c5 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x288f3c5)
#4 0x0000000002c97b94 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x2c97b94)
#5 0x0000000002d1becd (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x2d1becd)
#6 0x000000000202fe87 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x202fe87)
#7 0x0000000001b96f59 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1b96f59)
#8 0x0000000001b955dd (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1b955dd)
#9 0x0000000001b44a74 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1b44a74)
#10 0x0000000001d028ee (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1d028ee)
#11 0x000015389d952790 (/usr/lib/libc.so.6+0x23790)
#12 0x000015389d95284a __libc_start_main (/usr/lib/libc.so.6+0x2384a)
#13 0x0000000001982da9 (/home/m/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/xfortcom+0x1982da9)

/tmp/ifx0762407692YaNe58/ifxWdR7CS.i90: error #5633: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.
compilation aborted for /home/m/DAMASK/src/math.f90 (code 3)
make[3]: *** [src/CMakeFiles/DAMASK_grid.dir/build.make:388: src/CMakeFiles/DAMASK_grid.dir/math.f90.o] Error 3
make[2]: *** [CMakeFiles/Makefile2:98: src/CMakeFiles/DAMASK_grid.dir/all] Error 2
make[1]: *** [Makefile:136: all] Error 2
make[1]: Leaving directory '/home/m/DAMASK/build/grid'
make: *** [Makefile:13: grid] Error 2

 

 

Ron_Green
Moderator

Let's start with the ICE first. I've cloned the repo. Send build instructions to get the ICE.


Diehl__Martin
Novice

Thanks.

The following bash script should work:

 

# Load the oneAPI environment (compilers and MPI wrappers)
source ~/intel/oneapi/setvars.sh

# Download and unpack PETSc 3.18.5
wget https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.18.5.tar.gz
tar -xf petsc-3.18.5.tar.gz
cd petsc-3.18.5
export PETSC_DIR=$PWD
export PETSC_ARCH=oneapi
# Replace the condition on line 1719 of package.py with "if True:"
sed -i "1719s/if not os.path.isfile(os.path.join(self.packageDir,'configure')):/if True:/g" config/BuildSystem/config/package.py
# Configure PETSc with the LLVM-based Intel compilers behind the MPI wrappers
./configure \
          --with-fc='mpiifort -fc=ifx' \
          --with-cc='mpiicc -cc=icx' \
          --with-cxx='mpiicpc -cxx=icpx' \
          --download-fftw --download-hdf5 --download-hdf5-fortran-bindings=1 --download-zlib
make all
cd ..

# Build the DAMASK grid solver in debug mode
git clone git@github.com:eisenforschung/DAMASK.git
cd DAMASK
make grid BUILD_TYPE=DEBUG

Ron_Green
Moderator

I have it building and ICE'ing on math.f90. It took a while and a lot of manual intervention: our intranet does not allow ftp, nor wget to sites with 'ftp' in the name.

Also, I started on a server with an older autoconf, and that was tripping up a lot of the PETSc build.

I'll get working on the ICE.

Ron_Green
Moderator

The option -ftrapuv is the source of the ICE.  I'm not sure if we have fully implemented the FP exception handling in IFX.  I will check into this. For sure it gives me something to work with.

Steve_Lionel
Black Belt

-ftrapuv never did anything useful in ifort - I lobbied (unsuccessfully) to get it deprecated.

Diehl__Martin
Novice

With a working PETSc installation, it is also trivial to trigger the runtime error: build DAMASK in release mode and run the example. That is, the last lines of the script become

make clean grid
cd examples/grid
../../bin/DAMASK_grid -l tensionX.yaml -g 20grains16x16x16.vti

Ron_Green
Moderator

@Diehl__Martin  The bug ID is CMPLRLLVM-46264.

The -ftrapuv option turned out to be interesting. One of the things this option does is initialize new memory to signaling NaNs.

You will find the same Internal Compiler Error (ICE) if you use

   -init=snan

This, in conjunction with a named block, seems to be the trigger. I was able to reduce it to the following simple reproducer:

subroutine selfTest()


  normal_distribution: block
 
    real(4), dimension(:), allocatable :: r
    real(4) :: mu, sigma

    if (.true.) &
      error stop 'math_normal(sigma)'
  end block normal_distribution

end subroutine selfTest

 

No PETSc, MPI, or modules needed, just

 

ifx -c -O0 -init=snan math.f90

 

We missed the code freeze for the upcoming Update 1, so we're looking at Update 2 for a fix, roughly early to mid-summer. If you avoid -ftrapuv and -init=snan (or -init with any other value), you can avoid this error.
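
As a stopgap until the fix ships, the two options can be filtered out of a flag list before it reaches the compiler. The following is only a sketch: the function name strip_init_flags and the variable names are made up for illustration and are not part of DAMASK or oneAPI, and it assumes the flags are kept as a simple space-separated list.

```shell
# Hypothetical helper: print the given flags without -ftrapuv and -init=...
strip_init_flags() {
  out=""
  for flag in "$@"; do
    case "$flag" in
      -ftrapuv|-init=*) ;;                 # drop the options that trigger the ICE
      *) out="${out:+$out }$flag" ;;       # keep everything else, space-separated
    esac
  done
  printf '%s\n' "$out"
}

# Example flag list (shortened for illustration)
DEBUG_FLAGS="-O0 -g -traceback -ftrapuv -check bounds"
# Word splitting of $DEBUG_FLAGS is intentional here.
SAFE_FLAGS=$(strip_init_flags $DEBUG_FLAGS)
echo "$SAFE_FLAGS"   # prints: -O0 -g -traceback -check bounds
```

The remaining flags are passed through unchanged, so the rest of a debug build configuration stays as it was.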

So what do I need to look at next for DAMASK if we build without these options?

Steve_Lionel
Black Belt

@Ron_Green wrote:

 

The -ftrapuv option turned out to be interesting. One of the things this option does is initialize new memory to signaling NaNs.


Interesting. When I was at Intel, I complained that -ftrapuv didn't initialize to a NaN, but to some weird pattern (I think it was hex 80808080). The documentation was changed to say, "Initializes stack local variables to an unusual value to aid error detection.", which it still says. But in current ifort it really does initialize to a signaling NaN, 7FBADDAD(!). 

The documentation does say that this option is ifort-only. ifx accepts it without complaint but ignores it.

Diehl__Martin
Novice

thanks!

From my side, there are no further issues related to DAMASK.

Ron_Green
Moderator

-ftrapuv is implemented in IFX.

The Dev Guide documentation is out of date. I'll get that fixed.

We implemented -ftrapuv recently, but your point about its use still holds: this is a somewhat dangerous option. Floating-point speculation can cause FP exceptions that appear in places in your code that make no sense. Be sure to read THIS CAUTIONARY WRITEUP if you are considering -ftrapuv. In short, only use -O0 with this option; don't override -O0 with -O1, -O2, or -O3.

Here is a Linux example of using -ftrapuv with ifx 2023.0.0:

module calcu

contains

 subroutine calc(ans)
  implicit none
   real :: ans
   real :: no_value

   ans = ans / no_value
 end subroutine calc

end module calcu

program trapme
use calcu
real :: a=42.0

call calc(a)
write(*,*) "a is ", a
end program trapme

and the runtime result

ifx -V -ftrapuv -g -traceback -o ftrapuv ftrapuv.f90 
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2023.0.0 Build 20221201
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.

ifx: remark #10440: Note that use of a debug option without any optimization-level option will turnoff most compiler optimizations similar to use of '-O0'
 Intel(R) Fortran 23.0-1198
GNU ld version 2.37-37.fc36

$ ./ftrapuv
forrtl: error (182): floating invalid - possible uninitialized real/complex variable.
Image              PC                Routine            Line        Source             
libc.so.6          00007F1F2183EA30  Unknown               Unknown  Unknown
ftrapuv            00000000004051E2  calc                       10  ftrapuv.f90
ftrapuv            0000000000405216  trapme                     19  ftrapuv.f90
ftrapuv            000000000040519D  Unknown               Unknown  Unknown
libc.so.6          00007F1F21829510  Unknown               Unknown  Unknown
libc.so.6          00007F1F218295C9  __libc_start_main     Unknown  Unknown
ftrapuv            00000000004050B5  Unknown               Unknown  Unknown
Aborted (core dumped)

 

Steve_Lionel
Black Belt

I tried -ftrapuv in ifx and the real variable got initialized to zero, which is why I thought it was ignored.

Ron_Green
Moderator

Did you test this on Windows? I haven't tried Windows and wonder whether it's different or broken there.

I filed a documentation bug for the -ftrapuv option.

Steve_Lionel
Black Belt

Yes, Windows. But I modified the program a bit and now I get the same SNaN. Hmm. I am also seeing:

D:\Projects>ifx /Qtrapuv t.f90
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2023.0.0 Build 20221201
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.

ifx: remark #10440: Note that use of a debug option without any optimization-level option will turnoff most compiler optimizations similar to use of '/Od'

Ron_Green
Moderator

<sigh> FTRAPUV turns off the optimization and turns on debug.  As you know, by default we set O2.  FTRAPUV resets to O0. A debug option without any explicit -O option causes this warning.    Since we used FTRAPUV, we implicitly get debug but we do not have an explicit -O option.
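
To make the rule concrete, here is a rough shell model of when the remark fires. It only illustrates the logic described above and is not the compiler's actual implementation; the function name would_warn_10440 and the short list of options treated as debug-enabling are assumptions for the example.

```shell
# Simplified model of remark #10440: succeed (exit 0) when the flag list
# contains a debug-enabling option but no explicit -O level.
would_warn_10440() {
  has_debug=no
  has_opt=no
  for flag in "$@"; do
    case "$flag" in
      -g|-debug|-ftrapuv) has_debug=yes ;;   # assumed set of debug-enabling options
      -O|-O[0-3])         has_opt=yes ;;     # any explicit optimization level
    esac
  done
  [ "$has_debug" = yes ] && [ "$has_opt" = no ]
}

if would_warn_10440 -ftrapuv -g -traceback; then
  echo "remark expected: add an explicit -O option"
fi
if ! would_warn_10440 -O0 -ftrapuv -g -traceback; then
  echo "no remark expected"
fi
```

In other words, spelling out the level (for example -O0 next to -ftrapuv) should be enough to silence the remark.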


I strongly protested when they did this. Some noisy Intel field engineer complained that he didn't know that -g, or enabling debug in general, would turn off optimization unless he explicitly added a -O[1-3] option. IMHO this is compiler options 101 that this chap apparently never learned. A few of his fellow field engineer types also claimed ignorance. Thus some management types demanded that we warn people that -g or its Windows equivalents turn off optimization and SHOULD be followed by a -O option.

Steve_Lionel
Black Belt

And yet ifort doesn't do this....  I did not see the message when I tried ifx earlier - strange.

I have always found -ftrapuv to be a mysterious option - it seems to have additional effects beyond initialization. Or is it now just an alias of -init=snan, which doesn't give this warning in ifx?

andreasskeidsvoll

Hi,


I just want to chime in and say that I've encountered segfaults related to finalization in several places in a development version of the eT program compiled with ifx 2023.1 and -qopenmp, breaking almost all of our end-to-end tests. From what I can see in gdb, the segfault always seems to occur in for.calc_num_elts, called by for_finalize, and the issue disappears when I compile without the -qopenmp flag. I have not been able to create an MWE yet, but I could prepare a branch of the eT program if anyone wants to reproduce the segfaults.

Ron_Green
Moderator

This bug is fixed in the 2024.0 release.


gumle_maasegg
Beginner

Hi,

The attached code does not compile. How can I find out more about what goes wrong? At the moment I do not have access to the 2024 version.

saue@lcpq-ampere:~/Dirac/src/amfi> ifx --version
ifx (IFX) 2023.1.0 20230320
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

saue@lcpq-ampere:~/Dirac/src/amfi> ifx amfi.f
#0 0x0000000001f63112
#1 0x0000000001fc5727
#2 0x0000000001fc5850
#3 0x00007f153d6dadc0
#4 0x0000000003a3a098
#5 0x0000000003a39e9c
#6 0x0000000002a87afb
#7 0x0000000002a87d0e
#8 0x0000000002a87999
#9 0x0000000002a841c3
#10 0x0000000002a83e51
#11 0x0000000002a86679
#12 0x0000000003a3d004
#13 0x0000000003a3c794
#14 0x0000000003a36310
#15 0x0000000003a36c4f
#16 0x0000000003a35c4e
#17 0x0000000003a35457
#18 0x0000000002e90f5d
#19 0x00000000022ebeec
#20 0x0000000002e84fbd
#21 0x00000000022f30b7
#22 0x0000000002e8519d
#23 0x00000000022eaa8a
#24 0x0000000001f08104
#25 0x0000000001f06b83
#26 0x0000000001eb5859
#27 0x00000000020797c5
#28 0x00007f153d6c524d __libc_start_main + 239
#29 0x0000000001cf1729

amfi.f: error #5633: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.
compilation aborted for amfi.f (code 3)

Ron_Green
Moderator

Perhaps I should have been clearer: this bug is fixed FIRST in version 2024.0. All older versions have this bug. You need to upgrade to 2024.0 (the latest).
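
If you are unsure which compiler a build picks up, a quick check of the reported version can help. This is only a sketch: it assumes the first line of `ifx --version` has the form shown earlier in this thread ("ifx (IFX) 2023.1.0 20230320"), and the function name ifx_at_least_2024 is made up for illustration.

```shell
# Succeeds (exit 0) if the given `ifx --version` line reports 2024.0 or newer.
# Assumes the version number is the third whitespace-separated field.
ifx_at_least_2024() {
  version=$(printf '%s\n' "$1" | awk '{print $3}')
  major=${version%%.*}          # text before the first dot, e.g. 2023
  [ "$major" -ge 2024 ]
}

line="ifx (IFX) 2023.1.0 20230320"   # version line quoted earlier in this thread
if ifx_at_least_2024 "$line"; then
  echo "fix should be included"
else
  echo "upgrade to 2024.0 or newer"
fi
# With a compiler on PATH: ifx_at_least_2024 "$(ifx --version | head -n 1)"
```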

Steve_Lionel
Black Belt

It compiles OK for me using version 2024.0.2. Internal compiler errors are always compiler bugs, and it seems that whatever this one was, it got fixed. Sometimes one can identify a compiler option that triggers the error, but you used none.
