Is there a user-facing way to get a more useful diagnostic than "internal compiler error"?
For background, we're getting random ICEs. Sometimes the code compiles just fine, sometimes it doesn't, and the failures occur at different places in the code. It's a large code base that isn't shareable, and the random ICEs persist across different machines/architectures and compiler versions. The code compiles just fine using GCC.
Currently we're just automating re-compiling until a build completes without an ICE, which is obviously less than ideal. Since the code can compile without an ICE, I suspect something is wrong or misconfigured on our systems, but random "ICE, please report this" messages with no further information are pretty worthless for figuring out what might be going wrong.
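For anyone automating the same kind of retry loop, here is a minimal sketch in Python that also keeps the output of every failed attempt, so the ICE locations can be compared afterwards. The function name `build_with_retries`, the `ice_logs` directory, and the default of 5 attempts are made up for illustration; the build command is whatever you already run.

```python
import subprocess
from pathlib import Path


def build_with_retries(cmd, max_attempts=5, log_dir="ice_logs"):
    """Run cmd until it exits with status 0; return the successful attempt number.

    The combined stdout/stderr of every failed attempt is written to
    log_dir/attempt_<n>.log (hypothetical naming) so failures can be compared.
    """
    log_path = Path(log_dir)
    log_path.mkdir(parents=True, exist_ok=True)
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Keep the evidence from this failure instead of discarding it.
        (log_path / f"attempt_{attempt}.log").write_text(
            result.stdout + result.stderr
        )
    raise RuntimeError(f"{cmd!r} failed {max_attempts} times; see {log_path}")
```

Usage would be something like `build_with_retries(["make", "--jobs", "8"])`. Comparing the saved logs at least shows whether the failures cluster around particular files.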
I think you are saying that you get non-deterministic ICEs with the same compiler options and sources? An ICE should be deterministic, assuming you keep re-running with the same compiler options on the same sources.
On the server where you see this, how much /tmp space is available? One way a compiler can ICE non-deterministically is when there is not enough scratch space for its temp files. Another is when the filesystem backing /tmp does not implement flock() file locking correctly.
If you are compiling with make/cmake and doing a parallel build with --jobs, what happens when you set --jobs 1? Does the ICE go away?
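A quick way to check both points on a given machine is sketched below in Python. It reports the free space in the temp directory and tests whether a second exclusive flock() on the same file is refused while the first is held. The function name `check_scratch_dir` and the result keys are hypothetical, and the directory used is whatever Python's `tempfile.gettempdir()` reports, which may not be the one the compiler uses.

```python
import fcntl
import os
import shutil
import tempfile


def check_scratch_dir(path=None):
    """Report free space and basic flock() behaviour for a scratch directory."""
    path = path or tempfile.gettempdir()
    usage = shutil.disk_usage(path)

    # Open the same file twice so the two locks belong to different
    # open file descriptions and should therefore conflict.
    fd1, name = tempfile.mkstemp(dir=path)
    fd2 = os.open(name, os.O_RDWR)
    try:
        fcntl.flock(fd1, fcntl.LOCK_EX)
        try:
            fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
            second_lock_refused = False  # suspicious: lock was not exclusive
        except OSError:
            second_lock_refused = True   # expected on a local filesystem
    finally:
        os.close(fd2)
        os.close(fd1)
        os.unlink(name)

    return {
        "path": path,
        "free_bytes": usage.free,
        "second_lock_refused": second_lock_refused,
    }
```

If `second_lock_refused` comes back `False`, or the first `flock()` call raises, the filesystem is worth a closer look. Network filesystems in particular can behave differently from local disks here.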
What compiler version, OS distro and version?
Again, given the same source and same options, the ICE or non-ICE should be deterministic.
It's unfortunately not deterministic in any controllable way; the exact same build can work one time and fail the next.
I have managed to find the culprit, if not the root cause: "-warn interfaces" combined with a parallel build.
Turning off interface warnings, combining interface warnings with "-nogen-interfaces", or building in serial appears to fix things. Since the failure is pseudo-random I can't be 100% sure, but at the very least the frequency of problems goes way down. I'm not immediately sure whether this is due to a problem on our end.
Also noticed that the documentation might have a bug: https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/compiler-reference/compiler-options/compiler-diagnostic-options/warn.html
The "warn interfaces" part of the table states "-no-gen-interfaces", which I think should be "-nogen-interfaces".
1) Temp space seems fine.
2) Has been seen on both RHEL and Rocky Linux, with various flavors of ifort 18 and 19. Unfortunately, due to restrictions beyond our control, even if a newer compiler fixed things we'd be unable to migrate at this time.
"-warn interfaces" turns on "-gen-interfaces". That causes two files to be created: .f90 and .mod. The .f90 file may not be complete.
I wonder if the parallel build is clobbering some of those files.
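One way to look for clobbering candidates is to check whether two source files define a procedure with the same name. The sketch below assumes the generated interface files are named after the procedure (something like `foo__genmod.f90`) and all land in one shared directory; that naming is an assumption here, so check it against what your build actually produces. The function name `find_duplicate_procedures` is hypothetical, and the regex is a rough heuristic that does not distinguish external procedures from module procedures.

```python
import re
from collections import defaultdict
from pathlib import Path

# Heuristic: a line that is not an "end ..." statement and has a
# "subroutine NAME" or "function NAME" before any "!" comment marker.
PROC_RE = re.compile(
    r"^(?![ \t]*end\b)[^!\n]*?\b(?:subroutine|function)\s+(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def find_duplicate_procedures(root, suffixes=(".f90", ".F90", ".f", ".F")):
    """Map each procedure name defined in more than one file to those files."""
    seen = defaultdict(set)
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Skip previously generated interface files (assumed naming).
        if path.name.endswith("__genmod.f90"):
            continue
        text = path.read_text(errors="replace")
        for match in PROC_RE.finditer(text):
            # Fortran names are case-insensitive, so normalise to lowercase.
            seen[match.group(1).lower()].add(str(path))
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Any name reported here is a place where two parallel compile jobs could plausibly write the same generated file at the same time. An empty result does not rule out clobbering, for example when the same source file is compiled by more than one target.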