Compilation of f77 with intel2016: runs but all 0

Christophe_O_ · ‎03-09-2017

Dear programmers,

In principle I develop my codes with C++/Python, but today I am one of those lucky guys who must use a legacy code in Fortran 77 parallelized with MPI. I have been using this code for years, and it has been compiled and has worked with different compilers, among them Intel 10.

Now we have installed the intel 2016 and it no longer works. It compiles successfully, runs...but the result is unexpected. All the data are 0.

Does it sound familiar to you? Do you know any compilation option to comply with the f77 standard and that could help me detect any error due to old format (for instance)?

I thank you very much in advance for any help.

Christophe

mecej4 · ‎03-09-2017

I suggest that you put MPI, etc. aside and get the code to work in serial mode first. Modern Fortran is almost 100 percent backwards compatible, so the old code should be possible to run. However, the code may depend on some non-standard but common conventions such as variables being saved and/or initialized to zero at program start. Please provide some details on the F77 code, or provide the code itself along with input data and expected output.

Christophe_O_ · ‎03-09-2017

Sorry, impossible to provide the code. Too big (thousands of lines) and also for ownership reasons. It is released by the Nuclear Energy Agency.

What I do not understand is why it works with Intel 10 but has never worked with any later version.

I have the code working in serial mode. Maybe I can try and see if it produces also all these 0.

Kevin_D_Intel · ‎03-09-2017

The article "Consistency of Floating-Point Results using the Intel® Compiler" discusses compiler options and may help explain some differences.

jimdempseyatthecove · ‎03-09-2017

What mecej4 is suggesting is

Run the executable directly from the command line as opposed to mpiexec, mpirun, ...

IOW run as 1 Rank.

If this program produces incorrect results (all the data are 0), then the issue resides within Fortran. Most likely:

mecej4>> However, the code may depend on some non-standard but common conventions such as variables being saved and/or initialized to zero at program start.

If the program produces correct results, then you may have an MPI issue, such as your program exiting on an MPI error without notifying you of the abnormal exit (IOW your pre-wiped results data is returned wiped).

Jim Dempsey

Christophe_O_ · ‎03-10-2017

Dear Jim,

I did what you and mecej4 suggested. I installed the code without MPI and ran it in sequential mode. Same result, all 0. So it is an issue with Intel Fortran, not MPI.

Dear Kevin,

I clicked on the link you posted but it is not found. Could you please send me the full link ? Thanks in advance.

Christophe

Kevin_D_Intel · ‎03-10-2017

Here you go: https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler

mecej4 · ‎03-10-2017

Christophe O. wrote:
I did what you and mecej4 suggested. I installed the code without MPI and ran it in sequential mode. Same result, all 0. So it is an issue with Intel Fortran, not MPI.

That is actually good news, because it lets us remove a couple of irrelevant complications. Furthermore, your finding that the results are all 0 is also heartening, because it almost eliminates considerations of numerical precision/reproducibility.

I think that the problem is likely to be quite simple in nature. However, you will need to do one of two things: (i) construct a reproducer that captures the problem but without the complexity, or (ii) whittle down your NEA code such that what is left is not subject to restrictions on release, while preserving the 0-s in the output.

You can also "burn the candle from both ends" and use a combination of (i) and (ii).

jimdempseyatthecove · ‎03-10-2017

When you experience problems as described, it almost always ends up as being a latent bug in the program that didn't show up until now.

You may reduce the debugging effort by employing the technique of "peeling the onion". Modify your program (keeping it such that it continues to exhibit the good (expected) results on the small case and fails on the large case) so that as you enter deeper into your call levels you log inputs, and as you return upwards from the call levels you log outputs. I suggest you log the information to a file; then you can use a difference program to help locate where the data diverge.

You might want to consider using FPP and define two macros. One to log inputs, and one to log outputs. These can be defined as "blank" when running in release mode (or when a different macro is defined/undefined).
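The comparison step could look something like the following minimal sketch. It assumes each build writes one line per logged quantity in the form `label value value ...`; the log format, the labels such as `sub_a:in`, and the function names are illustrative assumptions, not anything taken from the NEA code. A tolerance is used so that harmless last-digit differences between compilers are not reported as the first divergence.

```python
import math

def parse_line(line):
    """Split a log line into a label and a list of floats.

    Fortran 'D' exponents (e.g. 0.5D+01) are converted so float() accepts them.
    Assumes every field after the label is numeric.
    """
    parts = line.split()
    if not parts:
        return "", []
    label, fields = parts[0], parts[1:]
    return label, [float(f.replace("D", "E").replace("d", "e")) for f in fields]

def first_divergence(good_lines, bad_lines, rel_tol=1e-6, abs_tol=1e-12):
    """Return (line_number, good_line, bad_line) for the first mismatch, or None."""
    for n, (g, b) in enumerate(zip(good_lines, bad_lines), start=1):
        g_label, g_vals = parse_line(g)
        b_label, b_vals = parse_line(b)
        same = (
            g_label == b_label
            and len(g_vals) == len(b_vals)
            and all(
                math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
                for x, y in zip(g_vals, b_vals)
            )
        )
        if not same:
            return n, g.rstrip(), b.rstrip()
    if len(good_lines) != len(bad_lines):
        # One log ended early: report the first line that has no partner.
        n = min(len(good_lines), len(bad_lines)) + 1
        g = good_lines[n - 1].rstrip() if n <= len(good_lines) else "<missing>"
        b = bad_lines[n - 1].rstrip() if n <= len(bad_lines) else "<missing>"
        return n, g, b
    return None

# Made-up sample logs: the inputs agree, the first output already differs.
good = ["sub_a:in 0.1000000D+01 2.5", "sub_a:out 3.75", "sub_b:out 1.25E-03"]
bad = ["sub_a:in 0.1000000D+01 2.5", "sub_a:out 0.0", "sub_b:out 0.0"]
print(first_divergence(good, bad))
```

With real log files, the two lists would come from `open(path).read().splitlines()` for the working and the failing build. The first reported line points at the deepest routine whose inputs still agree but whose outputs do not.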

Jim Dempsey

Christophe_O_ · ‎03-13-2017

jimdempseyatthecove wrote:

When you experience problems as described, it almost always ends up as being a latent bug in the program that didn't show up until now.

You may reduce the debugging effort by employing the technique of "peeling the onion". Modify your program (keeping it such that it continues to exhibit the good (expected) results on the small case and fails on the large case) so that as you enter deeper into your call levels you log inputs, and as you return upwards from the call levels you log outputs. I suggest you log the information to a file; then you can use a difference program to help locate where the data diverge.

I do not think it is due to a bug. I have been using this code for years and others since the 70s and it always worked well. It worked well until version 10 of intel. For later versions, it never worked.

Question: What has changed from version 10 to 11?

Since it is clearly a matter of compiler, I think the method you suggest is not feasible. There is no bug to look for.

Christophe

Kevin_D_Intel · ‎03-13-2017

The Release Notes (RNs) contain information about changes from a previous release.

jimdempseyatthecove · ‎03-13-2017

Does your code contain convergence code? If so, then disable any of the options that favor speed (e.g. lesser precision in transcendentals for faster results), and set -fp-model to strict or source, and -fp-speculation to strict.

I've experienced many cases where convergence code fails due to a literal/constant being used that worked 50 years ago (several generations ago), but now suddenly fails.

Jim Dempsey

mecej4 · ‎03-13-2017

Given all the constraints stated, I see few options to resolve the problem. Without a test case, there is practically nothing that a compiler vendor can do. Given that the last version that gave correct results (possibly with an incorrect program) is a seven-year-old version, there can be little justification to go probing for unknown causes.

There are a couple of checks that I would perform in your position. Run the old and new compilers configured with the same non-default options as in your previous tests on one of your source files, but specify that a listing should be produced. The listings will contain a list of all the options in effect. Compare the old and new lists.

Steve_Lionel · ‎03-13-2017

I have seen many programs that have "run correctly for years" but break with a newer compiler, not because of compiler bugs (though that does happen), but because the source violates language rules that earlier compilers did not rely on being followed. It is pointless to ask "what has changed", as there are thousands of small changes or tweaks to optimization that can change results, and there's no way for you to identify a cause by perusing such a list.

Christophe_O_ · ‎03-15-2017

Hi all,

I tried the options suggested by Jim. They had no effect.

I think that the best option, as suggested by mecej4, is to compile the code with intel 10 and with intel 16, and compare all the options in effect.

How can I produce the list of options when compiling ?

Christophe

jimdempseyatthecove · ‎03-15-2017

From mecej4's #13 post:

Run the old and new compilers configured with the same non-default options as in your previous tests on one of your source files, but specify that a listing should be produced. The listings will contain a list of all the options in effect. Compare the old and new lists.

ifort -S yourFile.f90

Jim Dempsey

mecej4 · ‎03-15-2017

On Linux, the -list option should suffice. The -S option produces an annotated assembler file, which may not be what you want at this juncture.

jimdempseyatthecove · ‎03-15-2017

Thanks for the revision. I read the forum messages on Windows and had the Windows options open.

I am surprised the optimization reports do not include the compiler options (specified on command line and implicitly used internally).

Jim Dempsey

Christophe_O_ · ‎03-17-2017

These are the options of compilation for version 10 and version 2016.

Sorry but I do not have the knowledge to understand all these options and see why version 16 does not work.

Christophe

Options Intel 10

Version 10.1
/opt/intel//fce/10.1.015/bin/fortcom -D__INTEL_COMPILER=1010 -D_MT -D__ELF__ -D__INTEL_COMPILER_BUILD_DATE=20080312 -D__unix__ -D__unix -D__linux__ -D__linux -D__gnu_linux__ -Dunix -Dlinux -D__x86_64 -D__x86_64__ -mGLOB_pack_sort_init_list -I/home/u5751/marlowe_sequential/execute -I. -I/opt/intel//fce/10.1.015/include -I/opt/intel//fce/10.1.015/substitute_headers -I/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include -I/usr/local/include -I/usr/include -I/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include "-fp_modspec fp_speculation_FAST" -V -O2 -mP1OPT_version=1010 -mGLOB_source_language=GLOB_SOURCE_LANGUAGE_F90 -mGLOB_tune_for_fort -mGLOB_use_fort_dope_vector -mP2OPT_static_promotion -mP1OPT_print_version=FALSE -mP3OPT_use_mspp_call_convention -mCG_use_gas_got_workaround=F -mP2OPT_align_option_used=TRUE "-mGLOB_options_string=-i_dynamic -v -list -c -O" -mGLOB_cxx_limited_range=FALSE -mP2OPT_eh_nirvana -mGLOB_diag_file=singleb.diag -mGLOB_as_output_backup_file_name=/tmp/ifortDILPvWas_.s -mGLOB_machine_model=GLOB_MACHINE_MODEL_EFI2 -mGLOB_fp_speculation=GLOB_FP_SPECULATION_FAST -mGLOB_extended_instructions=0x8 -mP2OPT_subs_out_of_bound=FALSE -mGLOB_ansi_alias -mIPOPT_ninl_user_level=2 -mIPOPT_args_in_regs=0 -mPGOPTI_value_profile_use=T -mGLOB_opt_level=2 -mIPOPT_activate -mIPOPT_lite -mP2OPT_hlo_level=2 -mP2OPT_hlo -mPAROPT_par_report=1 -mCG_emit_as_seg_grouping -mIPOPT_obj_output_file_name=singleb.o "-mGLOB_linker_version=(GNU Binutils) 2.21.52.20110707" -mP3OPT_asm_target=P3OPT_ASM_TARGET_GAS -mGLOB_obj_output_file=singleb.o -mGLOB_source_dialect=GLOB_SOURCE_DIALECT_FORTRAN -mP1OPT_source_file_name=/home/u5751/marlowe_sequential/execute/singleb.f /home/u5751/marlowe_sequential/execute/singleb.f

Options Intel 16

/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/bin/intel64/fortcom -D__INTEL_COMPILER=1600 -D__INTEL_COMPILER_UPDATE=3 -D__unix__ -D__unix -D__linux__ -D__linux -D__gnu_linux__ -Dunix -Dlinux -D__ELF__ -D__x86_64 -D__x86_64__ -D__amd64 -D__amd64__ -D__INTEL_COMPILER_BUILD_DATE=20160415 -D__INTEL_OFFLOAD -D__i686 -D__i686__ -D__pentiumpro -D__pentiumpro__ -D__pentium4 -D__pentium4__ -D__tune_pentium4__ -D__SSE2__ -D__SSE2_MATH__ -D__SSE__ -D__SSE_MATH__ -D__MMX__ -mGLOB_pack_sort_init_list -I/home/cortiz/marlowe_sequential/source/mpp -I. -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/ipp/include -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/mkl/include -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/tbb/include -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/daal/include -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/compiler/include/intel64 -I/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/compiler/include -I/usr/local/include -I/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include -I/usr/include/ -I/usr/include "-V tabex.lst" -O2 -simd -offload_host -mGLOB_em64t=TRUE -mP1OPT_version=16.0-intel64 -mGLOB_diag_file=/tmp/ifortBwXNiq.diag -mGLOB_source_language=GLOB_SOURCE_LANGUAGE_F90 -mP2OPT_static_promotion -mP1OPT_print_version=FALSE -mCG_use_gas_got_workaround=F -mP2OPT_align_option_used=TRUE -mGLOB_gcc_version=485 "-mGLOB_options_string=-v -list -O -o mpp -L/home/cortiz/marlowe_sequential/execute/object -ltk" -mGLOB_cxx_limited_range=FALSE -mCG_extend_parms=FALSE -mGLOB_compiler_bin_directory=/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/bin/intel64 -mGLOB_as_output_backup_file_name=/tmp/ifortgKhZvxas_.s -mGLOB_dashboard_use_source_name -mIPOPT_activate -mIPOPT_lite -mGLOB_instruction_tuning=0x0 -mGLOB_product_id_code=0x22006d8f -mCG_bnl_movbe=T -mGLOB_extended_instructions=0x8 -mP3OPT_use_mspp_call_convention 
-mP2OPT_subs_out_of_bound=FALSE -mP2OPT_disam_type_based_disam=2 -mGLOB_ansi_alias -mPGOPTI_value_profile_use=T -mGLOB_opt_report_use_source_name -mP2OPT_il0_array_sections=TRUE -mGLOB_offload_mode=1 -mP2OPT_offload_unique_var_string=ifort0101665725369tR57x4 -mGLOB_opt_level=2 -mP2OPT_hlo_level=2 -mP2OPT_hlo -mP2OPT_hpo_rtt_control=0 -mIPOPT_args_in_regs=0 -mP2OPT_disam_assume_nonstd_intent_in=FALSE -mGLOB_imf_mapping_library=/opt/intel/2016_update3/compilers_and_libraries_2016.3.210/linux/bin/intel64/libiml_attr.so -mPGOPTI_gen_threadsafe_level=0 -mIPOPT_lto_object_enabled -mIPOPT_lto_object_value=1 -mIPOPT_obj_output_file_name=/tmp/ifortR8f4pL.o -mIPOPT_whole_archive_fixup_file_name=/tmp/ifortwarchEU0YMm -mGLOB_linker_version=2.23.52.0.1 -mGLOB_long_size_64 -mGLOB_routine_pointer_size_64 -mGLOB_driver_tempfile_name=/tmp/iforttempfilexjnZ3b -mP3OPT_asm_target=P3OPT_ASM_TARGET_GAS -mGLOB_async_unwind_tables=TRUE -mGLOB_obj_output_file=/tmp/ifortR8f4pL.o -mGLOB_source_dialect=GLOB_SOURCE_DIALECT_FORTRAN -mP1OPT_source_file_name=/home/cortiz/marlowe_sequential/source/mpp/tabex.f -mP2OPT_symtab_type_copy=true /home/cortiz/marlowe_sequential/source/mpp/tabex.f

Steve_Lionel · ‎03-17-2017

I will note that all of the mGLOB and mPxxx options are undocumented internal options not exposed on the command line. These are generated by the compiler driver when it invokes the actual compiler.

But looking at the options is a waste of time. Instead, instrument the program with intermediate results displayed, run both versions and see where they start to differ. You should then easily be able to identify a particular operation that gives different results. Only then can you start to figure out why this is happening. Might be a compiler bug, might be a new optimization whose assumptions aren't met, might be a latent source error.

mecej4 · ‎03-17-2017

Apart from all the other differences that one can see in the expanded options lists in #19, which may or may not matter, there is one glaring difference. With Ifort-10.1 you compiled source file singleb.f. With Ifort-2016, you compiled source file tabex.f.
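For anyone who wants to compare such lists mechanically rather than by eye, here is a minimal sketch. It assumes the two `fortcom` command lines have been pasted into strings; the strings below are heavily shortened excerpts of the two lists posted above, and the function names are illustrative. Since the `-mGLOB`/`-mP...` entries are undocumented internal options, the output only shows what differs, not whether a difference matters.

```python
import shlex

def option_set(command_line):
    """Tokenise a command line as a shell would and drop the executable name."""
    return set(shlex.split(command_line)[1:])

def compare_options(old_cmd, new_cmd):
    """Return (options only in old, options only in new), each sorted."""
    old, new = option_set(old_cmd), option_set(new_cmd)
    return sorted(old - new), sorted(new - old)

# Shortened excerpts of the two posted command lines (paths omitted).
old_cmd = (
    'fortcom -O2 "-fp_modspec fp_speculation_FAST" '
    "-mGLOB_fp_speculation=GLOB_FP_SPECULATION_FAST "
    "-mP2OPT_static_promotion singleb.f"
)
new_cmd = "fortcom -O2 -simd -offload_host -mP2OPT_static_promotion tabex.f"

only_old, only_new = compare_options(old_cmd, new_cmd)
print("only in old:", only_old)
print("only in new:", only_new)
```

`shlex.split` keeps a quoted group such as `"-fp_modspec fp_speculation_FAST"` together as one token, which a plain `str.split()` would not. Temporary file names and install paths differ between any two runs, so on the full lists it may help to filter out tokens containing `/tmp/` or the install prefix before comparing.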

I second Steve Lionel's advice in #14. I have also come across codes used by tens of thousands of users for decades that contained sporadic bugs. By chance, I encountered them in one case. Using proper tools, I was able to pinpoint the error, and I communicated the finding to the govt. lab that created and maintains the software. They acknowledged that they knew of the existence of the bugs, but said that the author of the code had died and they were going to let things stay as they were.

It is up to you to construct an example code that you can share here and establish that there is a valid complaint.