Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Strange issue with Intel 10.1 fortran Compiler

fivos
Beginner
Hi to everyone,

I am facing a very strange issue with the Intel Fortran compiler 10.1:
I wrote a program using CVF (Compaq Visual Fortran) on a Windows XP PC. After some modifications I found that one subroutine in the source code was unnecessary, so I removed it. I also removed the common blocks shared by the main program and the deleted subroutine. The source code compiles with CVF with no warnings and no errors, and the executable runs as it should. (CVF estimates that the executable uses about 640 MB of RAM.)

Now I want to compile and run that source code on a Red Hat Linux machine, so I compiled the same source code with the Intel Fortran compiler 10.1 installed on that machine. Again there were no warnings and no errors. But when I try to run it, the program hangs and, from what I can see with the top command, it consumes inexplicably high CPU and all the available memory (plus swap). The Linux machine has 2 x quad-core Xeon processors and 8 GB of RAM.

The initial, unmodified version of the code worked perfectly on both machines. By changing the code step by step I found that:
- if I just delete the subroutine and retain the common block in the main program, the executable runs.
- the problem occurs after deleting from the main program the common block of variables shared between the main program and the deleted subroutine, even though no other subroutine uses or shares these variables.
After more tests, I found that this happens after removing one specific variable from the block. Again, this variable is not used by any other subroutine and is defined in the main program before any calculations.

I can't understand:

- Why does the program hang with the Intel compiler, given that I explicitly define the specific variable in the main program?

- Why does the executable from Intel Fortran hang, whereas the executable from CVF works as it should?

No special flags, optimizations, etc. were used with either CVF or Intel Fortran.

Could this be a compiler bug? Or am I missing something?

Any ideas would be appreciated.

Thanks in advance.
jimdempseyatthecove
Honored Contributor III
Quoting - tim18
It's always risky to fiddle with working COMMON blocks in legacy code. I don't know how you could tell by inspection that the program won't break with such a change. You would want at least to run with -check before and after the change.

I agree with Tim. As long as you use COMMON blocks from a working legacy application, do not touch them, remove them, or move them. Too often, when you do this, a "temp" variable that was used from one COMMON block ends up being picked up from a different COMMON block, often with disastrous consequences.

Ask yourself: how many of your named COMMON blocks use the variable name TEMP (or X, or Y, or some other innocuous name)?

By removing, reordering COMMON blocks, or adding or removing variables within COMMON blocks you run the risk of disturbing a working environment.

One of the later posters suggests using modules. I agree with this. The strategy that I suggest is to create, within a module, a user-defined type and a single variable of that type holding the contents of the named common block. The following instructions may at first seem overly complicated. However, they are designed to help you locate potential pitfalls later and to speed up the conversion process.

COMMON /abc/ ...

becomes

module mod_abc
  type t_abc
    sequence
    ... ! your variables here
  end type t_abc

  type(t_abc) :: COMMON_abc ! temporarily prefix with COMMON_
end module mod_abc
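
As a concrete sketch of the module above, assume a hypothetical block `COMMON /abc/ TEMP, N, X(100)`. The member names, types, and the small driver program are illustrative assumptions, not taken from the original code; the driver uses fully qualified names only to show that the module compiles and runs on its own.

```fortran
! Hypothetical original declaration being replaced:
!     COMMON /abc/ TEMP, N, X(100)
module mod_abc
  implicit none
  type t_abc
    sequence                 ! keep members in declaration order, as in the COMMON
    real    :: temp
    integer :: n
    real    :: x(100)
  end type t_abc

  type(t_abc) :: COMMON_abc  ! temporarily prefixed with COMMON_
end module mod_abc

program demo_abc
  use mod_abc
  implicit none

  ! Every member is assigned before it is read.
  COMMON_abc%n = 3
  COMMON_abc%x = 0.0
  COMMON_abc%x(1:COMMON_abc%n) = (/ 1.0, 2.0, 3.0 /)
  COMMON_abc%temp = sum(COMMON_abc%x(1:COMMON_abc%n))

  print *, 'temp =', COMMON_abc%temp
end program demo_abc
```

Because the former COMMON members now live inside one variable, any routine that still refers to a bare `TEMP` under IMPLICIT NONE fails to compile, which is what makes the stray references visible.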

Then add USE mod_abc to the functions and subroutines that were using the COMMON.
Do this for all of your common blocks
Make sure you insert IMPLICIT NONE into your subroutines and functions

Now compile and get a bazillion errors

For a subroutine containing an error, identify the variables that are now contained within a new module. Create a new file in your project, in the place where you store your modules, named DEF_abc.INC, following the naming convention of your MOD_abc.MOD file.

Into this file insert, for example when TEMP is reported as missing:

! DEF_abc.INC - defines for MOD_abc.MOD
#define TEMP COMMON_abc%TEMP

Add the defines for the other variables *** that you know used to be contained within the COMMON block named abc.

Add an FPP include to this source file:

#include "DEF_abc.INC"


*** insert the #include at/near the top of the source file, NOT inside the subroutine
The FPP defines are not scoped. The defines are active from the #include statement downwards through the file.
Add the preprocess file to the project

Compile the first file with errors (from the bazillion-errors pass) and work out its compilation errors.

Once that first file compiles, place the #include "DEF_abc.INC" in the other files:

Perform a global (solution) file search for all files now containing MOD_abc.
Go to the top of each file (beneath the title line) and paste in the #include "DEF_abc.INC".
Repeat for the remaining files.

Run Build All. Identify new errors referencing the variables from the old COMMON /abc/ and add the #defines to "DEF_abc.INC".
Repeat until there are no more errors for variables from the old COMMON /abc/.
Variables that are NOT #defined in "DEF_abc.INC" are variables that are not (or no longer) used. Do not delete them at this time (you may have other solutions or projects that use them).

Continue on to resolve the errors for the next MOD_xyz file containing the variables that used to be contained in COMMON /xyz/

Then on to the next file, etc...

For unnamed common blocks this is a little trickier, as you do not know which variables are used to pass data from routine to routine and which variables are temps. I recommend you construct individual modules per subroutine under the assumption that the majority of the variables will be local temporaries. As you identify the variables that are globally shared, create a MOD_all (and an appropriate DEF_all.INC).
When you identify variables shared only by a subset of files, create a separate shared module.


Once you have no errors, enable all run-time error checks, cross your fingers, and make a test run.

Work out any problems

When you are satisfied this is working properly

Remove the "COMMON_" prefix from one of the user-defined type variables (COMMON_abc becomes abc).
Search for DEF_abc.INC in your source files and remove this line from all files.
DO NOT work on more than one COMMON_ at a time (you may have shared names in multiple former commons).

Compile and get your errors.
On each variable with an error, add the "abc%" prefix (replacing abc with the appropriate name).
Build again until you get no additional errors.
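
For illustration, the end state for one block might look like the sketch below. It assumes a hypothetical block with members `temp` and `n` and a hypothetical routine `scale_temp`; none of these names come from the original program. The `COMMON_` prefix is gone and every reference carries the `abc%` qualifier explicitly.

```fortran
module mod_abc
  implicit none
  type t_abc
    sequence
    real    :: temp
    integer :: n
  end type t_abc

  type(t_abc) :: abc         ! COMMON_ prefix removed
end module mod_abc

! Hypothetical routine that formerly declared COMMON /abc/.
subroutine scale_temp(factor)
  use mod_abc
  implicit none
  real, intent(in) :: factor
  abc%temp = abc%temp * factor   ! explicit "abc%" instead of a bare TEMP
end subroutine scale_temp

program demo_final
  use mod_abc
  implicit none

  abc%temp = 2.0
  abc%n    = 1
  call scale_temp(3.0)
  print *, 'abc%temp =', abc%temp
end program demo_final
```

With the qualifier written out, a reader can tell at a glance which former block a variable belongs to, so two blocks that both contain a `TEMP` can no longer be confused.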

Move on to the next DEF_....INC file
repeat until done

Once you have no errors, enable all run-time error checks, cross your fingers, and make a test run.

Now that you have things in separate modules, you will find the code much easier to maintain and, more importantly, easier to extend with OpenMP, since it is easier to isolate variables and avoid adverse interactions.

Jim Dempsey




TimP
Honored Contributor III
It's always risky to fiddle with working COMMON blocks in legacy code. I don't know how you could tell by inspection that the program won't break with such a change. You would want at least to run with -check before and after the change.
clabra
New Contributor I
When I moved my code from CVF to Intel on Linux, I moved the COMMON blocks to modules and it works better.
fivos
Beginner

First of all, I have to thank you for replying to my question.

I have tried compiling the source code that causes issues with the Intel compiler 10.1 using the Intel Fortran compiler 11.0. Now the executable does not hang; instead it gives the following:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source

Stack trace terminated abnormally.

Again, if I leave the specific common block with the specific variable intact, there is no problem and the program runs as it should. I know that SIGSEGV means that the program tries to read from or write to inaccessible memory. How can I debug this so that it stops happening?

Any ideas?


UPDATE:

Now things are getting even crazier! If I use the debug flag (ifort -debug), the program runs without giving the aforementioned error!
What is this?


fivos
Beginner
After extensive testing and debugging I was able to track down the cause of the errors. From what I have seen, the Intel Fortran compiler initializes all variables declared in COMMON to zero (0). Otherwise, if a variable is declared but not initialized, it gets a random value (or rather an unpredictable number, such as 3475900 in my case). Such values, when used as array indices, cause the segmentation violation that I experienced.

Unfortunately, CVF automatically initializes all declared variables to zero (COMMON or not). That is why:
- the executables built from the code with the common block ran properly with both compilers
- the executable built by CVF from the modified source code (without the common block) was able to run, but the one built by the Intel compiler did not.

Well, it seems that I have to be more careful with the initialization of variables next time.

Again I have to thank the other posters for their time and effort.


Steven_L_Intel1
Employee

No, Intel Fortran does not initialize COMMON variables to zero. No, CVF did not initialize all variables to zero. You may have found variables that started with these values, but it was not a deliberate action of the compiler. Uninitialized variables have unpredictable values. It is a frequent programming error to assume variables start with a value of zero.

It is true that in Intel Fortran you are more likely to find non-COMMON, scalar variables to have non-zero initial values because they are allocated on the stack by default, whereas CVF allocated all variables statically by default, but it is still a programming error if your program breaks because of this.
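
A minimal sketch of the defensive pattern this implies, assuming a hypothetical index `idx` and array `a` (neither name comes from the original program): every variable is assigned before its first use, and an index is range-checked before it is used as a subscript.

```fortran
program init_demo
  implicit none
  integer, parameter :: nmax = 10
  integer :: idx          ! local scalar: undefined until assigned
  real    :: a(nmax)

  ! Assign every variable before its first use;
  ! never rely on a zero start value.
  idx = 1
  a   = 0.0

  ! Guard an index before using it as a subscript.
  if (idx < 1 .or. idx > nmax) then
    print *, 'index out of range:', idx
    stop 1
  end if

  a(idx) = 42.0
  print *, 'a(idx) =', a(idx)
end program init_demo
```

If the two assignments are removed, the program may still appear to work with one compiler or option set and fail with another, which matches the behaviour reported in this thread. Building with the run-time checks mentioned earlier (-check) is one way to look for such cases, though it may not catch every uninitialized use.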