Intel® Fortran Compiler

truncated strings

DavidWhite
Valued Contributor II
2,268 Views

I was just surprised by a bug in my code about which I did not get a warning:

character(LEN=4) :: name

name = "DAVID"

There is no compiler warning that "DAVID" gets truncated when stored in name.

Is there any compiler setting to force this warning?  I could not see one.

Thanks,

David

23 Replies
GVautier
New Contributor II
1,784 Views
Hello. This is intrinsic behavior of Fortran. The language allows you to store a value in a variable whose actual size is smaller than the value: an integer*4 into an integer*2 variable, a character*100 into a character*2 variable, and so on. So there is no warning, and the result may not be what you expected. Be careful.
mecej4
Honored Contributor III
1,784 Views

As GVautier points out, assigning a 5-character value to a 4-character variable is not a bug, but a feature of the language. It is a bug only in the sense that the code does something different from what you intended.

A compiler, however, may be able to help you catch such code. There are also many utilities and Lint-like tools that you can run your code through. Ftnchek says:

      4 name = "DAVID"
             ^
Warning near line 4 col 6 file dav.f90: char*5 const "DAVID" truncated to
 char*4 NAME

GFortran with -Wall says something very similar to this. Silverfrost FTN95 says

0004) name = "DAVID"
COMMENT - Only the first 4 character(s) of this constant will be used
    NO ERRORS, 1 COMMENT  [<XCHAR> FTN95/Win32 v8.00.0]

Note the level of the message: COMMENT, not WARNING or ERROR.

andrew_4619
Honored Contributor II
1,784 Views

You could perhaps have made it allocatable, so that it would allocate on assignment, or defined it as a parameter, which I think would give an error on a length mismatch.

JohnNichols
Valued Contributor III
1,783 Views

Volunteer programmers must be proficient in C, the language in which ftnchek is written.

Interesting statement on the FTNCHEK website -- thanks for letting me know about the program.

John

jimdempseyatthecove
Honored Contributor III
1,784 Views

Andrew's suggestion will make the error message go away...
However, it may also introduce a hidden error in your program should the remainder of the program require name be 4 characters in length.

Be careful of making changes to code when you do not fully comprehend the full implications of such change.

Jim Dempsey

JohnNichols
Valued Contributor III
1,783 Views

https://www.youtube.com/watch?v=eWQIryll8y8

Shows an excellent example of what Jim is talking about

andrew_4619
Honored Contributor II
1,784 Views

jimdempseyatthecove wrote:
Andrew's suggestion will make the error message go away...

However, it may also introduce a hidden error in your program should the remainder of the program require name be 4 characters in length.

Be careful of making changes to code when you do not fully comprehend the full implications of such change.

Jim Dempsey

Indeed, the need to analyse the consequences of any change goes without saying. Matching character declaration lengths to an actual hard-coded string is rather error prone, IMO. "How many characters are there in this sentence?" Dynamic allocation, with later use of len(variable_name) if required, makes life simpler; otherwise you are left to guess a length that is much longer than you really need.
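
As a minimal sketch of the allocate-on-assignment approach described above, assuming a compiler that supports Fortran 2003 deferred-length character variables (some older compiler versions may need an option to enable reallocation on assignment); the program and variable names are only illustrative:

```fortran
program deferred_length_demo
  implicit none
  ! Deferred-length allocatable character (Fortran 2003 and later).
  character(len=:), allocatable :: name

  name = "DAVID"               ! allocated to length 5 on assignment
  print *, name, len(name)

  name = "How many characters are there in this sentence?"
  print *, len(name)           ! the length now follows the new value
end program deferred_length_demo
```

Nothing is truncated here, but any later code that assumes name is exactly 4 characters long would need to be reviewed, which is the caveat raised earlier in the thread.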

dboggs
New Contributor I
1,784 Views

I used to be plagued by this a lot, and eventually learned something that others have not really pointed out. The real danger of this behavior (which I agree is a "feature" of the language) is not just that the name variable will be truncated. In fact that result should be readily apparent and not difficult to debug. The REAL problem is that, by assigning something longer than the allocated storage space, you are inadvertently overwriting a storage cell that may belong to something else. That "something else" is often some unused memory and is harmless, but it can easily belong to something important, which is then corrupted. That corrupted something can cause bizarre, unexpected, and unpredictable behavior that is extremely difficult to debug. All part of the joy of programming in Fortran.

FortranFan
Honored Contributor II
1,784 Views

dboggs wrote:

I used to be plagued by this a lot, and eventually learned something that others have not really pointed out. The real danger of this behavior (which I agree is a "feature" of the language) is not just that the name variable will be truncated. .. The REAL problem is that, by assigning something longer than the allocated storage space, you are inadvertently overwriting a storage cell that may belong to something else. That "something else" is often some unused memory and is harmless, but it can easily belong to something important, which is then corrupted. That corrupted something can cause bizarre, unexpected, and unpredictable behavior that is extremely difficult to debug. All part of the joy of programming in Fortran.

All such problems for a simple assignment involving a truncated CHARACTER expression, as shown in the original post!?  I find that very hard to believe, it must be for some other, more complex situations or non-standard code or deprecated coding practices.

mecej4
Honored Contributor III
1,784 Views

The problems that DBoggs describes may happen in general (I suppose the infamous buffer overflow of C covers such things), but for the specific case of #1, in which a character variable of length 4 is assigned a string of length 5, there can be no clobbering of adjacent memory. The Fortran standard specifies that when a character variable is set equal to a character expression, the latter is truncated or padded with blanks to match the length of the variable. Please see 7.2.1.3, numbered item 10.

10 For an intrinsic assignment statement where the variable is of type character, the expr may have a different character length parameter in which case the conversion of expr to the length of the variable is as follows. (1) If the length of the variable is less than that of expr, the value of expr is truncated from the right until it is the same length as the variable. (2) If the length of the variable is greater than that of expr, the value of expr is extended on the right with blanks until it is the same length as the variable.

If the variable is allocatable, etc., rather than a simple character variable of known length, I suppose bad things can happen. 
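
The truncation and padding rule quoted from the standard can be seen in a small sketch like the following; the variable names are made up for illustration:

```fortran
program truncate_pad_demo
  implicit none
  character(len=4) :: short_name
  character(len=8) :: long_name

  short_name = "DAVID"   ! expr longer than the variable: truncated on the right
  long_name  = "DAVID"   ! expr shorter than the variable: blank-padded on the right

  print '(3a)', "[", short_name, "]"   ! prints [DAVI]
  print '(3a)', "[", long_name, "]"    ! prints [DAVID   ]
end program truncate_pad_demo
```

In both cases only the storage of the variable itself is written; nothing next to it is touched.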

jimdempseyatthecove
Honored Contributor III
1,784 Views

>>The REAL problem is that, by assigning something longer than the allocated storage space, you are inadvertently overwriting a storage cell that may belong to something else.

Won't happen. The string gets truncated (or space padded in the event the input string is shorter than the output string).

Jim Dempsey

DavidWhite
Valued Contributor II
1,784 Views

If GFortran and Silverfrost can issue a warning when attempting to assign an oversized string, I see no reason why IVF cannot do the same.

In my code that triggered this, I use a code template for testing a large number of subroutine calls.  The maximum length I gave to the string variable was quite reasonable; what I did not catch was that the code I am linking with has subroutine names of increasing length, and so I got caught out.

It would be great if I could be sure that such errors would be caught at compile time in future.

David
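
Until a compile-time warning is available, one possible stopgap is a run-time check. The sketch below is only an illustration: the module checked_assign_mod and the helper assign_checked are hypothetical names, not part of any compiler or library, and the check happens at run time rather than at compile time.

```fortran
module checked_assign_mod
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
contains
  ! Hypothetical helper: assigns src to dest and reports lost characters.
  subroutine assign_checked(dest, src)
    character(len=*), intent(out) :: dest
    character(len=*), intent(in)  :: src
    ! Trailing blanks in src are ignored, since losing them is harmless.
    if (len_trim(src) > len(dest)) then
      write (error_unit, '(5a)') 'warning: "', src, '" truncated to "', &
                                 src(1:len(dest)), '"'
    end if
    dest = src   ! ordinary intrinsic assignment: truncates or pads
  end subroutine assign_checked
end module checked_assign_mod

program checked_assign_demo
  use checked_assign_mod
  implicit none
  character(len=4) :: name
  call assign_checked(name, "DAVID")   ! reports the truncation on error_unit
  print '(a)', name
end program checked_assign_demo
```

Because the helper lives in a module, the compiler sees an explicit interface and passes the actual lengths of both arguments.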

TimP
Honored Contributor III
1,784 Views

We have run into bugs where a longer string was assigned in another procedure.  This will be caught by turning on interface checks.  In the actual case it caused no problem for months until inlining caused it to overwrite another variable.

Steven_L_Intel1
Employee
1,784 Views

I will add this to the list of "usage warnings" that have been suggested.

dboggs
New Contributor I
1,784 Views

 

>>The REAL problem is that, by assigning something longer than the allocated storage space, you are inadvertently overwriting a storage cell that may belong to something else.

Won't happen. The string gets truncated (or space padded in the event the input string is shorter than the output string).

I made that statement a little too hastily, and it wasn't quite correct. Yes, it won't happen exactly as I described. What I was referring to (along with some memory struggle!) was one or more related activities, such as EQUIVALENCE or COMMON trickery, or (more likely) an internal write. I think that severe and hard-to-debug trouble can occur if a long character string is accidentally written (via internal write) to a character variable of shorter declared length.

It would be good to run a simple test to determine if this is in fact true, but someone here probably already knows?

CHARACTER(3) :: cthree
CHARACTER(5) :: cfive
INTEGER :: a, b, c
COMMON cthree, a, b, c ! Just to ensure that a is stored immediately after cthree in memory
WRITE (cthree, '(A)') cfive
! Variable a will now be corrupt?
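
A self-contained way to run this kind of test, without COMMON, is sketched below. It assumes the compiler reports the overflow of the internal record through IOSTAT; the exact nonzero value is processor dependent, so the code only checks whether it is zero. The names are illustrative.

```fortran
program internal_write_demo
  implicit none
  character(len=3) :: cthree
  character(len=5) :: cfive
  integer :: ios

  cthree = "abc"
  cfive  = "12345"

  ! The A edit descriptor without a width uses the length of the item (5),
  ! which is longer than the 3-character internal record.
  write (cthree, '(A)', iostat=ios) cfive

  if (ios /= 0) then
    ! After an I/O error the contents of cthree are undefined, so do not use them.
    print *, "internal write failed, iostat =", ios
  else
    print *, "internal write succeeded: [", cthree, "]"
  end if
end program internal_write_demo
```

Without IOSTAT= (or ERR=), an I/O error in this statement would normally terminate the program with a run-time message instead of continuing.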

GVautier
New Contributor II
1,784 Views
Hello. Nothing bad will happen except an I/O error, because the write result is too long for the character variable. That's all. The only problematic thing about strings, although it is not really related to this topic, is declaring a character dummy argument with a fixed length in a subroutine and passing an actual argument of shorter length. Example:
subroutine test(string)
character*50 string
string=""
end subroutine

character*20 string20
call test(string20)
The only solution I found is to replace the fixed-length declaration of the character argument with a character*(*) declaration.
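
A minimal sketch of that fix, written with the equivalent character(len=*) spelling and with the subroutine placed in a module so that the caller has an explicit interface (the module name string_utils is made up for illustration):

```fortran
module string_utils
  implicit none
contains
  subroutine test(string)
    ! Assumed length: the length is taken from the actual argument.
    character(len=*), intent(out) :: string
    string = ""          ! blanks exactly len(string) characters, no more
  end subroutine test
end module string_utils

program assumed_length_demo
  use string_utils
  implicit none
  character(len=20) :: string20
  call test(string20)
  print *, len(string20), len_trim(string20)
end program assumed_length_demo
```

With the assumed-length declaration, the subroutine can never write beyond the 20 characters that the caller actually owns.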
andrew_4619
Honored Contributor II
1,784 Views

The example in #17  would give compiler error >> "error #7938: Character length argument mismatch."

mecej4
Honored Contributor III
1,784 Views

andrew_4619 wrote:

The example in #17  would give compiler error >> "error #7938: Character length argument mismatch."

Only if both subprograms are in the same file, or a compiler option is specified that checks for such mismatches, or if extra code is generated to do similar checks at run time.

In a release mode compilation, the .OBJ file for the subroutine does not use the second argument on the stack, which is the hidden (in Fortran source code) string length argument. Instead, it uses the incorrectly declared length, 50. If the length of the string argument is re-declared as (*), the length is taken from the hidden argument on the stack.

One can check the disastrous effects of such errors with the code of #17. I put the subroutine and calling program into separate files. The resulting EXE hangs when run, and I had to abort it with Ctrl+C. Worse things can happen with bigger programs, which is why it is important to ensure that code does not contain such interface mismatches.

andrew_4619
Honored Contributor II
1,784 Views

mecej4 wrote:
One can check the disastrous effects of such errors with the code of #17. I put the subroutine and calling program into separate files. The resulting EXE hangs when run.......

All you say is correct, however....

It wouldn't hang for me, as it would not build. I would never compile without interface checking**, and for that matter external routines have no place in my world. I guess if someone adopts less-than-ideal coding practices and also chooses not to use the options that find errors at compile time, then good luck to them!

** yes I do realise if you have inherited some ancient/non-standard Fortran that might not be an option in the first instance (been there got the T shirt) ......

 

LRaim
New Contributor I
1,500 Views

This discussion seems a bit absurd.
If I have a char*80 string VTEXT and want to analyze the first 8 chars, why should I not write:
character*8 code
code = vtext
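
For comparison, here is a small sketch of the same intent written with an explicit substring. The sample text is invented; both assignments give the same result, but the second one states the truncation in the source, so a compiler that warned about truncating assignments would presumably have nothing to flag.

```fortran
program explicit_substring_demo
  implicit none
  character(len=80) :: vtext
  character(len=8)  :: code

  vtext = "ABCDEFGHIJKLMNOP rest of the record"

  code = vtext          ! implicit truncation, as in the post above
  print '(a)', code     ! prints ABCDEFGH

  code = vtext(1:8)     ! same result, with the truncation made explicit
  print '(a)', code     ! prints ABCDEFGH
end program explicit_substring_demo
```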

 

 

 

 
