Solved: Problem reading real numbers in scientific notation

Akindele__Tunde · ‎03-08-2022

I am reading comma-separated real numbers from a file into an array. Some of the numbers are in ordinary decimal notation and some are in scientific notation, as shown below:

0.000372526,8.61833E-06,0.000422635,0.000445738,0.000445062,0.000276,9.74E-05,0.00022,8.63708E-06

The problem is that the numbers in scientific notation were read wrongly. 8.61833E-06 was read as 6.1833E-07, while 9.74E-05 and 8.63708E-06 were read as 7.4000E-06 and 6.3708E-07 respectively.

To confirm what was going on, I duplicated the data as follows in the file:

0.000372526,8.61833E-06,0.000422635,0.000445738,0.000445062,0.000276,9.74E-05,0.00022,8.63708E-06
0.000372526,0.861833E-05,0.000422635,0.000445738,0.000445062,0.000276,0.974E-04,0.00022,0.863708E-05
0.000372526,0.00000861833,0.000422635,0.000445738,0.000445062,0.000276,0.0000974,0.00022,0.00000863708

In the second line, I rewrote the numbers in scientific notation with a leading zero before the decimal point (for example, 0.861833E-05), while in the third line I avoided scientific notation altogether. The last two lines were read correctly.

The small program that I wrote for testing is as follows:

program Test

implicit none

INTEGER, PARAMETER:: COUNT=9
INTEGER::ECODE, I
CHARACTER(LEN=80):: MSG
CHARACTER(LEN=1):: COMMA
REAL,DIMENSION(COUNT):: FIRST,SECOND,THIRD

OPEN(UNIT=10,FILE='TestData.txt',STATUS='OLD')
READ(10,'(<COUNT>(F,A1))')(FIRST(I),COMMA,I=1,COUNT)
READ(10,'(<COUNT>(F,A1))')(SECOND(I),COMMA,I=1,COUNT)
READ(10,'(<COUNT>(F,A1))')(THIRD(I),COMMA,I=1,COUNT)

WRITE(*,'(5X,A5,11X,A6,10X,A5)') 'FIRST', 'SECOND', 'THIRD'
DO I=1,COUNT
WRITE(*,'(I2,A1,2X,F12.10,4X,F12.10,4X,F12.10)') I, ':', FIRST(I), SECOND(I), THIRD(I)
ENDDO

CLOSE(10)

end program Test

Using the "E" or "G" format in the read statement produced the same results.

The results from the program are listed below:

     FIRST           SECOND          THIRD
 1:  0.0003725260    0.0003725260    0.0003725260
 2:  0.0000006183    0.0000086183    0.0000086183
 3:  0.0004226350    0.0004226350    0.0004226350
 4:  0.0004457380    0.0004457380    0.0004457380
 5:  0.0004450620    0.0004450620    0.0004450620
 6:  0.0002760000    0.0002760000    0.0002760000
 7:  0.0000074000    0.0000974000    0.0000974000
 8:  0.0002200000    0.0002200000    0.0002200000
 9:  0.0000006371    0.0000086371    0.0000086371

How can I get the data in the first line to be read correctly?

mecej4 · ‎03-08-2022

Do you understand the implications of using the format descriptor F rather than Fw.d? The default field width w when you do not specify it is, in Intel Fortran, 12 characters. Thus, from the input line

0.000372526,8.61833E-06
.........1.........2

the first 12 characters, i.e., "0.000372526,", are used to match the F, and the next character, "8", matches the A1 format descriptor. That leaves ".61833E-06" for matching the next repetition of the format group, which is why 8.61833E-06 was read as 6.1833E-07.
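
To make the split concrete, here is a small sketch that only slices the sample text at the column boundaries described above. It takes the 12-character width from this post as given and performs no formatted read; the program name and layout are illustrative only.

```fortran
program field_split
   implicit none
   ! Start of the first data line from the question
   character(len=*), parameter :: line = '0.000372526,8.61833E-06'

   ! Columns 1-12 go to the F descriptor (assumed width of 12)
   print '(3A)', 'F  field: "', line(1:12), '"'
   ! Column 13 goes to the A1 descriptor
   print '(3A)', 'A1 field: "', line(13:13), '"'
   ! Whatever follows is what the next F descriptor sees
   print '(3A)', 'leftover: "', line(14:), '"'
end program field_split
```

The leading digit of the second number ends up in the A1 field, so the next F only sees the rest of it.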

Instead of using these complicated features, why not use a feature of the language that enables reading this kind of data with much less trouble -- list-directed input:

read(10,*)FIRST
read(10,*)SECOND
read(10,*)THIRD
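
As a rough, self-contained sketch of the same idea, the program below reads the first data line with list-directed input. To avoid depending on TestData.txt, it reads from a character string (an internal file) instead of unit 10; that substitution, the program name, and the IOSTAT check are additions for illustration, not part of the original test program.

```fortran
program read_csv_reals
   implicit none
   integer, parameter :: n = 9          ! values per line, as in the test program
   real :: first(n)
   integer :: ios, i
   ! First line of the sample data, kept in a string so no input file is needed
   character(len=*), parameter :: line = &
      '0.000372526,8.61833E-06,0.000422635,0.000445738,0.000445062,' // &
      '0.000276,9.74E-05,0.00022,8.63708E-06'

   ! List-directed read: commas and blanks separate the values, no widths needed
   read(line, *, iostat=ios) first
   if (ios /= 0) then
      print *, 'read failed, iostat = ', ios
      stop 1
   end if

   ! Print with an explicit format so the exponents are easy to compare
   do i = 1, n
      write(*, '(I2,A1,2X,ES14.6)') i, ':', first(i)
   end do
end program read_csv_reals
```

With a real file, the read statement would be read(10,*) FIRST exactly as shown above; the separator handling is the same.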

Steve_Lionel · ‎03-08-2022

Do not use F, E, D or G without widths. This is an extension, and you are at the mercy of the implementation for the width it uses. There's an additional extension at work, called "short field termination", where the comma shortens the field width. This causes further problems.

I agree with @mecej4 that list-directed input is what you want here. (Just be aware that for output, list-directed formatting cedes control of the format to the implementation, and this may result in formatting you don't want.)
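
A minimal sketch of the output caveat, with an illustrative value taken from the sample data: the first WRITE leaves width and digit count to the compiler, while the second fixes them with an explicit edit descriptor. The choice of ES14.6 is only an example.

```fortran
program output_formats
   implicit none
   real :: x

   x = 8.61833E-06

   ! List-directed output: the implementation picks the width and digits
   write(*, *) x

   ! Explicit edit descriptor: the layout is under the program's control
   write(*, '(ES14.6)') x
end program output_formats
```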

JohnNichols · ‎03-08-2022

Can I suggest that there are a lot of excellent books on Fortran? You will learn a lot of tricks such as the one shown above, which will reduce your frustration when writing code.

We have all been there; we have all learnt.

mecej4 · ‎03-09-2022

Short-field-termination is denoted as an extension in the Intel Fortran documentation (sixth paragraph under "Description").

Is it possible to build an EXE/a.out from Fortran source using the Intel Fortran compiler without this extension?

I tried the /standard-semantics compiler option, but that had no effect on the behavior of the test program and data that were presented in this thread.

Steve_Lionel · ‎03-09-2022

No, you can't turn off short field termination. The alternative behavior would be a run-time error.