How to read numbers from a string

haijunwu · ‎11-16-2011

Hi, I ran into a problem reading a file.

Suppose the following string is what I want to read

"The first is an integer:100 and the second is a real 200.00"

What I am sure of is that the first number is an integer and the second number is a real, and that both are randomly placed in the string.

Thanks in advance!

bmchenry · ‎11-16-2011

Basically you read and parse...
Read the line in as a string.
Then parse the string looking for the digits 0-9 and any period '.' (or comma ',') within the collection of digits.
Once found,
write the numbers to a separate string (or keep track of where in the string they are located).
Read until a space or something other than 0-9, . (or ,) is encountered.
If a period (comma) is in the collection of digits (or simply after it), it is a real number.
If there is no period (comma), it is an integer.

Then 'read' the string as either a real or integer number

move on to the next line!
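
The scan described above can be sketched roughly as follows. This is only an illustration, not code from any of the posters: the module name number_scan and the routine next_number are made up for this example, and it deliberately handles only unsigned digits with at most one period (no sign, exponent or comma).

[fxfortran]
module number_scan
  implicit none
  private
  public :: next_number   ! hypothetical helper name, for illustration only
contains
  ! Find the next run of digits (with at most one period) in text,
  ! starting the search at position pos. On return, pos points just
  ! past the run, or past the end of text if nothing was found.
  subroutine next_number(text, pos, token, is_real, found)
    character(len=*), intent(in)    :: text
    integer,          intent(inout) :: pos
    character(len=*), intent(out)   :: token
    logical,          intent(out)   :: is_real, found
    integer :: first, last, offset

    token   = ' '
    is_real = .false.
    found   = .false.
    if (pos < 1 .or. pos > len(text)) return

    ! scan() gives the offset of the first digit in text(pos:), or 0
    offset = scan(text(pos:), '0123456789')
    if (offset == 0) then
      pos = len(text) + 1
      return
    end if
    first = pos + offset - 1

    ! Extend the run over further digits, allowing a single period
    last = first
    do while (last < len(text))
      if (index('0123456789', text(last+1:last+1)) > 0) then
        last = last + 1
      else if (text(last+1:last+1) == '.' .and. .not. is_real) then
        is_real = .true.
        last = last + 1
      else
        exit
      end if
    end do

    token = text(first:last)
    found = .true.
    pos   = last + 1
  end subroutine next_number
end module number_scan
[/fxfortran]

Called repeatedly with pos starting at 1, this should return the token 100 (integer) and then 200.00 (real) for the example string in the question. Each token can then be converted with an internal READ such as read(token, *, iostat=ios) ival.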

JVanB · ‎11-16-2011

Don't forget the leading sign on -100 and the sign of the exponent in +6.626e-34.
Check for trailing junk: 100 .01 Get 100's of prescription meds COD!
The first number is 47.100 is the second number.
Also 600 700 looks like valid input.
Also 600 7e2 should be OK.
Do you allow 600 7.0d2?
Or even 600 7.0q2?
Try 600 700.00.

onkelhotte · ‎11-16-2011

We use this function for converting a string into a real. The number is also removed from the string. When you read an integer, just perform

[bash]myInt = int(charzahl(string))[/bash]

I know it's bad code, a former colleague wrote this years ago...

Markus

[bash]! **********************************************************************
      real(kind=4) function CHARZAHL (TEXT)
! **********************************************************************

      CHARACTER*(*) TEXT,T*1,tt*30
      real(kind=4)  x
      integer(kind=4) l, i

      L=LEN(TEXT)
1     if(TEXT(:1).GE.'-'.AND.TEXT(:1).LE.'9'.AND.TEXT(:1).NE.'/')GOTO 2
      TEXT=TEXT(2:)
      L=L-1
      if(L.GT.0) GOTO 1
      CHARZAHL=0.
      RETURN

2     I=1
3     I=I+1
      if(I.GT.L) GOTO 4
      T=TEXT(I:I)
      if(T.EQ.'.'.OR.T.GE.'0'.AND.T.LE.'9'.or.t.eq.'e') GOTO 3

4     I=I-1
      tt=text(:i)
      READ (tt,*,err=5) x
      CHARZAHL=x

      if(I.LT.L)then
         TEXT=TEXT(I+1:)
      ELSE
         TEXT=' '
      ENDIF

      RETURN

5     charzahl=0.
      return    
      END function CHARZAHL
[/bash]

mecej4 · ‎11-16-2011

There are at least two problems with this code.

1. It will not read integers with more than 24 significant bits correctly. Try this program:

[fxfortran]      program tst
        character(len=10) :: str='1234567890'
        integer :: i
        real charzahl
        i = int(charzahl(str))
        write(*,*)' i = ',i
        stop
      end program tst
[/fxfortran]

2. It returns 0.0 when the input number is 0.0 OR when an error occurs. Therefore, if the input may contain 0.0, an input error will go undetected and the results will be erroneous.
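
One way to avoid both problems, sketched here under my own assumptions rather than taken from the posted code, is to read integers directly into an integer variable and to report success through a separate logical argument. The names checked_read, read_int, read_real and dp are invented for this example. The verify() pre-check is there because a list-directed READ treats blanks, commas and slashes as separators and would otherwise accept some malformed tokens silently.

[fxfortran]
module checked_read
  implicit none
  private
  public :: read_int, read_real
  ! Double precision kind, so more than 24 significant bits survive
  integer, parameter, public :: dp = selected_real_kind(15, 307)
contains
  ! Convert a token holding exactly one integer.
  ! ok is .false. on any failure, so a genuine 0 is distinguishable.
  subroutine read_int(token, value, ok)
    character(len=*), intent(in)  :: token
    integer,          intent(out) :: value
    logical,          intent(out) :: ok
    integer :: ios
    value = 0
    ok = .false.
    if (len_trim(token) == 0) return
    ! Reject anything other than sign and digits (also embedded blanks)
    if (verify(trim(adjustl(token)), '+-0123456789') /= 0) return
    read(token, *, iostat=ios) value
    ok = (ios == 0)
    if (.not. ok) value = 0
  end subroutine read_int

  ! Same idea for a real; accepts e/E/d/D exponent letters.
  subroutine read_real(token, value, ok)
    character(len=*), intent(in)  :: token
    real(dp),         intent(out) :: value
    logical,          intent(out) :: ok
    integer :: ios
    value = 0.0_dp
    ok = .false.
    if (len_trim(token) == 0) return
    if (verify(trim(adjustl(token)), '+-.0123456789eEdD') /= 0) return
    read(token, *, iostat=ios) value
    ok = (ios == 0)
    if (.not. ok) value = 0.0_dp
  end subroutine read_real
end module checked_read
[/fxfortran]

With this, call read_int('1234567890', i, ok) keeps all ten digits in a default 32-bit integer, and the caller tests ok instead of comparing the result with 0.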

onkelhotte · ‎11-17-2011

Hi mecej4,

Thanks for the advice about large integer values. This error wasn't noticed in our department for nearly 2 decades :-)

The behaviour that the function returns 0.0 when there is an error, when the string has no number, or when the string is actually 0.0 is intended. But do you have an idea how to detect an error? Maybe using HUGE for an error?

Markus

[bash]! **********************************************************************
      real(kind=4) function stringToReal (TEXT)
! **********************************************************************

      CHARACTER*(*) TEXT,T*1,tt*255
      real(kind=4)  x
      integer(kind=4) l, i

      L=LEN(TEXT)
1     if(TEXT(:1).GE.'-'.AND.TEXT(:1).LE.'9'.AND.TEXT(:1).NE.'/')GOTO 2
      TEXT=TEXT(2:)
      L=L-1
      if(L.GT.0) GOTO 1
      stringToReal=0.
      RETURN

2     I=1
3     I=I+1
      if(I.GT.L) GOTO 4
      T=TEXT(I:I)
      if(T.EQ.'.'.OR.T.GE.'0'.AND.T.LE.'9'.or.t.eq.'e') GOTO 3

4     I=I-1
      tt=text(:i)
      READ (tt,*,err=5) x
      stringToReal=x

      if(I.LT.L)then
         TEXT=TEXT(I+1:)
      ELSE
         TEXT=' '
      ENDIF

      RETURN

5     stringToReal=0.
      return    
      END function stringToReal

! **********************************************************************
      integer(kind=4) function stringToInt (TEXT)
! **********************************************************************

      CHARACTER*(*) TEXT,T*1,tt*255
      integer(kind=4)  x
      integer(kind=4) l, i

      L=LEN(TEXT)
1     if(TEXT(:1).GE.'-'.AND.TEXT(:1).LE.'9'.AND.TEXT(:1).NE.'/')GOTO 2
      TEXT=TEXT(2:)
      L=L-1
      if(L.GT.0) GOTO 1
      stringToInt=0
      RETURN

2     I=1
3     I=I+1
      if(I.GT.L) GOTO 4
      T=TEXT(I:I)
      if(T.GE.'0'.AND.T.LE.'9') GOTO 3

4     I=I-1
      tt=text(:i)
      READ (tt,*,err=5) x
      stringToInt=x

      if(I.LT.L)then
         TEXT=TEXT(I+1:)
      ELSE
         TEXT=' '
      ENDIF

      RETURN

5     stringToInt=0
      return    
      END function stringToInt[/bash]

mecej4 · ‎11-17-2011

>Thanks for the advice about large integer values. This error wasn't noticed in our department for nearly 2 decades :-)

Well, it is likely that in the applications that it was designed for and in which it was frequently used, integers larger than that limit were never encountered (or were encountered but the consequent error was not noticed!) -- somewhat like using two-digit years and the Y2K problem.

I am curious to know if this function was used extensively on computers that used different floating point representations than IEEE-32 bit (e.g., IBM mainframe, VAX, etc.)

>The behaviour that the function returns 0.0 when there is an error, when the string has no number, or when the string is actually 0.0 is intended. But do you have an idea how to detect an error? Maybe using HUGE for an error?

If returning 0.0 when an error occurs is intended and acceptable, all that you have to do is to add a suitable comment to the source code, especially when you give it to others.

A better solution would be to set the result to NaN (or, if you prefer, HUGE), provided this is again advertised as a comment in the code and the user can check for such values. Otherwise, an error message should be printed, in a fashion similar to assertions in C/C++ code.
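
A minimal sketch of the NaN approach, assuming a Fortran 2003 compiler that provides the intrinsic module ieee_arithmetic and supports quiet NaNs for default real (ieee_support_nan can be used to check). The module and function names nan_result and string_to_real_or_nan are made up for this illustration and are not part of the code posted earlier.

[fxfortran]
module nan_result
  use, intrinsic :: ieee_arithmetic
  implicit none
  private
  public :: string_to_real_or_nan   ! hypothetical name
contains
  ! Return the value held in token, or a quiet NaN if the token
  ! is blank or is not a well-formed number.
  function string_to_real_or_nan(token) result(x)
    character(len=*), intent(in) :: token
    real    :: x
    integer :: ios

    x   = 0.0
    ios = 1            ! assume failure until the READ succeeds
    if (len_trim(token) > 0) then
      ! Only sign, digits, period and exponent letters are allowed
      if (verify(trim(adjustl(token)), '+-.0123456789eEdD') == 0) then
        read(token, *, iostat=ios) x
      end if
    end if
    if (ios /= 0) x = ieee_value(x, ieee_quiet_nan)
  end function string_to_real_or_nan
end module nan_result
[/fxfortran]

The caller then tests the result with ieee_is_nan(x) rather than with ==, since a NaN never compares equal to anything, including itself.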

It is probably feasible to run tests for at least the integer conversion to see if the correct output is given for all possible inputs.

Please note that kind numbers, as in 'kind=4' are not portable.
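
For illustration, kind parameters can be requested by the precision and range needed instead of by number. The module name portable_kinds and the parameter names sp, dp and i32 below are my own choices for this sketch, not anything defined in the code posted earlier.

[fxfortran]
module portable_kinds
  implicit none
  ! At least 6 decimal digits, exponent range 37 (typically IEEE single)
  integer, parameter :: sp  = selected_real_kind(6, 37)
  ! At least 15 decimal digits, exponent range 307 (typically IEEE double)
  integer, parameter :: dp  = selected_real_kind(15, 307)
  ! Integers with at least 9 decimal digits (typically 32-bit)
  integer, parameter :: i32 = selected_int_kind(9)
end module portable_kinds
[/fxfortran]

A declaration such as real(kind=4) x would then become real(kind=sp) x after a use portable_kinds statement, and the compiler picks whatever kind number it uses for that precision.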

jimdempseyatthecove · ‎11-17-2011

I agree with mecej4's recommendation of returning NaN since this value is "sticky" and should draw attention to "no value".

However, this may cause problems unforeseen by us non-users of your application.
For instance, the input data may be arriving from an input text parameter file, where some user may have had:

123 Along with comment
456 more comments
0 Zero here
Text Here formerly returning 0.0
789 more data

Additionally, by previous convention, an input line with no data (blank line) may be interpreted as 0.0 by your input parser.

Note, if NaN is not appropriate, you could consider returning -0.0

RetVal = TRANSFER(Z'80000000',RetVal)

Or return TINY or -TINY or HUGE or -HUGE as mecej4 also suggests

Therefore, you will have to check your input files as well as the code to see how it handles "comment text" embedded with the data.

Jim Dempsey
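
A possible standard-conforming way to produce and detect the -0.0 marker, sketched under the assumption that the compiler provides the Fortran 2003 intrinsic module ieee_arithmetic. The names minus_zero_flag, no_value and is_no_value are invented for this example. ieee_class is used for the test because -0.0 compares equal to 0.0, and because some compilers need an option before SIGN honours a negative zero.

[fxfortran]
module minus_zero_flag
  use, intrinsic :: ieee_arithmetic
  implicit none
  private
  public :: no_value, is_no_value   ! hypothetical names
contains
  ! Return negative zero, used here as a "no value" marker.
  function no_value() result(x)
    real :: x
    x = 0.0
    x = ieee_value(x, ieee_negative_zero)
  end function no_value

  ! True only for negative zero, not for ordinary +0.0.
  logical function is_no_value(x)
    real, intent(in) :: x
    is_no_value = (ieee_class(x) == ieee_negative_zero)
  end function is_no_value
end module minus_zero_flag
[/fxfortran]

Unlike NaN, this marker is not sticky: arithmetic on the result (for example adding +0.0 to it) can silently turn it back into an ordinary zero, so it has to be tested immediately after the conversion.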

nvaneck · ‎11-17-2011

This one works....

REAL(8) FUNCTION DCONV(STRING,IERR)
   IMPLICIT NONE
   CHARACTER(LEN=*) STRING
   LOGICAL IERR,NEG,IFDEC
   INTEGER(4) I,L,J,NDEC,IEXP,ICONV
   INTEGER(4),PARAMETER :: PERIOD=-2,MINUS=-3,PLUS=-5,BLANK=-16

   DCONV=0D0
   NEG = .FALSE.
   IERR = .FALSE.
   IFDEC = .FALSE.
   L=LEN_TRIM(STRING)

   DO I=1,L
      J=ICHAR(STRING(I:I))-48
      IF (J .GE. 0 .AND. J .LT. 10) THEN
         DCONV=DCONV*10D0+J
         IF (IFDEC) NDEC=NDEC+1
      ELSEIF (J.EQ.MINUS .AND. DCONV .EQ. 0D0 .AND. .NOT. IFDEC .AND. I .NE. L) THEN
         NEG = .TRUE.
      ELSEIF (J .EQ. PERIOD) THEN
         IF (IFDEC .OR. L .EQ. 1) GO TO 40
         IFDEC=.TRUE.
         NDEC = 0
         IF (DCONV .EQ. 0D0 .AND. I .EQ. L) THEN
            IF (L.EQ.1) GO TO 40
            IF (STRING(I-1:I-1) .NE. '0') GO TO 40
         ENDIF
      ELSEIF (STRING(I:I) .EQ. 'E' .OR. STRING(I:I) .EQ. 'D') THEN
         IF (I .EQ. L) GO TO 40
         IEXP=ICONV(STRING(I+1:L),IERR)
         IF (IERR) GO TO 40
         DCONV = DCONV*(10D0**IEXP)
         EXIT
      ELSEIF (I .EQ. L .OR. DCONV .GT. 0D0 .OR. (J .NE. BLANK .AND. J .NE. PLUS)) THEN
         GO TO 40
      ENDIF
   END DO

   IF (NEG) DCONV = -DCONV
   IF (IFDEC) DCONV=DCONV/(10D0**NDEC)
   RETURN

40 IERR = .TRUE.
50 DCONV=1600000000D0
END