I don't know this in advance, so I read these data as characters and add a decimal dot before an internal read.
I.e. after reading in I see what I get and then treat each number as real further on,
while I would like to make the distinction between integer and real.
The data processing must be extremely fast, so I cannot split my whole program into parts like
"if number = integer then process large parts of the program this way..."
"if number = real then process large parts of the program that way..."
Without these "if"s I want to process these numbers in the correct way.
Is there a (binary) datatype or pointer technique to solve this problem?
Something like:
type BothNum
  integer(4) :: IntNum
  real(4) :: RealNum
  integer(4), pointer :: PointNum ! pointing to the correct datatype as soon as I read and parsed a number
end type BothNum
Thanks, Clemens (davinci)
Clemens,
Do the "process large parts of the program this way" paths express the same algorithm?
That is to say, one path uses integers and the other uses reals, such that the results for input 1234 (taking the integer path) are the same as for input 1234. (taking the floating point path), but the integer path executes faster.
What is the largest integer?
What is the largest real?
What is the smallest real?
Do negative values occur in your input other than for termination?
Does your input contain 0 (or 0.0) other than for termination?
Can you make the integer/real determination once, then run two separate paths through your program (reducing all the "if number ==..." tests to one test for the life of the number)?
Jim Dempsey
Jim, thanks for your thoughts,
There is only one huge algorithm, and it has to process both integer(4) and real(4).
Both turn up unexpectedly on each input line and then have to be recognized while reading in and parsing the input.
Int(4) and real(4) only; both can be negative too, and there is no termination value.
The algorithm is very complex and will be in a further development stage for a while.
There is no way to determine the int/real situation for large parts of the program; the mixed solution serves the whole of it.
Copying the algorithm into 2 parts (int/real) is therefore difficult; the input is the problem.
Hence my thought to read in a uniform way (see the type construct in my previous suggestion, or some other binary form), then to parse the input, "point" to the correct datatype, and do the subsequent processing through this pointer mechanism.
It is the processing part after reading the data which is too complex to split into an integer and a real part.
OK, I know this question is something for Donald Knuth.
(I followed his lectures quite a while ago, but forgot to ask this question.)
It's an interesting one for the theme "Algorithms & Data Structures in Fortran".
Cheers Clemens (Davinci).
Clemens,
What is the range of the valid integer numbers?
Do 0 and 0.0 produce the same results (i.e. are they interchangeable)?
Except for 0 having the same binary pattern as 0.0 (for same-sized numbers, integer(4)/real(4) or integer(8)/real(8)), normal-range floating point numbers appear to have rather large absolute values when their bits are viewed as integers.
For REAL(4), anything that is not a "funny" number (NaN, infinities, denormalized, etc.) will, viewed as an INTEGER(4), have an absolute magnitude .ge. 2**23.
So, if the magnitudes of your integers stay below 2**23, a union of the INTEGER(4) and REAL(4) can be used in place of your three-variable structure. This reduces data size and improves cache hit ratios.
type compositeVar
  union
    map
      integer :: asINT4
    end map
    map
      real :: asREAL4
    end map
  end union
end type compositeVar
Make an inline logical function to determine whether the variable is integer or real:
logical function isReal(v)
  type(compositeVar) :: v
  isReal = (iand((v%asINT4+Z'00800000'), Z'FF000000') .ne. 0)
end function isReal
The above may be faster than isReal = (iabs(v%asINT4) .ge. Z'0800000'),
but you can try both ways.
Run rigorous tests to verify that your full input range does not exceed these limitations.
Then in your main code:
if(isReal(VAR)) then
  ! do real part using VAR%asREAL4
else
  ! do integer part using VAR%asINT4
endif
If 0 and 0.0 produce different results and need to be separated,
then do 0.0 and -0.0 produce the same results?
If so, when you read 0.0, set the integer part to Z'80000000'; this is a float -0.0.
Do not try to set VAR%asREAL4 = -0.0 as this will set it to 0.0.
You can also make inline conversion functions:
real function asReal(v)
  type(compositeVar) :: v
  if(isReal(v)) then
    asReal = v%asREAL4
  else
    asReal = REAL(v%asINT4)
  endif
end function asReal
and the other one for asINT4.
Jim Dempsey
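
UNION/MAP is a vendor extension, so for anyone who wants to try the idea in standard Fortran, here is a minimal sketch that uses TRANSFER instead. It assumes IEEE single precision, integers in the range -2**23 <= i < 2**23, and no denormalized reals. The names composite_bits, from_real, is_real_bits and as_real are made up for illustration. The range comparison should be equivalent to the add-then-mask test, written so that the addition cannot overflow.

```fortran
module composite_bits
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  ! Assumption: every integer input satisfies -LIMIT <= i < LIMIT.
  integer(int32), parameter :: LIMIT = 2_int32**23
contains
  ! Store a real by copying its bit pattern into a 32-bit integer.
  pure function from_real(x) result(bits)
    real(real32), intent(in) :: x
    integer(int32) :: bits
    bits = transfer(x, bits)
  end function from_real

  ! True when the pattern lies outside the range reserved for integers,
  ! which is where every normalized real (and -0.0) ends up.
  pure logical function is_real_bits(bits)
    integer(int32), intent(in) :: bits
    is_real_bits = (bits < -LIMIT) .or. (bits >= LIMIT)
  end function is_real_bits

  ! Numeric value as a real, whichever way the pattern was stored.
  pure function as_real(bits) result(x)
    integer(int32), intent(in) :: bits
    real(real32) :: x
    if (is_real_bits(bits)) then
      x = transfer(bits, x)
    else
      x = real(bits, real32)
    end if
  end function as_real
end module composite_bits

program demo_composite
  use, intrinsic :: iso_fortran_env, only: int32, real32
  use composite_bits
  implicit none
  integer(int32) :: a, b, c

  a = -10000_int32                  ! an integer input
  b = from_real(-10000.0_real32)    ! a real input with the same value
  c = from_real(0.5_real32)

  if (is_real_bits(a))              error stop 'integer taken for a real'
  if (.not. is_real_bits(b))        error stop 'real taken for an integer'
  if (.not. is_real_bits(c))        error stop 'real taken for an integer'
  if (as_real(a) /= as_real(b))     error stop 'values differ'
  ! 0.0 is the all-zero pattern, so it cannot be told apart from integer 0.
  if (is_real_bits(from_real(0.0_real32))) error stop 'unexpected'
  ! Setting only the sign bit gives the pattern of -0.0, which tests as real.
  if (.not. is_real_bits(ibset(0_int32, 31))) error stop 'minus zero'

  print *, 'integer -10000 stored, is real? ', is_real_bits(a)
  print *, 'real -10000.0 stored, is real?  ', is_real_bits(b)
end program demo_composite
```

As with the union version, a real 0.0 comes back as integer 0 unless it is stored as the -0.0 pattern.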
Clemens,
Here is an alternate suggestion; although it is unconventional, it will get the speed you want.
Take your current subroutine, e.g. FOO.F90, and copy it to FOO.INC.
Convert FOO.INC into a polymorphic source file whose identity is changed by use of the Fortran Preprocessor (FPP):
#define xxx yyy
and
#ifdef ...
With one define, the single source file compiles into the integer path.
With the second define, the single source file compiles into the REAL path.
You can now use FPP to create both versions:
subroutine foo(v)
  use mod_composite
  type(compositeVar) :: v
  if(isREAL(v)) then
    call fooREAL(v%asREAL4)
  else
    call fooINT(v%asINT4)
  endif
end subroutine foo
subroutine fooREAL(v)
  real(4) :: v
! the name of this second define (USE_REAL) is assumed
#define USE_REAL
#include "FOO.INC"
#undef USE_REAL
end subroutine fooREAL
subroutine fooINT(v)
  integer(4) :: v
#define USE_INT
#include "FOO.INC"
#undef USE_INT
end subroutine fooINT
Jim Dempsey
Why not look for a decimal point FIRST?
If all the numbers are whole numbers (whether stored as real or integer type) you can safely treat them as integers, which would be fastest. You can process the file once and either add or not add a decimal point, depending on whether they look like integers.
So one pass through the file would be sufficient. If they are mixed, some with a decimal point and some without, then it's safest to assume they are ALL real. If their magnitude is quite large you can use REAL (kind=8) or (kind=16).
Otherwise treat them as ALL integer.
Do we know the range? That would determine the KIND= part of the integer declaration.
I had a similar problem when I worked for JPL, but I just treated everything as REAL (kind=8).
It is VERY fast. I assume a 40 character input record.
[cpp]
subroutine real_or_int(rec)
  implicit none
  character (len=40) :: rec
  integer (kind=8)   :: inum
  real (kind=8)      :: xnum
101 format(I40)
102 format(G40.12)
  if (index(rec, '.') == 0) then
    ! here if NO decimal point
    read(rec,101) inum
    print *, inum
  else
    ! here if dec pt. found
    read(rec,102) xnum
    print *, xnum
  end if
end subroutine
[/cpp]
From my understanding of the original poster's request, the data file has a mix of integers and reals.
Depending on whether the input is an integer (sans . and sans maybe E+nn) or a real (with . and maybe E+nn), the program behaves entirely differently.
The user does not wish to write two separate subroutines (or collections of subroutines) as this makes maintenance difficult. The two code paths are similar but not the same (from my understanding). The user stated (per query from me) that different results occur when the input has an integer as opposed to a real of the same numeric value (e.g. 123 and 123.). At least that is what I asked, and different results is the response I think I got.
If the numbers through differing code paths do produce the same results, then the user may find better performance by making all input REAL(4) and doing away with the numerous flow control statements. His code might be able to pick up additional SSE instruction usage and make up the difference in speed and then some.
Jim Dempsey
If none of the quantities has a fractional part, the quickest conversion would be to INTEGER.
If the numbers appear in the file as BINARY (not as character data) then telling integers from floating point could be very difficult.
For example, take: Z'13422006'
Is that a very small floating point, or a very large integer?
The application that it's USED for would have to answer that question. If the quantity represents some MEASUREMENT, (i.e a physical dimension, atmospheric pressure, position coordinates) then it should be floating point.
Obviously, if it's given as character data and a D or E appears, it must be floating point.
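
To make the ambiguity concrete, here is a small sketch (assuming IEEE single precision; the program name is made up) that looks at the pattern Z'13422006' both ways. Nothing in the 32 bits says which reading was intended.

```fortran
program two_views
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  integer(int32) :: bits
  real(real32)   :: x

  bits = int(Z'13422006', int32)   ! the pattern from the post
  x = transfer(bits, x)            ! the same 32 bits read as a REAL(4)

  print *, 'as integer           :', bits
  print *, 'as real              :', x
  print *, 'biased exponent field:', ibits(bits, 23, 8)

  ! Read as an integer the pattern is a large value ...
  if (bits /= 323100678_int32) error stop 'unexpected integer value'
  ! ... and read as a real it is a normalized number with a small biased
  ! exponent (38, where 127 would correspond to 2**0), so a tiny positive value.
  if (ibits(bits, 23, 8) /= 38) error stop 'unexpected exponent field'
  if (.not. (x > 0.0_real32 .and. x < 1.0e-20_real32)) error stop 'unexpected real'
end program two_views
```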
It may help if you examine this document
http://steve.hollasch.net/cgindex/coding/ieeefloat.html
Except for 0.0 and -0.0, all non-"funny" floating point numbers, that is to say all that are not denormalized, have a non-zero exponent field. For REAL(4) there is an 8-bit exponent, just below (right of) the sign bit. Therefore
(IAND(iVal, Z'7F800000') .ne. 0) == REAL
(IAND(iVal, Z'7F800000') .eq. 0) == INTEGER .or. (real value of 0.0)
(IAND(iVal, Z'7F800000') .eq. 0) .and. (IAND(iVal, Z'007FFFFF') .ne. 0) == INTEGER
As explained earlier, if the path through your code with iValue=0 produces the same result as the path with rValue=0.0, then you can use the middle test above to disambiguate the numbers (and use integer 0 as a substitute for 0.0).
This test is very fast since IAND is intrinsic and will result in a single machine instruction.
The original post, and follow-ups, indicated that inputs of 123 and 123.0 produced different results (different code paths taken).
Please note: study the code _before_ you make an assumption that 123 and 123.0 (and by extension all integral REALs) are equivalent. 123 could be entered when it is known that the code path should not or will not produce a fractional part, whereas 123.0 could be entered when it is known that the code path may, or may be permitted to, produce a fractional part. It would be presumptuous to assume differently.
Jim Dempsey
And as explained earlier, the integers must lie within 23 bits of numeric range (with special considerations made for negative integer numbers).
Jim Dempsey
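
A short sketch of why negative integers need the special consideration, assuming IEEE single precision and two's complement integers. The helper names exponent_field and top_byte are made up. It checks the bare exponent-mask test and then the variant that adds 2**23 before looking at the upper bits.

```fortran
program exponent_field_demo
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  integer(int32), parameter :: BIAS = 2_int32**23
  integer(int32) :: r

  r = transfer(10000.0_real32, r)   ! bit pattern of a normalized real

  ! A non-negative integer below 2**23 has an all-zero exponent field,
  ! while a normalized real has a non-zero one.
  if (exponent_field(10000_int32) /= 0) error stop 'positive integer'
  if (exponent_field(r) == 0)           error stop 'normalized real'

  ! A negative integer has its high bits set, so the bare mask test
  ! would take it for a real (the pattern is that of a NaN).
  if (exponent_field(-1_int32) /= 255)  error stop 'negative integer'

  ! Adding 2**23 first maps -2**23 <= i < 2**23 onto 0 <= i < 2**24,
  ! after which the top eight bits are zero for every integer in range.
  if (top_byte(-1_int32 + BIAS) /= 0)     error stop 'biased -1'
  if (top_byte(-10000_int32 + BIAS) /= 0) error stop 'biased -10000'
  if (top_byte(r + BIAS) == 0)            error stop 'biased real'

  print *, 'exponent field of integer 10000:', exponent_field(10000_int32)
  print *, 'exponent field of integer -1   :', exponent_field(-1_int32)
  print *, 'exponent field of real 10000.0 :', exponent_field(r)
contains
  ! Bits 23..30: the exponent field of a REAL(4).
  pure integer function exponent_field(bits)
    integer(int32), intent(in) :: bits
    exponent_field = int(ibits(bits, 23, 8))
  end function exponent_field

  ! Bits 24..31, obtained with a logical right shift.
  pure integer function top_byte(bits)
    integer(int32), intent(in) :: bits
    top_byte = int(ishft(bits, -24))
  end function top_byte
end program exponent_field_demo
```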
If you're reading a text representation, there is no need to try to figure out, from the bits, whether there is a fractional part or not. (I had a customer ask about reading binary data and trying to figure this out - as others have covered, there is no guaranteed way to do this unless you have strict limits on the values.)
If I was doing this, I'd simply read the value with a Gsomething.0 format (where "something" is the width of the string being read) into a double precision real. I'd then ask the musical question (x == aint(x)), that is, whether the fractional part is zero. (The FRACTION intrinsic will not do for this; it returns the model mantissa, not the digits after the decimal point.) If this is true, then the value can be converted to integer (with INT) and manipulated that way, otherwise leave it as a real. You might want to also test exponent(x) to make sure it is not too large (>31?) for an integer(4).
On the other hand, this all seems more complicated than I think is warranted for the hoped-for performance gains, but I don't know the application.
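
A minimal sketch of this read-then-test approach, assuming a field width of 40 characters. The subroutine name classify and its arguments are made up for illustration. The whole-number test is written as x == aint(x), and the range check compares against huge() of the target integer kind rather than using exponent(x).

```fortran
program read_then_classify
  use, intrinsic :: iso_fortran_env, only: int32, real64
  implicit none
  logical        :: whole
  integer(int32) :: ival
  real(real64)   :: xval

  call classify('123', whole, ival, xval)
  if (.not. whole .or. ival /= 123) error stop '123'
  call classify('123.0', whole, ival, xval)        ! same answer as for '123'
  if (.not. whole .or. ival /= 123) error stop '123.0'
  call classify('-7.25', whole, ival, xval)
  if (whole) error stop '-7.25'
  call classify('3000000000', whole, ival, xval)   ! whole, but too big
  if (whole) error stop 'range'
  print *, 'all checks passed'
contains
  subroutine classify(rec, whole, ival, xval)
    character(len=*), intent(in)  :: rec
    logical,          intent(out) :: whole   ! .true. if usable as integer(4)
    integer(int32),   intent(out) :: ival    ! meaningful only when whole
    real(real64),     intent(out) :: xval
    character(len=40) :: buf
    integer :: ios

    whole = .false.
    ival  = 0
    xval  = 0.0_real64
    buf = rec
    buf = adjustr(buf)                       ! right-justify in the field
    read(buf, '(G40.0)', iostat=ios) xval
    if (ios /= 0) return                     ! not a number: leave defaults

    ! Zero fractional part, and small enough to fit an integer(4).
    whole = (xval == aint(xval)) .and. &
            (abs(xval) <= real(huge(ival), real64))
    if (whole) ival = int(xval, int32)
  end subroutine classify
end program read_then_classify
```

Note that '123' and '123.0' give the same answer here, so this suits the case where whole-valued reals may be treated as integers, not the case where the two spellings must take different paths.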
My first impression was that Clemens wanted to:
Read First_Number
If (First_Number == INTEGER) then
  read all other numbers as INTEGER
  call Integer_Process_Path
else
  read all other numbers as REAL
  call Real_Process_Path
endif
without duplicating the memory space occupied by the data (hence his equivalence type).
But other replies suggested that the data was mixed integer and real, in which case the problem is more complex and the test has to be done for each number read.
Perhaps Clemens could indicate what he wants to do in a pseudocode style as above?
Les
All,
thanks for your fast responses and apologies for my very late reaction; I had to be offline for a while.
To formulate my question more generally:
Many algorithms give the same output, but with different datatypes.
These character output matrices have a mix of integers and reals in every possible combination per record.
Every record has a fixed size (100 columns of ints and reals).
Value sizes are moderate (-10000...0...+10000); zero is a value too.
My program has to process every matrix and record in the same way.
This program is large, complex, and still in development.
Your suggestion to determine the input type first and run separate int/real parts isn't feasible with these combinations.
Billsincl's suggestion to convert (character) integers to reals and treat everything as real is what I did.
This is a workaround; actually I need to distinguish between int and real.
Hence my search for some combined datatype and for processing the proper datatype part on the fly.
I'm going to experiment with Jim's suggestions (Type-Union-Map...) and post the results.
Thanks all,
Clemens (davinci)
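
For comparison with the union approach, here is one possible sketch of a plain tagged type in standard Fortran that keeps 0 and 0.0 distinct. It is only an illustration under stated assumptions: the names tagged_number, both_num and parse_token are made up, the record is a short blank-separated example rather than 100 columns, and a token counts as real if it contains a decimal point or an exponent letter. It costs more memory per value than a 4-byte union and still needs one branch per value.

```fortran
module tagged_number
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  integer, parameter :: IS_INT = 1, IS_REAL = 2
  type :: both_num
    integer        :: tag  = IS_INT   ! which of the two fields is valid
    integer(int32) :: ival = 0
    real(real32)   :: rval = 0.0
  end type both_num
contains
  ! Decide from the text of one token, then read it into the matching field.
  function parse_token(tok) result(v)
    character(len=*), intent(in) :: tok
    type(both_num) :: v
    integer :: ios
    if (scan(tok, '.EeDd') > 0) then
      v%tag = IS_REAL
      read(tok, *, iostat=ios) v%rval
    else
      v%tag = IS_INT
      read(tok, *, iostat=ios) v%ival
    end if
    if (ios /= 0) error stop 'token is not a number'
  end function parse_token
end module tagged_number

program demo_record
  use tagged_number
  implicit none
  character(len=32) :: rec
  character(len=16) :: tok(6)
  type(both_num)    :: v(6)
  integer :: i, n_int, n_real

  rec = '12 -7.5 0 0.0 10000 -3.'
  read(rec, *) tok                  ! split the record into six text tokens

  n_int = 0
  n_real = 0
  do i = 1, size(v)
    v(i) = parse_token(tok(i))
    select case (v(i)%tag)          ! the one branch per value
    case (IS_INT)
      n_int = n_int + 1
    case (IS_REAL)
      n_real = n_real + 1
    end select
  end do

  if (n_int /= 3 .or. n_real /= 3) error stop 'unexpected counts'
  ! The third token is 0 and the fourth is 0.0; they keep different tags.
  if (v(3)%tag /= IS_INT .or. v(4)%tag /= IS_REAL) error stop '0 vs 0.0'
  print *, 'integers:', n_int, ' reals:', n_real
end program demo_record
```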