We use the TRANSFER function to move integers into a real array and then later pull them back out as integers. Normally, we get back out what we put in, but with the switches /debug:full /arch:ia32 /fpe:1 it seems that the TRANSFER of an integer into the real causes an underflow, which then sets the value to 0 because of the /fpe:1 switch.
We don't see this happen with any other /arch: value. We are attempting to use the /arch:ia32 switch to build a non-processor specific version of our code.
Here's a simple test showing the problem, built first with /arch:ia32 and then with /arch:pn1:
Thanks
John
D:\jdl>type transfer_test.f90
PROGRAM transfer_test
integer nx,i
real xxx
nx = 10
xxx = TRANSFER(0,xxx)
i = TRANSFER(xxx,i)
write (6,*) i
xxx = TRANSFER(10,xxx)
i = TRANSFER(xxx,i)
write (6,*) i
xxx = TRANSFER(nx,xxx)
i = TRANSFER(xxx,i)
write (6,*) i
STOP
END
D:\jdl>ifort /debug:full /arch:ia32 /fpe:1 transfer_test.f90
Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package
ID: w_cprof_p_11.1.072
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
Microsoft Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
-out:transfer_test.exe
-debug
-pdb:transfer_test.pdb
-subsystem:console
transfer_test.obj
D:\jdl>transfer_test.exe
0
0
0
D:\jdl>ifort /debug:full /arch:pn1 /fpe:1 transfer_test.f90
Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package
ID: w_cprof_p_11.1.072
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
Microsoft Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
-out:transfer_test.exe
-debug
-pdb:transfer_test.pdb
-subsystem:console
transfer_test.obj
D:\jdl>transfer_test.exe
0
10
10
D:\jdl>
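/arch:pn1 is effectively /arch:SSE2 in the 11.1 compiler, and it uses MOVSS instructions, which copy the 32 bits unchanged and so don't have this effect. With /arch:ia32 the value goes through the x87 unit instead, where the bit pattern of a small integer such as 10 looks like a denormal. Generally, using reals to store non-real data leaves you open to the data changing. Another possible change: if the value "looks like" a signaling NaN, the FLD will change it to a quiet NaN, flipping a bit.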
In other words, don't do this.
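To make the point above concrete, here is a minimal C sketch that inspects a 32-bit pattern using integer operations only, so nothing is ever loaded as a float. The helper names looks_like_denormal and looks_like_snan are hypothetical, and the check assumes IEEE 754 single precision with the x86 convention that the top mantissa bit marks a quiet NaN.

```c
#include <assert.h>
#include <stdint.h>

/* IEEE 754 single: 1 sign bit, 8 exponent bits, 23 mantissa bits. */
#define EXP_MASK  0x7F800000u
#define MANT_MASK 0x007FFFFFu
#define QUIET_BIT 0x00400000u  /* top mantissa bit; set means quiet NaN on x86 */

/* Hypothetical helper: exponent field zero and mantissa nonzero => denormal. */
static int looks_like_denormal(uint32_t bits)
{
    return (bits & EXP_MASK) == 0 && (bits & MANT_MASK) != 0;
}

/* Hypothetical helper: exponent all ones, mantissa nonzero, quiet bit clear
 * => signaling NaN, which an x87 FLD would turn into a quiet NaN. */
static int looks_like_snan(uint32_t bits)
{
    return (bits & EXP_MASK) == EXP_MASK
        && (bits & MANT_MASK) != 0
        && (bits & QUIET_BIT) == 0;
}
```

For example, looks_like_denormal(10u) is true, which is why the integer 10 is at risk when it passes through a real, while the pattern of the integer -1 (0xFFFFFFFF) is already a quiet NaN.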
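real(4), parameter :: Bias = 2.0**23                 ! 8388608.0; reals at this magnitude are spaced exactly 1.0 apart
integer(4), parameter :: MantissaMask = Z'007FFFFF'   ! low 23 bits (the mantissa) of an IEEE single
...
! real -> integer: after adding Bias, the mantissa bits hold the integer value
iArray(i) = IAND(TRANSFER(Array(i) + Bias, i), MantissaMask)
! integer -> real: OR in the bit pattern of Bias, then subtract Bias as a real
Array(i) = TRANSFER(IOR(iArray(i), TRANSFER(Bias, i)), Bias) - Bias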
I haven't checked the generated code. The optimizer should be able to reduce the first statement to a load, add, and, store, and it may be vectorizable provided IAND and TRANSFER are recognized as vectorizable in this case. It would be easy to write an SSE3 C helper routine that does this four floats at a time. The second statement should reduce to a load, or, subtract, store.
Handling signed numbers and truncation vs. rounding can be easily added.
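For readers more comfortable in C, the two Fortran statements above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names int_to_real and real_to_int are hypothetical, the values must be non-negative integers below 2**23, and reals are assumed to be IEEE 754 singles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BIAS       8388608.0f   /* 2**23 as a real */
#define BIAS_BITS  0x4B000000u  /* bit pattern of 2**23 as an IEEE single */
#define MANT_MASK  0x007FFFFFu  /* low 23 bits, the mantissa */

/* integer -> real: OR the integer into the mantissa of 2**23, then
 * subtract the bias as a real. Assumes 0 <= n < 2**23. */
static float int_to_real(int32_t n)
{
    uint32_t bits = (uint32_t)n | BIAS_BITS;
    float f;
    memcpy(&f, &bits, sizeof f);   /* bit copy, the C analogue of TRANSFER */
    return f - BIAS;
}

/* real -> integer: add the bias so the mantissa holds the integer value,
 * then mask the mantissa out. Assumes x is integer-valued, 0 <= x < 2**23. */
static int32_t real_to_int(float x)
{
    float biased = x + BIAS;
    uint32_t bits;
    memcpy(&bits, &biased, sizeof bits);
    return (int32_t)(bits & MANT_MASK);
}
```

The trick works because every real in the range 2**23 to 2**24 is an integer, so the sum 2**23 + n is exact and its mantissa field is simply n.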
Jim Dempsey
However, we're seeing some other odd behavior with the combination of /arch:ia32 and /fpe:1. If we use /fpe:3 all seems well.
Regards,
John
Be careful: what seems to work for you now may blow up for the next person later.
If at a later date your successor adds code that manipulates these integer bit patterns in the real array as reals, the numbers will be treated as denormalized FP values when the integer is positive and less than 2**23, and may be treated as an SNaN, a QNaN, or some other reserved FP value (depending on the value) when it is negative. And if you are not going to manipulate these numbers (other than in a binary write), try to remove the storage into a REAL array.
The code I presented earlier (adding a Bias of 2**23 when converting from integer to real, for non-negative integers in the range 0 to 2**23-1) will let you manipulate the numbers as reals without corrupting the value, but it does require removing the bias when converting from real back to integer.
If you have a large array to convert, then I suggest you write a C/C++ function to perform the conversion, since you can then be sure that SSE instructions are used. Something like this in your loop:
_mm_storeu_si128(
    (__m128i *)&rArray[i],                            // output array address
    _mm_add_epi32(
        _mm_loadu_si128((const __m128i *)&iArray[i]), // load 4 ints from the input array
        BiasAs4int32));                               // __m128i holding 4 copies of the bit pattern of 2**23
The above can be reduced to 3 instructions to convert 4 ints to floats.
To convert the other way (real to integer), you could use an integer subtract of the same bias pattern or an AND with the mantissa mask.
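A fuller sketch of such a helper, written in C with SSE2 intrinsics from emmintrin.h, might look like the following. The function names ints_to_biased_reals and biased_reals_to_ints are hypothetical, the code assumes an x86 target with SSE2 and inputs in the range 0 to 2**23-1, and the reals it produces still carry the 2**23 bias (each holds the value 2**23 + n).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIAS_BITS 0x4B000000  /* bit pattern of 2**23 as an IEEE single */
#define MANT_BITS 0x007FFFFF  /* low 23 bits, the mantissa */

/* int -> biased real: integer add of the bias pattern, 4 values at a time. */
static void ints_to_biased_reals(const int32_t *in, float *out, size_t n)
{
    const __m128i bias = _mm_set1_epi32(BIAS_BITS);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(in + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_add_epi32(v, bias));
    }
    for (; i < n; ++i) {               /* scalar tail for leftover elements */
        uint32_t bits = (uint32_t)in[i] + BIAS_BITS;
        memcpy(&out[i], &bits, sizeof bits);
    }
}

/* biased real -> int: AND with the mantissa mask, 4 values at a time. */
static void biased_reals_to_ints(const float *in, int32_t *out, size_t n)
{
    const __m128i mask = _mm_set1_epi32(MANT_BITS);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(in + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_and_si128(v, mask));
    }
    for (; i < n; ++i) {               /* scalar tail for leftover elements */
        uint32_t bits;
        memcpy(&bits, &in[i], sizeof bits);
        out[i] = (int32_t)(bits & MANT_BITS);
    }
}
```

Because the data moves only through integer SSE loads and stores, no x87 FLD ever sees the bit patterns, and the stored reals are ordinary normalized numbers rather than denormals.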
Jim Dempsey
We need to fix our code.
-John
