<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic TRANSFER function seems to cause underflow when /arch:ia32 and /fpe:1 are set in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781847#M28129</link>
    <description>&lt;P&gt;We use the TRANSFER function to move integers into a real array and then later pull them back out as integers. Normally, we get back out what we put in, but with the switches /debug:full /arch:ia32 /fpe:1 it seems that the TRANSFER of an integer into the real causes an underflow, which then sets it to 0 due to the /fpe:1 switch.&lt;BR /&gt;&lt;BR /&gt;We don't see this happen with any other /arch: value. We are attempting to use the /arch:ia32 switch to build a non-processor-specific version of our code.&lt;BR /&gt;&lt;BR /&gt;Here's a simple test showing the problem, built with /arch:ia32 and /arch:pn1.&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;BR /&gt;John&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;type transfer_test.f90&lt;BR /&gt;PROGRAM transfer_test&lt;/P&gt;&lt;P&gt;integer nx,i&lt;BR /&gt;real xxx&lt;/P&gt;&lt;P&gt;nx = 10&lt;BR /&gt;xxx = TRANSFER(0,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;BR /&gt;xxx = TRANSFER(10,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;BR /&gt;xxx = TRANSFER(nx,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;/P&gt;&lt;P&gt;STOP&lt;BR /&gt;END&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;D:\jdl&amp;gt;ifort /debug:full /arch:ia32 /fpe:1 transfer_test.f90&lt;BR /&gt;Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package&lt;BR /&gt;ID: w_cprof_p_11.1.072&lt;BR /&gt;Copyright (C) 1985-2010 Intel Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;Microsoft  Incremental Linker Version 9.00.30729.01&lt;BR /&gt;Copyright (C) Microsoft Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;-out:transfer_test.exe&lt;BR /&gt;-debug&lt;BR /&gt;-pdb:transfer_test.pdb&lt;BR /&gt;-subsystem:console&lt;BR /&gt;transfer_test.obj&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;transfer_test.exe&lt;BR /&gt; 0&lt;BR /&gt; 0&lt;BR /&gt; 0&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;ifort /debug:full /arch:pn1 /fpe:1 transfer_test.f90&lt;BR /&gt;Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package&lt;BR /&gt;ID: w_cprof_p_11.1.072&lt;BR /&gt;Copyright (C) 1985-2010 Intel Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;Microsoft  Incremental Linker Version 9.00.30729.01&lt;BR /&gt;Copyright (C) Microsoft Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;-out:transfer_test.exe&lt;BR /&gt;-debug&lt;BR /&gt;-pdb:transfer_test.pdb&lt;BR /&gt;-subsystem:console&lt;BR /&gt;transfer_test.obj&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;transfer_test.exe&lt;BR /&gt; 0&lt;BR /&gt; 10&lt;BR /&gt; 10&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;&lt;/P&gt;</description>
    <pubDate>Wed, 27 Jul 2011 17:55:15 GMT</pubDate>
    <dc:creator>John_Leonard</dc:creator>
    <dc:date>2011-07-27T17:55:15Z</dc:date>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and /fpe:1 are set</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781847#M28129</link>
      <description>&lt;P&gt;We use the TRANSFER function to move integers into a real array and then later pull them back out as integers. Normally, we get back out what we put in, but with the switches /debug:full /arch:ia32 /fpe:1 it seems that the TRANSFER of an integer into the real causes an underflow, which then sets it to 0 due to the /fpe:1 switch.&lt;BR /&gt;&lt;BR /&gt;We don't see this happen with any other /arch: value. We are attempting to use the /arch:ia32 switch to build a non-processor-specific version of our code.&lt;BR /&gt;&lt;BR /&gt;Here's a simple test showing the problem, built with /arch:ia32 and /arch:pn1.&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;BR /&gt;John&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;type transfer_test.f90&lt;BR /&gt;PROGRAM transfer_test&lt;/P&gt;&lt;P&gt;integer nx,i&lt;BR /&gt;real xxx&lt;/P&gt;&lt;P&gt;nx = 10&lt;BR /&gt;xxx = TRANSFER(0,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;BR /&gt;xxx = TRANSFER(10,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;BR /&gt;xxx = TRANSFER(nx,xxx)&lt;BR /&gt;i = TRANSFER(xxx,i)&lt;BR /&gt;write (6,*) i&lt;/P&gt;&lt;P&gt;STOP&lt;BR /&gt;END&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;D:\jdl&amp;gt;ifort /debug:full /arch:ia32 /fpe:1 transfer_test.f90&lt;BR /&gt;Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package&lt;BR /&gt;ID: w_cprof_p_11.1.072&lt;BR /&gt;Copyright (C) 1985-2010 Intel Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;Microsoft  Incremental Linker Version 9.00.30729.01&lt;BR /&gt;Copyright (C) Microsoft Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;-out:transfer_test.exe&lt;BR /&gt;-debug&lt;BR /&gt;-pdb:transfer_test.pdb&lt;BR /&gt;-subsystem:console&lt;BR /&gt;transfer_test.obj&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;transfer_test.exe&lt;BR /&gt; 0&lt;BR /&gt; 0&lt;BR /&gt; 0&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;ifort /debug:full /arch:pn1 /fpe:1 transfer_test.f90&lt;BR /&gt;Intel Visual Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20101201 Package&lt;BR /&gt;ID: w_cprof_p_11.1.072&lt;BR /&gt;Copyright (C) 1985-2010 Intel Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;Microsoft  Incremental Linker Version 9.00.30729.01&lt;BR /&gt;Copyright (C) Microsoft Corporation. All rights reserved.&lt;/P&gt;&lt;P&gt;-out:transfer_test.exe&lt;BR /&gt;-debug&lt;BR /&gt;-pdb:transfer_test.pdb&lt;BR /&gt;-subsystem:console&lt;BR /&gt;transfer_test.obj&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;transfer_test.exe&lt;BR /&gt; 0&lt;BR /&gt; 10&lt;BR /&gt; 10&lt;/P&gt;&lt;P&gt;D:\jdl&amp;gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2011 17:55:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781847#M28129</guid>
      <dc:creator>John_Leonard</dc:creator>
      <dc:date>2011-07-27T17:55:15Z</dc:date>
    </item>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781848#M28130</link>
      <description>Yep - I can believe it. When you use /arch:IA32, the compiler uses the x87 FLD and FSTP instructions to move the result of the TRANSFER into the variable. Since the bit pattern of an integer 10 looks like a denormalized value, it gets flushed to zero with /fpe:1.&lt;BR /&gt;&lt;BR /&gt;/arch:pn1 is effectively /arch:SSE2 in the 11.1 compiler, and it uses MOVSS instructions that don't have this effect. Generally, using reals to store non-real data leaves you open to the data changing. Another possible change: if the value "looks like" a signaling NaN, the FLD will change it to a quiet NaN, flipping a bit.&lt;BR /&gt;&lt;BR /&gt;In other words, don't do this.</description>
      <pubDate>Wed, 27 Jul 2011 18:49:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781848#M28130</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2011-07-27T18:49:55Z</dc:date>
    </item>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781849#M28131</link>
      <description>If your REAL(4) numbers are positive integers, or if you want rounded values of positive reals, then consider&lt;BR /&gt;&lt;BR /&gt;real(4), parameter :: Bias = 2**23&lt;BR /&gt;integer(4), parameter :: MantissaMask = Z'007FFFFF'&lt;BR /&gt;...&lt;BR /&gt;iArray(i) = IAND(TRANSFER((Array(i) + Bias), i), MantissaMask)&lt;BR /&gt;&lt;BR /&gt;Array(i) = TRANSFER(IOR(iArray(i), TRANSFER(Bias, i)), Bias) - Bias&lt;BR /&gt;&lt;BR /&gt;I haven't checked the code generation. The optimizer should be able to reduce the first statement to a load, add, and, store, and it may be vectorizable provided IAND and TRANSFER are recognized as vectorizable in this case. It should be straightforward to write an SSE3 C helper routine to do this 4 floats at a time. The second statement should be a load, or, subtract, store.&lt;BR /&gt;&lt;BR /&gt;Handling signed numbers and truncation vs. rounding can be easily added.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Wed, 27 Jul 2011 20:35:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781849#M28131</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-07-27T20:35:38Z</dc:date>
    </item>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781850#M28132</link>
      <description>Actually, I think we're just using TRANSFER like we used to use equivalence or map structures. I don't know why we're using a general storage area defined as real but I'm sure there's a very good reason!&lt;BR /&gt;&lt;BR /&gt;However, we're seeing some other odd behavior with the combination of /arch:ia32 and /fpe:1. If we use /fpe:3 all seems well.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;John</description>
      <pubDate>Wed, 27 Jul 2011 21:48:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781850#M28132</guid>
      <dc:creator>John_Leonard</dc:creator>
      <dc:date>2011-07-27T21:48:18Z</dc:date>
    </item>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781851#M28133</link>
      <description>&amp;gt;&amp;gt;If we use /fpe:3 all seems well.&lt;BR /&gt;&lt;BR /&gt;Be careful, what seems well to you now may blow up for the next person later.&lt;BR /&gt;&lt;BR /&gt;If at a later date your successor adds code to manipulate these integer bit patterns in a real array (as reals), then these numbers will be treated as denormalized FP values when the integer is positive and less than 2**23, or may be treated as SNaN or QNaN when negative, or as some other reserved FP value for other bit patterns. And if you are not going to manipulate these numbers (other than a binary write), try to avoid storing them in a REAL array at all.&lt;BR /&gt;&lt;BR /&gt;The code I presented earlier (adding a Bias of 2**23 at conversion from integer to real, for positive integers in the range 0 to 2**23-1) will permit you to manipulate the numbers as reals without bunging up the value, but does require removing the bias on conversion from real back to integer.&lt;BR /&gt;&lt;BR /&gt;If you have a large array to convert, then I suggest you write a C/C++ function to perform the conversion, since you can then ensure that SSE instructions are used. Something like this in your loop:&lt;BR /&gt;&lt;BR /&gt;_mm_storeu_si128(&lt;BR /&gt; (__m128i*)&amp;amp;rArray[i], // output array address&lt;BR /&gt; _mm_add_epi32(&lt;BR /&gt; _mm_loadu_si128((__m128i*)&amp;amp;iArray[i]), // load 4 ints from the input array&lt;BR /&gt; BiasAs4int32)); // 4-up bit pattern of 2**23&lt;BR /&gt;&lt;BR /&gt;The above can be reduced to 3 instructions to convert 4 ints to floats.&lt;BR /&gt;&lt;BR /&gt;To convert the other way (real to integer) you could use subtract or and.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Thu, 28 Jul 2011 16:30:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781851#M28133</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-07-28T16:30:37Z</dc:date>
    </item>
    <item>
      <title>TRANSFER function seems to cause underflow when /arch:ia32 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781852#M28134</link>
      <description>Jim, thanks for the feedback. After looking more closely at TRANSFER and how it behaves with different parameters, I don't think we're using it correctly for what we intend to do. So I have to agree with Steve's advice - "Don't do that!" - as we really don't want these int -&amp;gt; real -&amp;gt; int conversions happening, and we can see the unpredictable side effects depending on compile options.&lt;BR /&gt;&lt;BR /&gt;We need to fix our code.&lt;BR /&gt;&lt;BR /&gt;-John</description>
      <pubDate>Thu, 28 Jul 2011 19:09:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/TRANSFER-function-seems-to-cause-underflow-when-arch-ia32-and/m-p/781852#M28134</guid>
      <dc:creator>John_Leonard</dc:creator>
      <dc:date>2011-07-28T19:09:37Z</dc:date>
    </item>
  </channel>
</rss>

