<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Assuming you're writing 64bit in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145390#M7778</link>
    <description>&lt;P&gt;Assuming you're writing 64-bit code, floats are stored in xmm registers anyway.&lt;/P&gt;&lt;P&gt;So really what you want is a vector register shuffle that moves the floating-point value into the bottom element of the vector register, and then to use that register in scalar mode.&lt;/P&gt;&lt;P&gt;See doug65536's answer here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float" target="_blank"&gt;https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So something like:&lt;/P&gt;&lt;P&gt;template &amp;lt;int i&amp;gt; float get() const noexcept { return _mm_cvtss_f32(_mm_shuffle_ps(xmm_, xmm_, _MM_SHUFFLE(0, 0, 0, i))); }&lt;/P&gt;</description>
    <pubDate>Mon, 04 Feb 2019 15:23:00 GMT</pubDate>
    <dc:creator>Richard_Nutman</dc:creator>
    <dc:date>2019-02-04T15:23:00Z</dc:date>
    <item>
      <title>_mm_extract_ps returns int (for a long long time)</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145389#M7777</link>
      <description>&lt;P&gt;Hello.&lt;/P&gt;&lt;P&gt;For many years, a lot of programmers have seen this issue as bad design or a bug, but the problem is still there.&lt;/P&gt;&lt;P&gt;Why does _mm_extract_ps return int? The intrinsics follow a naming convention in which the _ps and _epi32 suffixes denote float and int types respectively. We have _mm_extract_epi32, which maps to the pextrd instruction and returns int. Yet _mm_extract_ps, which maps to extractps, also returns int. Why? Will somebody fix it some day?&lt;/P&gt;&lt;P&gt;I want to write code like&lt;/P&gt;
&lt;PRE class="brush:; class-name:dark;"&gt;template &amp;lt;int i&amp;gt; float get() const noexcept { return _mm_extract_ps(xmm_, i); }&lt;/PRE&gt;

&lt;P&gt;and not like&lt;/P&gt;

&lt;PRE class="brush:; class-name:dark;"&gt;template &amp;lt;int i&amp;gt; float get() const noexcept {
    int v = _mm_extract_ps(xmm_, i);
    float f;
    memcpy(&amp;amp;f, &amp;amp;v, sizeof(v)); // standard recommended cross-compiler type-punning for c++
    return f;
}&lt;/PRE&gt;

&lt;P&gt;P.S. Maybe somebody can also explain why we need both the extractps and pextrd assembly instructions when technically they do the same thing? I don't think they change any flags or perform any checks. As it stands, I can't see any difference from&lt;/P&gt;

&lt;PRE class="brush:cpp; class-name:dark; wrap-lines:false;"&gt;int _mm_extract_ps(__m128 xmm, int i) { return _mm_extract_epi32(_mm_castps_si128(xmm), i); }&lt;/PRE&gt;

&lt;P&gt;Best regards, Vyacheslav&lt;/P&gt;</description>
      <pubDate>Sat, 02 Feb 2019 13:50:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145389#M7777</guid>
      <dc:creator>Meshkov__Vyacheslav</dc:creator>
      <dc:date>2019-02-02T13:50:52Z</dc:date>
    </item>
    <item>
      <title>Assuming you're writing 64bit</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145390#M7778</link>
      <description>&lt;P&gt;Assuming you're writing 64-bit code, floats are stored in xmm registers anyway.&lt;/P&gt;&lt;P&gt;So really what you want is a vector register shuffle that moves the floating-point value into the bottom element of the vector register, and then to use that register in scalar mode.&lt;/P&gt;&lt;P&gt;See doug65536's answer here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float" target="_blank"&gt;https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So something like:&lt;/P&gt;&lt;P&gt;template &amp;lt;int i&amp;gt; float get() const noexcept { return _mm_cvtss_f32(_mm_shuffle_ps(xmm_, xmm_, _MM_SHUFFLE(0, 0, 0, i))); }&lt;/P&gt;</description>
      <pubDate>Mon, 04 Feb 2019 15:23:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145390#M7778</guid>
      <dc:creator>Richard_Nutman</dc:creator>
      <dc:date>2019-02-04T15:23:00Z</dc:date>
    </item>
    <item>
      <title>Sorry but please no such</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145391#M7779</link>
      <description>&lt;P&gt;Sorry, but please no such assumptions. I need the SIMD code to work on both x86 and x64, across compilers and platforms (Windows, Linux, macOS).&lt;/P&gt;&lt;P&gt;Thank you for the link anyway. I found that _MM_EXTRACT_FLOAT is the official solution, which is pretty interesting and funny. To me it looks like bad design. I still wonder what the reason for this solution is.&lt;/P&gt;&lt;P&gt;I don't think that using port 5 is a good idea anyway. Maybe a shift-based solution is simpler and faster for the CPU to perform:&lt;/P&gt;
&lt;PRE class="brush:; class-name:dark;"&gt;template&amp;lt;int i&amp;gt; [[nodiscard]] float __vectorcall _mm_get_ps(__m128 v) {
    return _mm_cvtss_f32(_mm_castsi128_ps(_mm_srli_si128(_mm_castps_si128(v), i * 4)));
}

&lt;/PRE&gt;</description>
      <pubDate>Mon, 04 Feb 2019 16:42:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/mm-extract-ps-returns-int-for-a-long-long-time/m-p/1145391#M7779</guid>
      <dc:creator>Meshkov__Vyacheslav</dc:creator>
      <dc:date>2019-02-04T16:42:00Z</dc:date>
    </item>
  </channel>
</rss>

