<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic There certainly have been CPU in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095846#M7262</link>
    <description>&lt;P&gt;There certainly have been CPU models where the built-in string moves didn't qualify as "fast," so Intel compiler developers devoted significant effort to make their compilers choose well. More recent CPUs were designed to overcome performance deficits associated with legacy choices.&amp;nbsp; It's reasonable to hope that clearly written portable source will be optimized adequately until performance profiling shows otherwise.&lt;/P&gt;

&lt;P&gt;Intel also devoted effort to fixing obvious deficiencies in the memmove/memcpy/memset provided by the OS, so there are fewer problems there than in the past.&amp;nbsp; When using compilers other than Intel's, you may need to call such functions explicitly if you wish to engage automatic run-time selection of streaming/nontemporal stores.&lt;/P&gt;</description>
    <pubDate>Mon, 15 Aug 2016 15:11:10 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2016-08-15T15:11:10Z</dc:date>
    <item>
      <title>Fast-string operation and Non-temporal access</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095845#M7261</link>
      <description>&lt;P&gt;Dear Experts,&lt;/P&gt;

&lt;P&gt;Modern CPUs offer both fast-string operations (REP MOVSB/STOSB) and non-temporal access (NTA). Which one do you prefer for memory copy/fill (without considering other DMA resources in the system)?&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Jeremy&lt;/P&gt;</description>
      <pubDate>Sat, 13 Aug 2016 06:16:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095845#M7261</guid>
      <dc:creator>JWong19</dc:creator>
      <dc:date>2016-08-13T06:16:40Z</dc:date>
    </item>
    <item>
      <title>There certainly have been CPU</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095846#M7262</link>
      <description>&lt;P&gt;There certainly have been CPU models where the built-in string moves didn't qualify as "fast," so Intel compiler developers devoted significant effort to make their compilers choose well. More recent CPUs were designed to overcome performance deficits associated with legacy choices.&amp;nbsp; It's reasonable to hope that clearly written portable source will be optimized adequately until performance profiling shows otherwise.&lt;/P&gt;

&lt;P&gt;Intel also devoted effort to fixing obvious deficiencies in the memmove/memcpy/memset provided by the OS, so there are fewer problems there than in the past.&amp;nbsp; When using compilers other than Intel's, you may need to call such functions explicitly if you wish to engage automatic run-time selection of streaming/nontemporal stores.&lt;/P&gt;</description>
      <pubDate>Mon, 15 Aug 2016 15:11:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095846#M7262</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2016-08-15T15:11:10Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;...Which one do you prefer</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095847#M7263</link>
      <description>&amp;gt;&amp;gt;...Which one do you prefer for memory copy/fill (without considering other DMA resources in the system)?

Let me answer as generically as possible...

1. If, in &lt;STRONG&gt;Use Case A&lt;/STRONG&gt;, function A is faster than function B, then use function A.

2. If, in &lt;STRONG&gt;Use Case B&lt;/STRONG&gt;, function B is faster than function A, then use function B.

3. If one function is &lt;STRONG&gt;always&lt;/STRONG&gt; faster than the other, then use that function.

and so on.</description>
      <pubDate>Wed, 14 Sep 2016 23:55:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095847#M7263</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-09-14T23:55:49Z</dc:date>
    </item>
    <item>
      <title>If the strings are indeed</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095848#M7264</link>
      <description>&lt;P&gt;If the strings are indeed byte strings at arbitrary byte offsets in both source and destination, and if they are relatively short (as a rough guess, less than 256 bytes), then rep movsb/stosb may, questionably, be a good choice. You will have to run some tests. And because you will use those tests to make a decision, be sure they are set up to give representative results for the situations you actually encounter (in other words, not a contrived prove-your-point test).&lt;/P&gt;

&lt;P&gt;FWIW I do agree that some optimization effort should be made to favor rep movsb/stosb over using "gobs" of registers (and avoid save/restore or discarding values).&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 12:38:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Fast-string-operation-and-Non-temporal-access/m-p/1095848#M7264</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2016-09-15T12:38:56Z</dc:date>
    </item>
  </channel>
</rss>

