<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Isaias, what are the UDFs? in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969055#M2773</link>
    <description>&lt;P&gt;Hi Isaias,&lt;/P&gt;
&lt;P&gt;What are the UDFs?&lt;/P&gt;
&lt;P&gt;Regarding your cache-disabling experiment, I think that at the very least you will be able to test speculative and out-of-order execution of non-dependent code while your CPU is waiting for data to arrive.&lt;/P&gt;</description>
    <pubDate>Thu, 08 Aug 2013 17:27:00 GMT</pubDate>
    <dc:creator>Bernard</dc:creator>
    <dc:date>2013-08-08T17:27:00Z</dc:date>
    <item>
      <title>Preventing cache on Intel i7 Sandy Bridge</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969048#M2766</link>
      <description>&lt;P&gt;We are trying to disable the memory cache by following chapter 11 of the Intel Developer's Manual (&lt;A href="http://goo.gl/ufvzA"&gt;http://goo.gl/ufvzA&lt;/A&gt;), setting the CD bit in control register CR0 while the kernel is booting. We have some doubts about our code. It does have the effect of slowing down execution. We're using 64-bit Debian Linux. Does it disable the cache on all cores or only one? Does it disable all cache levels (L1, L2 and L3) or only one? Any ideas?&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;/* This code (GAS syntax) was inserted into main.c, which runs while the Linux kernel is booting */&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("push %rax"); // save rax&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("cli"); // disable interrupts while we do this&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("movq %cr0, %rax"); // read CR0&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("or $0x40000000, %rax"); // set the CD bit (bit 30) of CR0; NW is left unchanged&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("movq %rax, %cr0"); // caching is now disabled&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("wbinvd"); // write back and invalidate the caches&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("or $0x20000000, %rax"); // now set the NW bit (bit 29)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("movq %rax, %cr0"); // turn off the cache entirely&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;asm("pop %rax"); // restore rax&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 05 Aug 2013 17:20:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969048#M2766</guid>
      <dc:creator>Isaias_Z_</dc:creator>
      <dc:date>2013-08-05T17:20:46Z</dc:date>
    </item>
    <item>
      <title>Hello Isaias,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969049#M2767</link>
      <description>&lt;P&gt;Hello Isaias,&lt;/P&gt;
&lt;P&gt;Call me chicken but I don't want to click on the URL you've cited.&lt;/P&gt;
&lt;P&gt;Can you give a manual name,&amp;nbsp;manual date&amp;nbsp;and section and maybe a&amp;nbsp;page number for your reference?&lt;/P&gt;
&lt;P&gt;The less I have to waste time figuring out what you are talking about, the more likely you are to get an answer.&lt;/P&gt;
&lt;P&gt;Pat&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 05 Aug 2013 19:23:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969049#M2767</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2013-08-05T19:23:11Z</dc:date>
    </item>
    <item>
      <title>Thanks for asking, The manual</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969050#M2768</link>
      <description>&lt;P&gt;Thanks for asking. The manual is:&lt;/P&gt;
&lt;P&gt;Intel® 64 and IA-32 Architectures&lt;BR /&gt;Software Developer’s Manual&lt;BR /&gt;Volume 3A:&lt;BR /&gt;System Programming Guide, Part 1&lt;/P&gt;
&lt;P&gt;In particular chapter 11.&lt;/P&gt;
</description>
      <pubDate>Tue, 06 Aug 2013 02:48:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969050#M2768</guid>
      <dc:creator>Isaias_Z_</dc:creator>
      <dc:date>2013-08-06T02:48:03Z</dc:date>
    </item>
    <item>
      <title>Thanks,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969051#M2769</link>
      <description>&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;From Table 11-5, it looks to me like setting the CD bit (and clearing NW) in CR0 doesn't disable any of the caches (that is, L1, L2 and L3, if present, are still active).&lt;/P&gt;
&lt;P&gt;But unless a memory address is already in the cache before you set the CD bit, all reads and writes to it will go to memory.&lt;/P&gt;
&lt;P&gt;So I would expect to see a massive slowdown. Atom chips are apparently slightly different. Atom (it is not really clear to me) seems to just disable any caching of data in L1/L2/L3 when CD=1.&lt;/P&gt;
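<![CDATA[<P>A minimal sketch of how the CD and NW bits combine, going by Table 11-5. The enum and function names are hypothetical, and the mode labels are a paraphrase rather than the manual's wording.</P>

```c
#include <assert.h>
#include <stdint.h>

/* CR0 cache-control flags: NW is bit 29, CD is bit 30. */
#define CR0_NW (UINT64_C(1) << 29) /* Not Write-through */
#define CR0_CD (UINT64_C(1) << 30) /* Cache Disable */

/* Hypothetical names for the four CD/NW combinations. */
enum cache_mode {
    MODE_NORMAL = 0,     /* CD=0, NW=0: normal caching */
    MODE_INVALID = 1,    /* CD=0, NW=1: invalid, writing it to CR0 raises #GP */
    MODE_NO_FILL = 2,    /* CD=1, NW=0: no-fill mode, hits still use the cache */
    MODE_NO_FILL_NW = 3  /* CD=1, NW=1: no-fill mode with NW also set */
};

/* Classify a CR0 value by its CD and NW bits only. */
static enum cache_mode cr0_cache_mode(uint64_t cr0)
{
    int cd = (cr0 & CR0_CD) != 0;
    int nw = (cr0 & CR0_NW) != 0;
    return (enum cache_mode)(cd * 2 + nw);
}

/* Short human-readable label for a mode (paraphrased, not quoted). */
static const char *cache_mode_name(enum cache_mode mode)
{
    static const char *const names[] = {
        "normal caching",
        "invalid combination",
        "no-fill mode",
        "no-fill mode, NW set",
    };
    assert((unsigned)mode <= (unsigned)MODE_NO_FILL_NW);
    return names[mode];
}
```

<P>For example, with 0x80050033 as an illustrative CR0 value, cr0_cache_mode(0x80050033) gives MODE_NORMAL, and OR-ing in 0x40000000 (the CD bit) gives MODE_NO_FILL.</P>]]>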
&lt;P&gt;Is this the same as your understanding and are you seeing a massive slowdown?&lt;/P&gt;
&lt;P&gt;Dare I ask why you would want to do this?&lt;/P&gt;
&lt;P&gt;Pat&lt;/P&gt;</description>
      <pubDate>Tue, 06 Aug 2013 03:22:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969051#M2769</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2013-08-06T03:22:05Z</dc:date>
    </item>
    <item>
      <title>Why do you want to disable</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969052#M2770</link>
      <description>&lt;P&gt;Why do you want to disable the cache? Do you want to compare cached and uncached performance?&lt;/P&gt;</description>
      <pubDate>Tue, 06 Aug 2013 09:20:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969052#M2770</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2013-08-06T09:20:49Z</dc:date>
    </item>
    <item>
      <title>Thanks, Patrick and</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969053#M2771</link>
      <description>&lt;P&gt;Thanks, Patrick and Iliyapolak.&lt;BR /&gt;The reason I want to switch off the cache is that I'm testing some UDFs from the MKL library in PostgreSQL. The docs say the functions are tuned to use the cache to improve performance, so I want to see the difference between running with and without the cache, disabling it in software. It is important to me to know exactly which caches are disabled (on 1 core or on all 8).&lt;BR /&gt;I do indeed get a slowdown, but I want to know what is really happening. I'm reading the docs, checking the CPU info, etc.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Aug 2013 18:21:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969053#M2771</guid>
      <dc:creator>Isaias_Z_</dc:creator>
      <dc:date>2013-08-06T18:21:16Z</dc:date>
    </item>
    <item>
      <title>I think that you are thinking</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969054#M2772</link>
      <description>&lt;P&gt;I think that you are thinking that you are disabling L1/L2 and/or L3. You aren't really. You are disabling the use of those caches. The system will behave as if you didn't have L1/L2/L3. Maybe this is a distinction without a difference.&lt;/P&gt;
&lt;P&gt;I think that the MKL logic blocks the data that is fetched from memory so that MKL gets maximum reuse of the fetched data before it is bumped out of the cache.&lt;/P&gt;
&lt;P&gt;If you ran a memory latency test on the system with a range of array sizes (such as sizes that should fit into L1, L2 or L3, if you have an L3), then you should see the same latency as for an array that is much larger than your last-level cache. That is, accesses that would normally be 'in cache' should be as slow as accesses that go to memory.&lt;/P&gt;
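<![CDATA[<P>A minimal sketch of such a latency test, assuming a POSIX system with clock_gettime. It chases indices through a randomly linked array so that every load depends on the previous one. The function names, the use of rand() and the sizes mentioned afterwards are assumptions for illustration, not a calibrated benchmark.</P>

```c
#define _POSIX_C_SOURCE 199309L /* for clock_gettime and CLOCK_MONOTONIC */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Link the n slots of `next` into one random cycle (Sattolo's algorithm),
 * so a walk from slot 0 visits every slot once before returning to 0.
 * rand() is good enough for a sketch; a small RAND_MAX only makes the
 * cycle less random, it is still a single cycle. */
static void build_cycle(size_t *next, size_t n)
{
    assert(n > 1);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i; /* j is in [0, i-1] */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Perform `steps` dependent loads starting at slot 0; return the last slot. */
static size_t chase(const size_t *next, size_t steps)
{
    size_t p = 0;
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    return p;
}

/* Average nanoseconds per load for a working set of n slots.
 * Returns a negative value if the allocation fails. */
static double ns_per_load(size_t n, size_t steps)
{
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1.0;
    build_cycle(next, n);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    volatile size_t sink = chase(next, steps); /* keep the result live */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    free(next);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

<P>Calling ns_per_load with working sets of, say, 16 KiB, 128 KiB, 4 MiB and 256 MiB (2048, 16384, 524288 and 33554432 slots when size_t is 8 bytes) would normally show the latency stepping up at each cache level. With CD set, the expectation is that all of them look like the largest one.</P>]]>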
&lt;P&gt;I'm not sure that disabling caching is a very good test of the MKL lib. By disabling ALL caching, you are mostly seeing "how slow can we make the chip go". A better test might be to measure how much more (or less) memory bandwidth is used before and after the UDFs. (I don't really know what UDFs are... I assume they are patches or updates to the library.)&lt;/P&gt;
&lt;P&gt;Pat&lt;/P&gt;</description>
      <pubDate>Tue, 06 Aug 2013 18:48:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969054#M2772</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2013-08-06T18:48:28Z</dc:date>
    </item>
    <item>
      <title>Hi Isaias, what are the UDFs?</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969055#M2773</link>
      <description>&lt;P&gt;Hi Isaias,&lt;/P&gt;
&lt;P&gt;What are the UDFs?&lt;/P&gt;
&lt;P&gt;Regarding your cache-disabling experiment, I think that at the very least you will be able to test speculative and out-of-order execution of non-dependent code while your CPU is waiting for data to arrive.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Aug 2013 17:27:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969055#M2773</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2013-08-08T17:27:00Z</dc:date>
    </item>
    <item>
      <title>Thanks, Iliyapolak. In SQL</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969056#M2774</link>
      <description>&lt;P&gt;Thanks, Iliyapolak.&lt;BR /&gt;In SQL databases, a user-defined function (UDF) provides a mechanism for extending the functionality of the DBMS by adding a function that can be evaluated in SQL statements.&lt;/P&gt;
&lt;P&gt;Now I want to measure the memory bandwidth, following your good tip, but I can't find any way to do it; probably that is an effect of running the code.&lt;BR /&gt;There are tools that report memory bandwidth, but I believe they read the factory value, not the actual rate during execution.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Aug 2013 14:43:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969056#M2774</guid>
      <dc:creator>Isaias_Z_</dc:creator>
      <dc:date>2013-08-09T14:43:09Z</dc:date>
    </item>
    <item>
      <title>Hi again!</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969057#M2775</link>
      <description>&lt;P&gt;Hi again!&lt;BR /&gt;&lt;BR /&gt;Regarding the tests I'm doing, you told me I can test the "speculative and out-of-order execution of non-dependent code when your CPU will be waiting for data arrival". In other words, does this mean that what I'm doing will show roughly how long it takes to run programs in a non-efficient order while waiting for data to arrive? The tools I found for measuring memory bandwidth (e.g. GPU-Z) give a fixed value, not a dynamic one. What can I do, as you suggested earlier, to measure the memory bandwidth with and without the CD bit set in control register CR0 at kernel boot?&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Aug 2013 17:04:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Preventing-cache-on-Intel-i7-Sandy-Bridge/m-p/969057#M2775</guid>
      <dc:creator>Isaias_Z_</dc:creator>
      <dc:date>2013-08-09T17:04:29Z</dc:date>
    </item>
  </channel>
</rss>

