<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: i5-1335U P-core and E-core integer operations throughput in Mobile and Desktop Processors</title>
    <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632639#M77838</link>
    <description>&lt;P&gt;I'm going to try and wrap this up, as I need to move on to other tasks. One of my tasks was to document a procedure for evaluating hybrid CPU architectures like Arm big.LITTLE, Intel P-cores + E-cores, etc. I used Intel i5-1335U as it was the most recent design, readily available and with good driver support in Linux.&lt;/P&gt;&lt;P&gt;An anomaly came up during testing, where single-thread integer arithmetic throughput was on average: P-core 2.0 and E-core 3.7 instructions per cycle. Both types of cores have at least 4 separate integer units, hence IPC of around 3.5-4.0 was expected for both (taking into account the overhead for taking a branch and incrementing the loop counter).&lt;/P&gt;&lt;P&gt;The same benchmark was used on various Arm CPUs for which data sheets with the exact instruction IPC figures are publicly available. The benchmark results come very close to the official IPC figures for all operation types: load, store, integer and floating point. Unfortunately at the moment, the benchmark used is not available for wider distribution, hence I cannot share it with people.&lt;/P&gt;&lt;P&gt;My only conclusion is that the anomaly could be due to one of the following:&lt;/P&gt;&lt;P&gt;1. The IPC is deliberately throttled for a single P-core hardware thread in order to reserve the bandwidth/power for a second hardware thread. With 2 hardware threads, the combined IPC improves significantly.&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;2. There is an issue with instruction scheduling for a single P-core hardware thread. So maybe a microcode update could fix it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Whatever the reasons, the issue described in this thread has nothing to do with the core's operating frequency. I am also pretty confident it is not related to the benchmark used.&lt;/P&gt;</description>
    <pubDate>Sun, 22 Sep 2024 06:58:51 GMT</pubDate>
    <dc:creator>SadClouds</dc:creator>
    <dc:date>2024-09-22T06:58:51Z</dc:date>
    <item>
      <title>i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1622502#M76103</link>
      <description>&lt;P&gt;Hi, I'm performing the following test on Intel i5-1335U running Linux:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Set both P-cores and E-cores to a constant frequency of 600 MHz. No dynamic underclocking or overclocking.&amp;nbsp; For example:&lt;/P&gt;&lt;P&gt;# echo 600000 &amp;gt; /sys/devices/system/cpu/cpu11/cpufreq/scaling_min_freq&lt;/P&gt;&lt;P&gt;# echo 600000 &amp;gt; /sys/devices/system/cpu/cpu11/cpufreq/scaling_max_freq&lt;/P&gt;&lt;P&gt;# cat /sys/devices/system/cpu/cpu11/cpufreq/scaling_cur_freq&lt;BR /&gt;600000&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Perform integer addition/subtraction operations on a P-core or E-core using a single thread. Thread affinity is used to make sure the thread is running on a specific core.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm noticing that at the same clock frequency, the overall throughput of integer addition/subtraction on a P-core is around 45% lower than on an E-core. This is quite a big difference for something that is designed as a "performance" core.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone know why there is such a big discrepancy in performance? Are there significant differences in pipeline architecture between P-cores and E-cores for integer operations, or is this due to E-cores having 50% more Level 1 instruction cache?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Sun, 11 Aug 2024 20:05:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1622502#M76103</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-08-11T20:05:21Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1622982#M76186</link>
      <description>&lt;P&gt;Linux intel_pstate driver reports the following base clock frequencies: P-cores at 800 MHz and E-cores at 600 MHz. I assume the hardware is configured for "minimum assured power" aka "cTDP-Down", as the frequencies are lower than the spec for base TDP.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't believe this makes a difference when comparing performance of P-cores and E-cores running at the same clock frequency, as described previously.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm getting the following performance metrics for 64-bit integer arithmetic operations:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="i5-1335U_int64_tput.png" style="width: 675px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/57878i5A45E53727B26D27/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="i5-1335U_int64_tput.png" alt="i5-1335U_int64_tput.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see, there is something wrong with P-core instruction level parallelism for&amp;nbsp;addition, subtraction and multiplication:&lt;/P&gt;&lt;P&gt;1. Single-thread throughput is nearly half the throughput of E-cores.&lt;/P&gt;&lt;P&gt;2. Two-thread throughput on the same P-core (thread 0 affinity only for CPU 0, thread 1 affinity only for CPU 1, i.e. the same P-core with two hardware threads) is still lower than the throughput of a single thread on an E-core.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone at Intel please suggest why P-cores running at the same clock frequency as E-cores exhibit lower throughput for integer operations?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Aug 2024 18:29:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1622982#M76186</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-08-13T18:29:59Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1623017#M76194</link>
      <description>&lt;P&gt;When I compare single-thread 64-bit integer performance between Intel i5-1335U P-core and ARM Cortex-A72, both running at the same 600 MHz clock frequency, I get the following metrics in mega operations per second:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P data-unlink="true"&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Intel i5-1335U | ARM Cortex-A72&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;P-core@600MHz&amp;nbsp;&amp;nbsp;| @600MHz&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;--------------------+---------------&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Add&amp;nbsp; &amp;nbsp; &amp;nbsp; 1237.21&amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 1062.77&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Sub&amp;nbsp; &amp;nbsp; &amp;nbsp; 1244.23&amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 1058.20&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Mul&amp;nbsp; &amp;nbsp; &amp;nbsp; 597.57&amp;nbsp; &amp;nbsp; &amp;nbsp;|&amp;nbsp; &amp;nbsp; 190.39&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Div&amp;nbsp; &amp;nbsp; &amp;nbsp; 59.76&amp;nbsp; &amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 149.14&lt;/FONT&gt;&lt;/P&gt;&lt;P data-unlink="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;P data-unlink="true"&gt;Both tests used the same Debian-12.2 OS and GCC-12.2.0 compiler.&lt;/P&gt;&lt;P data-unlink="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;P data-unlink="true"&gt;I still don't fully understand why Intel 13th gen P-cores are so underwhelming. However, if my test methodology is correct, then the Raspberry Pi 4's ARM Cortex-A72 CPU from 2016 seems to be nearly on a par (aside from multiplication) with the latest 2023 Intel mobile CPU when both are forced to run at the same clock frequency. The roughly 2.5x lower throughput for division operations looks particularly bad for Intel.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Aug 2024 21:06:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1623017#M76194</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-08-13T21:06:58Z</dc:date>
    </item>
    <item>
      <title>Re:i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1625016#M76622</link>
      <description>&lt;P&gt;Hello &lt;A href="https://community.intel.com/t5/user/viewprofilepage/user-id/376097" rel="noopener noreferrer" target="_blank"&gt;&lt;STRONG&gt;SadClouds&lt;/STRONG&gt;&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for posting in Intel Communities.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Upon reading this information, it is best to coordinate this with our team for further investigation. I will post an update once it's available.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;JeanetteC.&lt;/P&gt;&lt;P&gt;Intel® Customer Support Technician&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 22 Aug 2024 07:25:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1625016#M76622</guid>
      <dc:creator>JeanetteC_Intel</dc:creator>
      <dc:date>2024-08-22T07:25:06Z</dc:date>
    </item>
    <item>
      <title>Re:i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632206#M77770</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;A href="https://community.intel.com/t5/user/viewprofilepage/user-id/376097" rel="noopener noreferrer" target="_blank"&gt;&lt;STRONG&gt;SadClouds&lt;/STRONG&gt;&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thank you for reaching out to us with your concerns. Upon reviewing the details of your situation, we have determined that the processor is being operated outside of its standard operating frequencies.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Our processors are designed to function optimally within a set of defined specifications, and operating them beyond these limits may lead to unpredictable behavior.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;For more information on the Base Power and Frequency Specifications Options for the i5-1335U processor, see the datasheet, page 98:&amp;nbsp;&lt;A href="https://www.intel.com/content/www/us/en/content-details/743844/13th-generation-intel-core-and-intel-core-14th-generation-processors-datasheet-volume-1-of-2.html" rel="noopener noreferrer" target="_blank"&gt;https://www.intel.com/content/www/us/en/content-details/743844/13th-generation-intel-core-and-intel-core-14th-generation-processors-datasheet-volume-1-of-2.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;P-cores and E-cores are built differently, and their purpose and performance differ, even though they can perform the same tasks.&lt;/P&gt;&lt;P&gt;I've also found some materials which can help in understanding the difference between them.&lt;/P&gt;&lt;P&gt;Efficient-core - Architecture Day 2021 | Intel Technology&lt;/P&gt;&lt;P&gt;&lt;A href="https://youtu.be/agUwkj1qTCs?si=k9TWJJdXLdOFNfCU" rel="noopener noreferrer" target="_blank"&gt;https://youtu.be/agUwkj1qTCs?si=k9TWJJdXLdOFNfCU&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)" rel="noopener noreferrer" 
target="_blank"&gt;https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Meet Performance-Core - Architecture Day 2021 | Intel Technology&lt;/P&gt;&lt;P&gt;&lt;A href="https://youtu.be/FNrOfDuP3rg?si=RZulHBUly2aY9l3b" rel="noopener noreferrer" target="_blank"&gt;https://youtu.be/FNrOfDuP3rg?si=RZulHBUly2aY9l3b&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://en.wikipedia.org/wiki/Golden_Cove#Raptor_Cove" rel="noopener noreferrer" target="_blank"&gt;https://en.wikipedia.org/wiki/Golden_Cove#Raptor_Cove&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Compare the block diagrams on the wiki pages to see how they differ.&lt;/P&gt;&lt;P&gt;My advice for this test would be to change the frequency to 1.2 or 1.3 GHz. That way both cores will be working within spec or slightly below it (P-core), but without that much of a performance impact (if we look at the datasheet).&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;JeanetteC.&lt;/P&gt;&lt;P&gt;Intel® Customer Support Technician&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 19 Sep 2024 10:33:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632206#M77770</guid>
      <dc:creator>JeanetteC_Intel</dc:creator>
      <dc:date>2024-09-19T10:33:06Z</dc:date>
    </item>
    <item>
      <title>Re: Re:i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632238#M77771</link>
      <description>&lt;P&gt;Hello, thank you for the updates. Prior to asking this question, I had already looked at the exact same processor datasheet and also performed various tests with different core frequencies. I don't think that the suggested root cause of&amp;nbsp;&lt;EM&gt;"&lt;/EM&gt;&lt;SPAN&gt;&lt;EM&gt;the processor is being operated outside of its standard operating frequencies"&lt;/EM&gt; is quite correct. The relative throughput of integer operations for P-cores vs. E-cores does not seem to change with higher core frequencies. However, as you suggested, I repeated the tests at a fixed 1.3 GHz frequency.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Set CPU 0 (P-core) and CPU 11 (E-core) to 1.3 GHz:&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;# for i in 0 11&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;do&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;echo "1300000" &amp;gt; /sys/devices/system/cpu/cpu${i}/cpufreq/scaling_max_freq&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;echo "1300000" &amp;gt; /sys/devices/system/cpu/cpu${i}/cpufreq/scaling_min_freq&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;done&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;View core operating frequencies:&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;# lscpu -e&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE&amp;nbsp; &amp;nbsp; MAXMHZ&amp;nbsp; &amp;nbsp;MINMHZ&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;MHZ&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 0&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 0 0:0:0:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 4600.0000 400.0000 1300.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 1&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; 
&amp;nbsp; 0 0:0:0:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 4600.0000 400.0000&amp;nbsp; 616.1430&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 2&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 1 4:4:1:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 4600.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 3&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 1 4:4:1:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 4600.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 4&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 2 8:8:2:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 5&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 3 9:9:2:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 6&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 4 10:10:2:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 7&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 5 11:11:2:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 8&amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 6 12:12:3:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 9&amp;nbsp; 
&amp;nbsp; 0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 7 13:13:3:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 10&amp;nbsp; &amp;nbsp;0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 8 14:14:3:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000&amp;nbsp; 600.0000&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; 11&amp;nbsp; &amp;nbsp;0&amp;nbsp; &amp;nbsp; &amp;nbsp; 0&amp;nbsp; &amp;nbsp; 9 15:15:3:0&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; yes 3400.0000 400.0000 1300.0000&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Obtain test results for 64-bit integer operations for each core:&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Intel i5-1335U | Intel i5-1335U&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;P-core@1300MHz | E-core@1300MHz&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;--------------------+---------------&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Add&amp;nbsp; &amp;nbsp; &amp;nbsp; 2666.42&amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 4876.67&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Sub&amp;nbsp; &amp;nbsp; &amp;nbsp; 2708.99&amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 4882.55&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Mul&amp;nbsp; &amp;nbsp; &amp;nbsp; 1295.45&amp;nbsp; &amp;nbsp; |&amp;nbsp; &amp;nbsp; 2591.26&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;Div&amp;nbsp; &amp;nbsp; &amp;nbsp; 129.58&amp;nbsp; &amp;nbsp; &amp;nbsp;|&amp;nbsp; &amp;nbsp; 215.96&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see, at the same core frequency of 1.3 GHz, single thread integer operations throughput is significantly lower for P-core than it is for E-core. 
I think there may be a design issue with the P-core integer operations pipeline.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Sep 2024 13:52:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632238#M77771</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-09-19T13:52:06Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632376#M77783</link>
      <description>&lt;P&gt;If we look at the above test results for P-core and E-core running at 1.3 GHz, then we can estimate the IPC (Instructions Per Cycle) throughput that this particular Intel product delivers:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;P-core:&lt;/P&gt;&lt;PRE&gt;Average MegaOps/sec for Addition and Subtraction operations = (2666+2708)/2 = 2687 MegaOps/sec&lt;BR /&gt;Average IPC = (2687x10^6 Ops/sec)/(1300x10^6 Cycles/sec) = 2.07 Ops/Cycle&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The benchmark algorithm (originally written in C) is as follows:&lt;/P&gt;&lt;PRE&gt;_Pragma ("GCC unroll 1") /* Disable loop unrolling */&lt;BR /&gt;begin loop&lt;BR /&gt;  Load MemVal-&amp;gt;RegVal;&lt;BR /&gt;  ArithOp Reg1,RegVal;&lt;BR /&gt;  ArithOp Reg2,RegVal;&lt;BR /&gt;  ArithOp Reg3,RegVal;&lt;BR /&gt;  ...&lt;BR /&gt;  ArithOp Reg16,RegVal;&lt;BR /&gt;end loop;&lt;/PRE&gt;&lt;DIV class=""&gt;Here, there is a single load from memory into a temporary register, and then the same arithmetic operation is executed 16 times on the value in that register. There is no dependency between those 16 operations and the hardware could potentially execute them all in parallel.&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, the best that this particular P-core can manage is around 2 arithmetic instructions per cycle. This does not look good, considering the Raptor Cove microarchitecture appears to have 5 separate integer ALUs. I think the problem may be that some of those P-core ALUs share BR (branch?) instructions in the same pipelines, whereas the E-core has dedicated BR pipelines. Unfortunately many "industry standard" benchmarks are nearly useless when it comes to evaluating specific CPU pipelines. It would be nice if the experts at Intel could provide more technical info on why in this case the P-core IPC appears to be quite low compared to the E-core. How do people at Intel evaluate similar use cases? What benchmarks do they use? 
Are there benchmarks available to download from Intel that can test IPC throughput? Is there open documentation on the test setup and methodology used?&lt;/P&gt;</description>
      <pubDate>Fri, 20 Sep 2024 07:27:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632376#M77783</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-09-20T07:27:57Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632427#M77792</link>
      <description>&lt;P&gt;Hi SadClouds,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sorry to bother. Is this test's source code&amp;nbsp;available somewhere?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;best,&lt;/P&gt;</description>
      <pubDate>Fri, 20 Sep 2024 14:10:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632427#M77792</guid>
      <dc:creator>Dan0987</dc:creator>
      <dc:date>2024-09-20T14:10:28Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632442#M77796</link>
      <description>&lt;P&gt;Not at the moment. It is part of a larger set of closed source benchmarks used internally for evaluating system performance and scalability.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Sep 2024 14:54:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632442#M77796</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-09-20T14:54:51Z</dc:date>
    </item>
    <item>
      <title>Re: i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632639#M77838</link>
      <description>&lt;P&gt;I'm going to try and wrap this up, as I need to move on to other tasks. One of my tasks was to document a procedure for evaluating hybrid CPU architectures like Arm big.LITTLE, Intel P-cores + E-cores, etc. I used Intel i5-1335U as it was the most recent design, readily available and with good driver support in Linux.&lt;/P&gt;&lt;P&gt;An anomaly came up during testing, where single-thread integer arithmetic throughput was on average: P-core 2.0 and E-core 3.7 instructions per cycle. Both types of cores have at least 4 separate integer units, hence IPC of around 3.5-4.0 was expected for both (taking into account the overhead for taking a branch and incrementing the loop counter).&lt;/P&gt;&lt;P&gt;The same benchmark was used on various Arm CPUs for which data sheets with the exact instruction IPC figures are publicly available. The benchmark results come very close to the official IPC figures for all operation types: load, store, integer and floating point. Unfortunately at the moment, the benchmark used is not available for wider distribution, hence I cannot share it with people.&lt;/P&gt;&lt;P&gt;My only conclusion is that the anomaly could be due to one of the following:&lt;/P&gt;&lt;P&gt;1. The IPC is deliberately throttled for a single P-core hardware thread in order to reserve the bandwidth/power for a second hardware thread. With 2 hardware threads, the combined IPC improves significantly.&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;2. There is an issue with instruction scheduling for a single P-core hardware thread. So maybe a microcode update could fix it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Whatever the reasons, the issue described in this thread has nothing to do with the core's operating frequency. I am also pretty confident it is not related to the benchmark used.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Sep 2024 06:58:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1632639#M77838</guid>
      <dc:creator>SadClouds</dc:creator>
      <dc:date>2024-09-22T06:58:51Z</dc:date>
    </item>
    <item>
      <title>Re:i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1635627#M78137</link>
      <description>&lt;P&gt;Hello &lt;A href="https://community.intel.com/t5/user/viewprofilepage/user-id/376097" rel="noopener noreferrer" target="_blank"&gt;&lt;STRONG&gt;SadClouds&lt;/STRONG&gt;&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Greetings from Intel Customer Support. I apologize for the delayed response.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Upon reading the information you shared, it is best to coordinate this with our team for further investigation. I will post an update once it's available.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;JeanetteC.&lt;/P&gt;&lt;P&gt;Intel® Customer Support Technician&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 07 Oct 2024 01:11:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1635627#M78137</guid>
      <dc:creator>JeanetteC_Intel</dc:creator>
      <dc:date>2024-10-07T01:11:20Z</dc:date>
    </item>
    <item>
      <title>Re:i5-1335U P-core and E-core integer operations throughput</title>
      <link>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1638394#M78393</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;A href="https://community.intel.com/t5/user/viewprofilepage/user-id/376097" rel="noopener noreferrer" target="_blank"&gt;&lt;STRONG&gt;SadClouds&lt;/STRONG&gt;&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Good day.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Kindly check your email inbox, junk, or spam folders, for this issue that you raised.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;I hope to get your email reply as well.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;JeanetteC.&lt;/P&gt;&lt;P&gt;Intel® Customer Support Technician&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 21 Oct 2024 11:23:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Mobile-and-Desktop-Processors/i5-1335U-P-core-and-E-core-integer-operations-throughput/m-p/1638394#M78393</guid>
      <dc:creator>JeanetteC_Intel</dc:creator>
      <dc:date>2024-10-21T11:23:19Z</dc:date>
    </item>
  </channel>
</rss>

