<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Thanks John, in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117169#M6135</link>
    <description>&lt;P&gt;Thanks John,&lt;/P&gt;

&lt;P&gt;I disabled HyperThreading and it works with iA=64 and iB=2048.&lt;/P&gt;

&lt;P&gt;Can it also work without disabling HyperThreading? When I set the CPU indices to 0, 2, 4 and 6, I saw "Affinity error" on screen for indices 4 and 6. Can I run this code with HyperThreading enabled?&lt;/P&gt;</description>
    <pubDate>Mon, 15 Feb 2016 15:51:23 GMT</pubDate>
    <dc:creator>atilla_k_</dc:creator>
    <dc:date>2016-02-15T15:51:23Z</dc:date>
    <item>
      <title>Multicore SMP system task time measurements error</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117167#M6133</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I have an i7-4700EQ processor and want to run tasks on 4 cores in parallel. I compiled and ran the code below. With only 1 core, the measured time was 7198.200000 us. With 4 cores, I saw about 18290 us on each core. How is this possible? I expected about 7198 us per core, because the tasks are independent and use separate memory buffers.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;build specs:&lt;/STRONG&gt; CC_ARCH_SPEC = -march=core2 -nostdlib -fno-builtin -fno-defer-pop -m64 -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mavx2 -fno-implicit-fp&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;code:&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;void MultiCoresExample(int iA, int iB, int affin);&lt;BR /&gt;
	void TempMultiCoreCopy(int iA, int iB, int affin);&lt;/P&gt;

&lt;P&gt;double dtime1[4], dtime2[4];&lt;BR /&gt;
	typedef struct&lt;BR /&gt;
	{&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; float *vInput[4];&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; float *vOutput[4];&lt;BR /&gt;
	}tempStruct;&lt;BR /&gt;
	tempStruct tmpStr;&lt;/P&gt;

&lt;P&gt;void MultiCoresExample(int iA, int iB, int affin)&lt;BR /&gt;
	{&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; TASK_ID&amp;nbsp; tids[4];&amp;nbsp; /* some task IDs */&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; char taskName[32];&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; int cpuIx[] = {0,1,2,3};&amp;nbsp; /* core ID's*/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; int i, j;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; cpuset_t affinity;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; float *f0, *f1, *f2, *f3, *f4, *f5, *f6, *f7;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; float *fIn[4];&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; float *fOut[4];&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f0 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f1 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f2 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f3 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f4 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f5 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f6 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; f7 = memalign(128, iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;tmpStr.vInput[0]&amp;nbsp; = f0;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vInput[1]&amp;nbsp; = f1;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vInput[2]&amp;nbsp; = f2;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vInput[3]&amp;nbsp; = f3;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vOutput[0] = f4;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vOutput[1] = f5;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vOutput[2] = f6;&lt;BR /&gt;
	&amp;nbsp;tmpStr.vOutput[3] = f7;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; /******* init ***************/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; for(i=0; i&amp;lt;affin; i++)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for(j=0; j&amp;lt;iA*iB; j++)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;f0[j] = (i+1)*j/100.;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;f1[j] = (i+1)*j/70.;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;f2[j] = (i+1)*j/40.;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;f3[j] = (i+1)*j/20.;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; /****************************/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Cores are setting...\n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; for(i=0; i&amp;lt;affin; i++)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CPUSET_ZERO (affinity);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CPUSET_SET(affinity, cpuIx[i]);&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; sprintf(taskName, "t%s%d", "task", i);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;tids[i] = taskCreate(taskName, 120, TASK_OPTIONS, 65536, (FUNCPTR)TempMultiCoreCopy, iA, iB, affin, 0,0,0,0, 0, 0, 0);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Task created:0x%08x\n", tids[i]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (tids[i] == NULL)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; /*return (ERROR);*/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Task create error:0x%08x\n", tids[i]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if(affin != -1)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* Clear the affinity CPU set and set index for CPU */&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (taskCpuAffinitySet(tids[i], affinity) == ERROR)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* Either CPUs are not enabled or we are in UP mode */&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Affinity error \n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; taskDelete(tids[i]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; /*return (ERROR);*/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; taskDelay(sysClkRateGet()/10);&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; taskCpuAffinityGet(tids[i], &amp;amp;affinity);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Task Affinity:%d\n", affinity);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;for(i=0; i&amp;lt;affin; i++)&lt;BR /&gt;
	&amp;nbsp;{&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;taskActivate(tids[i]);&lt;BR /&gt;
	&amp;nbsp;}&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; taskDelay(sysClkRateGet()* 4); /* for finish all cores.*/&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; for(i=0; i&amp;lt;affin; i++)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("\nStartTime[%d]=%f&amp;nbsp; FinishTime[%d]=%f&amp;nbsp; ExecutionTimeForCore[%d]=%f us\n", i, dtime1[i], i, dtime2[i], i, (dtime2[i]-dtime1[i]));&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; for(i=0; i&amp;lt;affin; i++)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; taskDelete(tids[i]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	}&lt;/P&gt;

&lt;P&gt;void TempMultiCoreCopy(int iA, int iB, int affin)&lt;BR /&gt;
	{&lt;BR /&gt;
	&amp;nbsp;int kk;&lt;BR /&gt;
	&amp;nbsp;int iCpuId = vxCpuIdGet();&lt;BR /&gt;
	&amp;nbsp;dtime1[iCpuId] = getTimeDouble(2);&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;for(kk=0; kk&amp;lt;1000; kk++)&lt;BR /&gt;
	&amp;nbsp;{&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;memcpy(tmpStr.vOutput[iCpuId], tmpStr.vInput[iCpuId], iA*iB*4);&lt;BR /&gt;
	&amp;nbsp;}&lt;BR /&gt;
	&amp;nbsp;dtime2[iCpuId] = getTimeDouble(2);&lt;BR /&gt;
	}&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;screen output:&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;sp MultiCoresExample,16,2048,1&lt;BR /&gt;
	Task spawned: id = 0xffff80000efd1510, name = t1&lt;BR /&gt;
	value = -140737236888304 = 0xffff80000efd1510&lt;BR /&gt;
	A-&amp;gt;Cores are setting...&lt;BR /&gt;
	Task created:0x0efe2020&lt;BR /&gt;
	Task Affinity:1&lt;BR /&gt;
	StartTime[0]=218186962.911667&amp;nbsp; FinishTime[0]=218194161.111667&amp;nbsp; ExecutionTimeForCore[0]=7198.200000 us&lt;/P&gt;

&lt;P&gt;sp MultiCoresExample,16,2048,2&lt;BR /&gt;
	Task spawned: id = 0xffff80000efd1510, name = t2&lt;BR /&gt;
	value = -140737236888304 = 0xffff80000efd1510&lt;BR /&gt;
	A-&amp;gt;Cores are setting...&lt;BR /&gt;
	Task created:0x0efe2020&lt;BR /&gt;
	Task Affinity:1&lt;BR /&gt;
	Task created:0x0f1ea810&lt;BR /&gt;
	Task Affinity:2&lt;BR /&gt;
	StartTime[0]=264755500.995000&amp;nbsp; FinishTime[0]=264773712.746667&amp;nbsp; ExecutionTimeForCore[0]=18211.751667 us&lt;BR /&gt;
	StartTime[1]=264755514.550000&amp;nbsp; FinishTime[1]=264773614.643333&amp;nbsp; ExecutionTimeForCore[1]=18100.093333 us&lt;/P&gt;

&lt;P&gt;sp MultiCoresExample,16,2048,3&lt;BR /&gt;
	Task spawned: id = 0xffff80000efd1510, name = t3&lt;BR /&gt;
	value = -140737236888304 = 0xffff80000efd1510&lt;BR /&gt;
	A-&amp;gt;Cores are setting...&lt;BR /&gt;
	Task created:0x0efe2020&lt;BR /&gt;
	Task Affinity:1&lt;BR /&gt;
	Task created:0x0f1ea810&lt;BR /&gt;
	Task Affinity:2&lt;BR /&gt;
	Task created:0x0efe2510&lt;BR /&gt;
	Task Affinity:4&lt;BR /&gt;
	StartTime[0]=288507258.976667&amp;nbsp; FinishTime[0]=288525447.206667&amp;nbsp; ExecutionTimeForCore[0]=18188.230000 us&lt;BR /&gt;
	StartTime[1]=288507271.261667&amp;nbsp; FinishTime[1]=288525387.871667&amp;nbsp; ExecutionTimeForCore[1]=18116.610000 us&lt;BR /&gt;
	StartTime[2]=288507259.561667&amp;nbsp; FinishTime[2]=288514408.870000&amp;nbsp; ExecutionTimeForCore[2]=7149.308333 us&lt;/P&gt;

&lt;P&gt;sp MultiCoresExample,16,2048,4&lt;BR /&gt;
	Task spawned: id = 0xffff80000efd1510, name = t4&lt;BR /&gt;
	value = -140737236888304 = 0xffff80000efd1510&lt;BR /&gt;
	A-&amp;gt;Cores are setting...&lt;BR /&gt;
	Task created:0x0efe2020&lt;BR /&gt;
	Task Affinity:1&lt;BR /&gt;
	Task created:0x0f1ea810&lt;BR /&gt;
	Task Affinity:2&lt;BR /&gt;
	Task created:0x0f413610&lt;BR /&gt;
	Task Affinity:4&lt;BR /&gt;
	Task created:0x0f413b00&lt;BR /&gt;
	Task Affinity:8&lt;BR /&gt;
	StartTime[0]=307985065.768333&amp;nbsp; FinishTime[0]=308003355.990000&amp;nbsp; ExecutionTimeForCore[0]=18290.221667 us&lt;BR /&gt;
	StartTime[1]=307985078.606667&amp;nbsp; FinishTime[1]=308003284.243333&amp;nbsp; ExecutionTimeForCore[1]=18205.636667 us&lt;BR /&gt;
	StartTime[2]=307985064.923333&amp;nbsp; FinishTime[2]=308003229.746667&amp;nbsp; ExecutionTimeForCore[2]=18164.823333 us&lt;BR /&gt;
	StartTime[3]=307985066.711667&amp;nbsp; FinishTime[3]=308003220.956667&amp;nbsp; ExecutionTimeForCore[3]=18154.245000 us&lt;/P&gt;
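
&lt;P&gt;To compare these runs, the elapsed times can be converted to copy rates. The following is a minimal sketch: the function name copy_rate_mb_per_s is hypothetical, and it counts only the bytes copied (1000 memcpy calls of iA*iB floats, as in TempMultiCoreCopy), not the combined read and write traffic.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>

/* Copy rate in MB/s (10^6 bytes per second) for `reps` memcpy calls of
 * iA*iB floats that took `elapsed_us` microseconds in total.
 * Bytes per microsecond equals MB/s, so no unit conversion is needed. */
double copy_rate_mb_per_s(int iA, int iB, int reps, double elapsed_us)
{
    assert(elapsed_us > 0.0);
    return ((double)iA * iB * sizeof(float) * reps) / elapsed_us;
}
```
]]>

&lt;P&gt;For example, copy_rate_mb_per_s(16, 2048, 1000, 7198.2) gives roughly 18,200 MB/s for the single task, while the 4-task time of 18290.221667 us gives roughly 7,170 MB/s per task.&lt;/P&gt;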

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 12:12:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117167#M6133</guid>
      <dc:creator>atilla_k_</dc:creator>
      <dc:date>2016-02-04T12:12:36Z</dc:date>
    </item>
    <item>
      <title>What are the values of iA and</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117168#M6134</link>
      <description>&lt;P&gt;What are the values of iA and iB?&amp;nbsp;&amp;nbsp;&amp;nbsp; It is not possible to figure out which parts of the memory hierarchy are being used if the sizes of the arrays are not known.&lt;/P&gt;

&lt;P&gt;Your processor supports HyperThreading.&amp;nbsp; If HyperThreading is enabled, the system might map logical processors 0,1,2,3 to different physical cores, or it might map logical processors 0,2,4,6 to different physical cores.&lt;/P&gt;
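
&lt;P&gt;The two numbering possibilities described above can be sketched as follows. This is only an illustration, assuming two hardware threads per physical core. The enum and function names are hypothetical, and which scheme a given system uses has to be determined on that system.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>

/* Two ways a system may number logical CPUs when each physical core
 * has two hardware threads. Which one applies is system-specific. */
enum numbering { SIBLINGS_ADJACENT, SIBLINGS_SPLIT };

/* Physical core that hosts a logical CPU.
 * SIBLINGS_ADJACENT: logical 0,1 -> core 0; 2,3 -> core 1; ...
 * SIBLINGS_SPLIT:    logical 0..n-1 -> cores 0..n-1, then
 *                    logical n..2n-1 -> cores 0..n-1 again. */
int physical_core(enum numbering scheme, int logical_cpu, int n_cores)
{
    assert(n_cores > 0 && logical_cpu >= 0 && logical_cpu < 2 * n_cores);
    if (scheme == SIBLINGS_ADJACENT)
        return logical_cpu / 2;
    return logical_cpu % n_cores;
}

/* Returns 1 if two logical CPUs share one physical core, else 0. */
int share_core(enum numbering scheme, int a, int b, int n_cores)
{
    return physical_core(scheme, a, n_cores) == physical_core(scheme, b, n_cores);
}
```
]]>

&lt;P&gt;Under SIBLINGS_ADJACENT, logical processors 0 and 1 share a core while 0 and 2 do not. Under SIBLINGS_SPLIT, 0 and 1 are on different cores while 0 and 4 share one.&lt;/P&gt;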

&lt;P&gt;Your 3-thread result suggests that you do have HyperThreading enabled and that logical processors 0,1 are mapped to physical core 0, 2,3 are mapped to physical core 1, 4,5 are mapped to physical core 2, and 6,7 are mapped to physical core 3.&amp;nbsp;&amp;nbsp; So in the 3-thread case, threads 0 and 1 are sharing physical core 0 (and therefore running slowly), while thread 2 is running by itself on physical core 1 (and running at full speed).&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 16:05:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117168#M6134</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2016-02-04T16:05:57Z</dc:date>
    </item>
    <item>
      <title>Thanks John,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117169#M6135</link>
      <description>&lt;P&gt;Thanks John,&lt;/P&gt;

&lt;P&gt;I disabled HyperThreading and it works with iA=64 and iB=2048.&lt;/P&gt;

&lt;P&gt;Can it also work without disabling HyperThreading? When I set the CPU indices to 0, 2, 4 and 6, I saw "Affinity error" on screen for indices 4 and 6. Can I run this code with HyperThreading enabled?&lt;/P&gt;</description>
      <pubDate>Mon, 15 Feb 2016 15:51:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117169#M6135</guid>
      <dc:creator>atilla_k_</dc:creator>
      <dc:date>2016-02-15T15:51:23Z</dc:date>
    </item>
    <item>
      <title>Based on the output of your</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117170#M6136</link>
      <description>&lt;P&gt;Based on the output of your first run, it looks like you should try changing&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;int cpuIx[] = {0,1,2,3};  /* core ID's*/&lt;/PRE&gt;

&lt;P&gt;to&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;int cpuIx[] = {0,2,4,6};  /* core ID's*/&lt;/PRE&gt;

&lt;P&gt;When HyperThreading is enabled, this should place one thread on each physical core.&lt;/P&gt;

&lt;P&gt;If HyperThreading is disabled, then this won't work, since the available cores will be [0,1,2,3], so you will need the code to be able to compensate.&lt;/P&gt;
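
&lt;P&gt;One possible way to compensate is to build the index list from a threads-per-core value instead of hard-coding it. This is a minimal sketch in plain C: the function name fill_cpu_ix and its parameters are hypothetical, it assumes that sibling logical processors are numbered adjacently, and it takes the HyperThreading state and the number of enabled CPUs as inputs rather than detecting them.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>

/* Fills cpu_ix[0..n_tasks-1] with one logical CPU index per physical core.
 * threads_per_core is 2 with HyperThreading enabled and 1 with it disabled;
 * detecting that is outside this sketch, so it is passed in.
 * Assumes sibling logical CPUs are numbered adjacently (0 and 1 share a core).
 * Returns 0 on success, -1 if an index would exceed the enabled CPUs. */
int fill_cpu_ix(int *cpu_ix, int n_tasks, int threads_per_core, int enabled_cpus)
{
    int i;
    assert(cpu_ix != 0 && n_tasks > 0 && threads_per_core > 0);
    for (i = 0; i < n_tasks; i++) {
        int cpu = i * threads_per_core;
        if (cpu >= enabled_cpus)
            return -1;   /* this CPU index is not available */
        cpu_ix[i] = cpu;
    }
    return 0;
}
```
]]>

&lt;P&gt;With 4 tasks, threads_per_core=2 and 8 enabled CPUs, this produces {0,2,4,6}. With threads_per_core=1 and 4 enabled CPUs, it produces {0,1,2,3}. Note also that the posted TempMultiCoreCopy indexes its 4-element arrays with the value returned by vxCpuIdGet(), so if that value is the logical CPU index, indices above 3 would need larger arrays or indexing by task number.&lt;/P&gt;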

&lt;P&gt;It gets trickier if you need to programmatically determine whether or not HyperThreading is enabled and how the "logical processors" are mapped to the physical cores, and I don't know how to attempt to do this on VxWorks.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Feb 2016 21:16:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117170#M6136</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2016-02-15T21:16:06Z</dc:date>
    </item>
    <item>
      <title>Thanks John,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117171#M6137</link>
      <description>&lt;P&gt;Thanks John,&lt;/P&gt;

&lt;P&gt;I tried &lt;FONT face="Courier New"&gt;cpuIx[] = {0,2,4,6};&lt;/FONT&gt; with HyperThreading enabled, but it did not work correctly. When I set CPU IDs 4 and 6, my code printed "Affinity error". I changed these IDs and tried every ID, but it still did not work. As soon as I disabled HyperThreading, it worked correctly with &lt;FONT face="Courier New"&gt;cpuIx[] = {0,1,2,3};&lt;/FONT&gt;&lt;/P&gt;

&lt;P&gt;I do not understand why it did not work. I also tried mapping logical cores to physical cores.&lt;/P&gt;

&lt;P&gt;The other problem is the data size:&lt;/P&gt;

&lt;P&gt;When iA=16 and iB=2048, so that the data size is iA*iB*sizeof(float), the tasks run &lt;STRONG&gt;in parallel&lt;/STRONG&gt; on all cores (HyperThreading enabled).&lt;/P&gt;

&lt;P&gt;When iA=64 and iB=2048, so that the data size is iA*iB*sizeof(float), the tasks do &lt;STRONG&gt;not run in parallel&lt;/STRONG&gt; on all cores (HyperThreading enabled).&lt;/P&gt;

&lt;P&gt;Do you have any idea?&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 08:13:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117171#M6137</guid>
      <dc:creator>atilla_k_</dc:creator>
      <dc:date>2016-02-22T08:13:49Z</dc:date>
    </item>
    <item>
      <title>I don't have any experience</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117172#M6138</link>
      <description>&lt;P&gt;I don't have any experience with VxWorks, so I can't really speculate on what is going on with the affinity calls.&lt;/P&gt;

&lt;P&gt;I just noticed that your compilation options include several flags that are specific to generating code for running inside the kernel, but the rest of the code does not look like it is set up as a kernel module.&amp;nbsp; This could be the cause of some of the troubles?&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 14:29:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Multicore-SMP-system-task-time-measurements-error/m-p/1117172#M6138</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2016-02-22T14:29:38Z</dc:date>
    </item>
  </channel>
</rss>

