<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: vsSin(..) much slower than sinf(..)?? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928728#M13558</link>
    <description>I suppose the compiler may be able to replace your first loop by&lt;BR /&gt;value=sinf(a[9999]);&lt;BR /&gt;or may do nothing there, since you don't use value.&lt;BR /&gt;&lt;BR /&gt;You could check (e.g. by saving .asm) to see whether that loop produces an svml library call or a single evaluation, if even that.</description>
    <pubDate>Tue, 09 Nov 2004 09:13:46 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2004-11-09T09:13:46Z</dc:date>
    <item>
      <title>vsSin(..) much slower than sinf(..)??</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928727#M13557</link>
      <description>&lt;DIV&gt;Hi! &lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I have a little problem. I timed the two functions vsSin and sinf because I wanted to know which one is faster.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Here is my code:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV style="BORDER-RIGHT: black 1px solid; PADDING-RIGHT: 10px; BORDER-TOP: black 1px solid; PADDING-LEFT: 10px; PADDING-BOTTOM: 10px; BORDER-LEFT: black 1px solid; PADDING-TOP: 10px; BORDER-BOTTOM: black 1px solid"&gt;&lt;SPAN class="text_smallest"&gt;Code:&lt;/SPAN&gt;&lt;PRE&gt;	float value;
	__int64 time1,time2,time3,time4;

	float a[10000];
	float b[10000];
	int n=10000;
	int mode;

  mode=VML_LA|VML_FLOAT_CONSISTENT|VML_ERRMODE_IGNORE;
  vmlSetMode(mode);

  for (int j=0;j&amp;lt;10000;j++)
     a[j] = (float)(rand()%8);


  QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time1);
    for (int i=0;i&amp;lt;10000;i++)
      value=sinf(a[i]);
  QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time2);

  QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time3);
     vsSin(n,a,b);
  QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time4);

  printf("time: %d\n",time2-time1);
  printf("time: %d\n",time4-time3);
&lt;/PRE&gt;&lt;/DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;And now the result:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;sinf(..) took 1608 ticks (or whatever QueryPerformanceCounter returns ;) )&lt;/DIV&gt;
&lt;DIV&gt;vsSin(..) took 192344 ticks at best&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Why is vsSin so slow?&lt;/DIV&gt;
&lt;DIV&gt;Did I do something wrong?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Thanks for any answers.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;GoreProducers&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 09 Nov 2004 03:36:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928727#M13557</guid>
      <dc:creator>goreproducers</dc:creator>
      <dc:date>2004-11-09T03:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: vsSin(..) much slower than sinf(..)??</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928728#M13558</link>
      <description>I suppose the compiler may be able to replace your first loop by&lt;BR /&gt;value=sinf(a[9999]);&lt;BR /&gt;or may do nothing there, since you don't use value.&lt;BR /&gt;&lt;BR /&gt;You could check (e.g. by saving .asm) to see whether that loop produces an svml library call or a single evaluation, if even that.</description>
      <pubDate>Tue, 09 Nov 2004 09:13:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928728#M13558</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2004-11-09T09:13:46Z</dc:date>
    </item>
    <item>
      <title>Re: vsSin(..) much slower than sinf(..)??</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928729#M13559</link>
      <description>&lt;DIV&gt;Hi!&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;The compiler does in fact eliminate the sinf loop as "dead code", because the sinf results are never used.&lt;BR /&gt;Look at the generated asm:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;=============================================================&lt;BR /&gt; call DWORD PTR __imp__QueryPerformanceCounter@4&lt;BR /&gt; &lt;BR /&gt;.B1.6:&lt;BR /&gt; lea eax, DWORD PTR [esp+16]&lt;BR /&gt; push eax&lt;BR /&gt; call DWORD PTR __imp__QueryPerformanceCounter@4&lt;BR /&gt; &lt;BR /&gt;.B1.7:&lt;BR /&gt; lea eax, DWORD PTR [esp+24]&lt;BR /&gt; push eax&lt;BR /&gt; call DWORD PTR __imp__QueryPerformanceCounter@4&lt;BR /&gt; &lt;BR /&gt;.B1.8:&lt;BR /&gt; lea edx, DWORD PTR [esp+40]&lt;BR /&gt; lea eax, DWORD PTR [esp+40040]&lt;BR /&gt; push eax&lt;BR /&gt; push edx&lt;BR /&gt; push 10000&lt;BR /&gt; call _vsSin&lt;BR /&gt; &lt;BR /&gt;.B1.17:&lt;BR /&gt; add esp, 12&lt;BR /&gt; &lt;BR /&gt;.B1.9:&lt;BR /&gt; lea eax, DWORD PTR [esp+32]&lt;BR /&gt; push eax&lt;BR /&gt; call DWORD PTR __imp__QueryPerformanceCounter@4&lt;BR /&gt;=============================================================&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;As you can see, there is no sinf loop between the first two QueryPerformanceCounter calls.&lt;BR /&gt;To avoid this in the future, use one of the following two methods, or both:&lt;BR /&gt;1) compile your timing routine with optimization disabled (the /Od compiler switch)&lt;BR /&gt;2) make the program actually use the results of the timed function, for example by storing the sinf values in an array and printing them:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;======================================================&lt;BR /&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time1);&lt;BR /&gt; for (int i=0;i&amp;lt;n;i++) b[i]=sinf(a[i]);&lt;BR /&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time2);&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time3);&lt;BR /&gt; vsSin(n,a,b);&lt;BR /&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time4);&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt; for(i=0; i &amp;lt; n; i++)&lt;BR /&gt; {&lt;BR /&gt; printf("%f ", b[i]);&lt;BR /&gt; }&lt;BR /&gt;======================================================&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
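&lt;DIV&gt;A cheaper way to use the results than printing all 10000 values is to fold them into a checksum and store it through a volatile variable. The following is only a minimal sketch in portable C++, not part of the original timing code: the names sum_of_sines and g_sink are made up for illustration, and std::sin on float stands in for sinf.&lt;/DIV&gt;
&lt;PRE&gt;<![CDATA[
```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sink (illustration only): a volatile store cannot be dropped
// by the optimizer, so the sum written to it has to be produced.
volatile float g_sink = 0.0f;

// Hypothetical helper: evaluates the sine of every input and returns the sum,
// so that each result contributes to an observable value.
float sum_of_sines(const std::vector<float>& a)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::sin(a[i]);   // float overload of std::sin
    g_sink = sum;                // observable side effect
    return sum;
}
```
]]>&lt;/PRE&gt;
&lt;DIV&gt;Calling sum_of_sines(a) between the two QueryPerformanceCounter calls should keep the loop in the generated code even with optimization enabled. Checking the .asm output remains the reliable way to confirm it.&lt;/DIV&gt;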
&lt;DIV&gt;By the way, your timing results are roughly in line with actual VML performance (see the VML notes).&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;One more hint for accurate timing: repeat the timing procedure several times (10-20) and take the best result.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;======================================================&lt;BR /&gt; besttime = INT_MAX;&lt;BR /&gt; curtime = 0;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt; for(int repeat = 0; repeat &amp;lt; 15; repeat++)&lt;BR /&gt; {&lt;BR /&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time3);&lt;BR /&gt; vsSin(n,a,b);&lt;BR /&gt; QueryPerformanceCounter((LARGE_INTEGER*)&amp;amp;time4);&lt;BR /&gt; curtime = time4 - time3;&lt;BR /&gt; if(curtime &amp;lt; besttime)&lt;BR /&gt; besttime = curtime; &lt;BR /&gt; }&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt; printf("time: %d\n",besttime);&lt;BR /&gt;======================================================&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;This hint helps you avoid two issues: the "cold cache" effect and the operating system's impact on the measurement.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
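&lt;DIV&gt;The same best-of-several-runs idea can be sketched in portable C++. This is an illustration under assumptions, not the code used above: std::chrono::steady_clock replaces QueryPerformanceCounter, the helper name best_time_ns is hypothetical, and the starting value is a 64-bit maximum so that it matches the width of the measured times.&lt;/DIV&gt;
&lt;PRE&gt;<![CDATA[
```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper (illustration only): runs `work` `repeats` times and
// returns the shortest elapsed time in nanoseconds.
template <typename Work>
std::int64_t best_time_ns(Work work, int repeats)
{
    assert(repeats > 0);
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const std::int64_t elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (elapsed < best)
            best = elapsed;   // keep the fastest run
    }
    return best;
}
```
]]>&lt;/PRE&gt;
&lt;DIV&gt;For the case above one would pass a lambda that calls vsSin(n,a,b) together with a repeat count of 15. The first, cold-cache run is then just one of the candidates and normally loses to a later run.&lt;/DIV&gt;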
&lt;DIV&gt;Best regards and good luck!&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Andrey K.&lt;BR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 10 Nov 2004 16:59:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/vsSin-much-slower-than-sinf/m-p/928729#M13559</guid>
      <dc:creator>Andrey_K_Intel</dc:creator>
      <dc:date>2004-11-10T16:59:51Z</dc:date>
    </item>
  </channel>
</rss>

