<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Sorry, in last compile I have in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048648#M6791</link>
    <description>&lt;P&gt;Sorry, in my last compile I changed !$OMP to !z$OMP while testing the program with OpenMP. Please remove the z from !z$OMP in the sample.&lt;/P&gt;</description>
    <pubDate>Mon, 12 Jan 2015 20:34:49 GMT</pubDate>
    <dc:creator>bohluly</dc:creator>
    <dc:date>2015-01-12T20:34:49Z</dc:date>
    <item>
      <title>speedup problem using openMP in intel fortran</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048647#M6790</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;

&lt;P&gt;I have developed a program that, unfortunately, has a speedup problem. The program is large, so I wrote a small sample that resembles it; this simple sample shows the same problem as my program.&lt;/P&gt;

&lt;P&gt;I would appreciate your experience and help, if possible.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;I am using VS2010 and Intel FORTRAN XE 2011&lt;/P&gt;

&lt;P&gt;Program:&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; TYPE var&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; REAL(8),POINTER :: A, B, C&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; END TYPE var&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; REAL(8),POINTER :: A(:), B(:), C(:)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TYPE(var),POINTER&amp;nbsp; :: vars(:) &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TYPE(var),POINTER :: varOMP&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; REAL*8&amp;nbsp; t1,t2 ,ai,bi,ci,di,ei,fi &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; INTEGER(4) c1,c2&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; INTEGER N, CHUNKSIZE, I, id, f , l&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARAMETER (N=200)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARAMETER (CHUNKSIZE=10)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Allocate (A(N), B(N), C(N),vars(N))&lt;/P&gt;

&lt;P&gt;!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; initializations&lt;BR /&gt;
	&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DO I = 1, N&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; A(I)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; =&amp;nbsp;&amp;nbsp; I * 1.0&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; B(I)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; =&amp;nbsp;&amp;nbsp; A(I)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vars(I)%A =&amp;gt;&amp;nbsp; A(I)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vars(I)%B =&amp;gt;&amp;nbsp; B(I)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vars(I)%C =&amp;gt;&amp;nbsp; C(I)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vars(I)%A = 0.51&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vars(I)%B = 0.45&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ENDDO&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CALL SYSTEM_CLOCK(c1)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Do Itration=1,1000000&lt;BR /&gt;
	!z$OMP PARALLEL PRIVATE(I,varOMP,ai,bi,ci,ei ,di,fi )&lt;BR /&gt;
	!z$OMP DO SCHEDULE(STATIC,CHUNKSIZE)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DO I = 1, N&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; varOMP =&amp;gt; vars(I)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ai = varOMP%A&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; bi = varOMP%B&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; di = ai*2.2 + bi * 2.0&amp;nbsp; + ai*2.1&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ei = ai*2.1 + bi * 2.3&amp;nbsp; + ai*2.15&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; fi = di * ( ai + bi )*2.1 + ei * ( di + bi )*2.1&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ci = bi*2.1 + ai*2.0 + di*2.0 + ei*2.0 + fi*2.0&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; varOMP%C = ci&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ENDDO&lt;BR /&gt;
	!z$OMP END DO &amp;nbsp;&lt;BR /&gt;
	!z$OMP END PARALLEL&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ENDDO&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CALL SYSTEM_CLOCK(c2)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WRITE(*,*) c2-c1&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; STOP&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; END&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jan 2015 20:03:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048647#M6790</guid>
      <dc:creator>bohluly</dc:creator>
      <dc:date>2015-01-12T20:03:59Z</dc:date>
    </item>
    <item>
      <title>Sorry, in last compile I have</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048648#M6791</link>
      <description>&lt;P&gt;Sorry, in my last compile I changed !$OMP to !z$OMP while testing the program with OpenMP. Please remove the z from !z$OMP in the sample.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jan 2015 20:34:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048648#M6791</guid>
      <dc:creator>bohluly</dc:creator>
      <dc:date>2015-01-12T20:34:49Z</dc:date>
    </item>
    <item>
      <title>To see an advantage for</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048649#M6792</link>
      <description>&lt;P&gt;To see an advantage from threading, you will need an outer parallel loop with a trip count such as 1000, with an inner vectorizable loop of at least the size shown. SYSTEM_CLOCK works better with 64-bit arguments.&lt;/P&gt;
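
&lt;P&gt;A minimal sketch of that restructuring, under stated assumptions: plain 2-D arrays stand in for the pointer components of the original sample, the names omp_sketch, nouter, dp and i8 are hypothetical, the outer count of 1000 is only an example, and the constants are written in double precision. One parallel region covers the outer loop, the inner loop over n is left for the compiler to vectorize, and SYSTEM_CLOCK is called with 64-bit integers. It is an illustration, not a drop-in fix for the original program.&lt;/P&gt;

&lt;PRE&gt;
program omp_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit integer kind
  integer, parameter :: n = 200         ! inner loop length, as in the sample
  integer, parameter :: nouter = 1000   ! outer parallel trip count (assumed)
  real(dp), allocatable :: a(:,:), b(:,:), c(:,:)
  real(dp) :: ai, bi, di, ei, fi
  integer :: i, j
  integer(i8) :: c1, c2, rate

  allocate (a(n,nouter), b(n,nouter), c(n,nouter))
  a = 0.51_dp
  b = 0.45_dp

  call system_clock(c1, rate)
  ! One parallel region; each thread gets whole columns, so its
  ! writes to c go to its own contiguous blocks of memory.
  !$omp parallel do private(i, ai, bi, di, ei, fi) schedule(static)
  do j = 1, nouter
    ! Unit-stride inner loop over plain arrays, a candidate for vectorization.
    do i = 1, n
      ai = a(i,j)
      bi = b(i,j)
      di = ai*2.2_dp + bi*2.0_dp + ai*2.1_dp
      ei = ai*2.1_dp + bi*2.3_dp + ai*2.15_dp
      fi = di*(ai + bi)*2.1_dp + ei*(di + bi)*2.1_dp
      c(i,j) = bi*2.1_dp + ai*2.0_dp + di*2.0_dp + ei*2.0_dp + fi*2.0_dp
    end do
  end do
  !$omp end parallel do
  call system_clock(c2)

  write (*,*) 'elapsed seconds:', real(c2 - c1, dp) / real(rate, dp)
  write (*,*) 'c(1,1) =', c(1,1)
end program omp_sketch
&lt;/PRE&gt;

&lt;P&gt;Build with the compiler's OpenMP option enabled (for example /Qopenmp with Intel Fortran on Windows, or -fopenmp with gfortran). The work here is still small, so the timed region may need to be repeated or enlarged before the difference between thread counts becomes measurable.&lt;/P&gt;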

&lt;P&gt;Threading can't compensate for inefficient data access. If multiple threads write to the same cache line (false sharing), performance will be poor.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jan 2015 00:13:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/speedup-problem-using-openMP-in-intel-fortran/m-p/1048649#M6792</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2015-01-13T00:13:00Z</dc:date>
    </item>
  </channel>
</rss>

