<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic If you are using MPI to run 8 in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974914#M16997</link>
    <description>&lt;P&gt;If you are using MPI to run 8 separate copies of your test simultaneously, it may not be surprising if CPU time of each copy of the test increases relative to a single copy.&lt;/P&gt;</description>
    <pubDate>Wed, 06 Nov 2013 13:18:06 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2013-11-06T13:18:06Z</dc:date>
    <item>
      <title>Compile FORTRAN using parallel MKL - how to do it in linux?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974912#M16995</link>
      <description>Folks,

I'm totally new to parallel programming, so I need your help. I'm trying to solve a sparse eigenvalue problem using the MKL Extended Eigensolver. It works fine for a small test problem in sequential mode. However, my matrix size is about 6 million and it takes forever on a single core. I'm stuck on how to use the parallel capability. I know the MKL library can run in parallel, correct?

My system: Intel 64 linux
Software:  Intel ComposerXE 2013,  mpich installed and compiled by XE 2013

Here is what I did:
   1. Compile:    mpif90 -mkl=parallel -o test_mpi.x  test_sparse_eigen.f90
   2. Run: mpiexec -np 8 ./test_mpi.x

However, for a smaller test problem, -np 8 took longer than -np 1. And when I print something, it is printed 8 times with the -np 8 option. I know I might need to add some lines to my code to use the parallel capability, but I really have no idea where to start. Does anybody have quick instructions and a sample file? Very much appreciated, and thanks in advance. Attached is my source code (Fortran 90).

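!***********************************************
! This routine tests the MKL sparse eigensolver (FEAST, dfeast_scsrgv)
implicit real*8 (a-h,o-z)
real*8,allocatable::a(:),b(:)
integer,allocatable::cola(:),rowa(:),colb(:),rowb(:)
! res holds the relative residuals and must be an array of length m0,
! not an implicitly typed scalar
real*8,allocatable::e(:), x(:,:), res(:)
integer fpm(128)
real time_begin, time_end

! m0: size of the search subspace; [emin,emax]: eigenvalue search interval
m0=50
emin=0.0d0
emax=2d7
fpm=0

! Read matrices A and B in CSR format (values, column indices, row pointers)
open(98,file='ifort98.dat',form='unformatted')
read(98) n, na
allocate (a(na),cola(na),rowa(n+1))
read(98) (a(i),i=1,na)
read(98) (cola(i),i=1,na)
read(98) (rowa(i),i=1,n+1)
read(98) n, nb
allocate (b(nb),colb(nb),rowb(n+1))
read(98) (b(i),i=1,nb)
read(98) (colb(i),i=1,nb)
read(98) (rowb(i),i=1,n+1)
close(98)

! Note: cpu_time reports processor time; in a threaded run this can be
! summed over threads, so it is not the same as wall-clock time
call CPU_time(time_begin)

allocate (e(m0), x(n,m0), res(m0))

! Set default FEAST parameters, then solve A*x = e*B*x on [emin,emax]
call feastinit(fpm)
print*,fpm
call dfeast_scsrgv('U',n,a,rowa,cola,b,rowb,colb,fpm,epsout,loop,emin,emax,m0,e,x,m,res,info)

print*,'info=',info
print*,'m=',m
print*,'loop=',loop
print*,'epsout=',epsout

! Write the m eigenvalues found, converted to frequencies: sqrt(e)/(2*pi)
open(10,file='test.out')
do i=1,m
   write(10,*) 'mode',i,'   Freq=', sqrt(e(i))*0.5/3.1415926535897932
enddo
close(10)

deallocate (a,b,cola,rowa,colb,rowb,e,x,res)

call cpu_time(time_end)

print*,'Total CPU time=', time_end-time_begin
stop 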
end</description>
      <pubDate>Tue, 05 Nov 2013 19:08:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974912#M16995</guid>
      <dc:creator>Letian_W_</dc:creator>
      <dc:date>2013-11-05T19:08:52Z</dc:date>
    </item>
    <item>
      <title>Hi Letian,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974913#M16996</link>
      <description>&lt;P&gt;Hi Letian,&lt;/P&gt;
&lt;P&gt;What Fortran compiler are you using?&lt;/P&gt;
&lt;P&gt;As your code does not use MPI, you can build it with ifort directly: ifort -mkl *.f90&lt;/P&gt;
&lt;P&gt;You may also try PARDISO directly. See the performance tips in &lt;A href="http://software.intel.com/en-us/articles/introduction-to-the-intel-mkl-extended-eigensolver"&gt;http://software.intel.com/en-us/articles/introduction-to-the-intel-mkl-extended-eigensolver&lt;/A&gt;&lt;/P&gt;
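&lt;P&gt;A minimal sketch of how a single-process, threaded run might be set up, assuming a bash-like shell on Linux. The choice of one thread per logical CPU and the variable name ncpu are illustrative assumptions, not tuned recommendations, and a.out is simply the default output name of the compile command.&lt;/P&gt;

```shell
# Assumption: use one thread per logical CPU reported by the system.
# ncpu is a hypothetical variable name used only in this sketch.
ncpu=$(getconf _NPROCESSORS_ONLN)

# MKL_NUM_THREADS is MKL's own thread-count variable;
# OMP_NUM_THREADS is read by the OpenMP runtime that threaded MKL uses.
export MKL_NUM_THREADS="$ncpu"
export OMP_NUM_THREADS="$ncpu"

echo "Running with $MKL_NUM_THREADS MKL threads"

# Then start ONE copy of the program directly (no mpiexec), for example:
#   ./a.out
```

&lt;P&gt;Starting the program directly like this runs a single copy that lets MKL use several threads, rather than several independent copies of the same program.&lt;/P&gt;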
&lt;P&gt;Best Regards,&lt;/P&gt;
&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2013 08:55:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974913#M16996</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2013-11-06T08:55:52Z</dc:date>
    </item>
    <item>
      <title>If you are using MPI to run 8</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974914#M16997</link>
      <description>&lt;P&gt;If you are using MPI to run 8 separate copies of your test simultaneously, it may not be surprising if CPU time of each copy of the test increases relative to a single copy.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2013 13:18:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974914#M16997</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2013-11-06T13:18:06Z</dc:date>
    </item>
    <item>
      <title>Thanks, Ying/Timp.</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974915#M16998</link>
      <description>&lt;P&gt;Thanks, Ying/Timp.&lt;/P&gt;
&lt;P&gt;I have now recompiled my program per your suggestions:&lt;/P&gt;
&lt;P&gt;ifort -mkl=parallel *.f90&lt;/P&gt;
&lt;P&gt;Then I set multiple threads with "export OMP_NUM_THREADS=16" and reran the program. Without setting OMP_NUM_THREADS, my code ran for about 20 hours. Now, with 16 threads, the program has already been running for almost 20 hours. It seems to me that the parallelism is not working. I'm wondering whether I need to add some options when compiling. Please advise. Thanks.&lt;/P&gt;
&lt;P&gt;Letian&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2013 20:03:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974915#M16998</guid>
      <dc:creator>Letian_W_</dc:creator>
      <dc:date>2013-11-06T20:03:14Z</dc:date>
    </item>
    <item>
      <title>MKL tries to choose the best</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974916#M16999</link>
      <description>&lt;P&gt;By default, MKL tries to choose the best number of threads (1 per core) for those MKL functions that are built with threading. So it is not surprising if setting the number of threads explicitly doesn't improve the run time.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2013 21:06:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974916#M16999</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2013-11-06T21:06:35Z</dc:date>
    </item>
    <item>
      <title>Hi Letian,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974917#M17000</link>
      <description>&lt;P&gt;Hi Letian,&lt;/P&gt;
&lt;P&gt;If you run top and then press 1, how many CPUs are running?&lt;/P&gt;
&lt;P&gt;And if you'd like to see the parallel behaviour of an MKL function, you may try the code in &lt;A href="http://software.intel.com/en-us/articles/intel-mkl-103-getting-started"&gt;http://software.intel.com/en-us/articles/intel-mkl-103-getting-started&lt;/A&gt;. It is a dgemm example that builds with gcc or icc.&lt;/P&gt;
&lt;P&gt;I show the latest compiler command at &lt;A href="http://software.intel.com/comment/1768822#comment-1768822"&gt;http://software.intel.com/comment/1768822#comment-1768822&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;/P&gt;
&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Wed, 13 Nov 2013 04:09:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Compile-FORTRAN-using-parallel-MKL-how-to-do-it-in-linux/m-p/974917#M17000</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2013-11-13T04:09:07Z</dc:date>
    </item>
  </channel>
</rss>

