Intel® Fortran Compiler

Question about auto-parallelization with ifort 12.1

amat000
Beginner
1,074 views
Hello,

I have attached a sample code.
With ifort 12.1.1.256 for Linux, the second loop nest is not auto-parallelized by the -parallel flag.
> ifort -O3 -parallel -par-report2 sample.f
procedure: sample
sample.f(5): (col. 8) remark: LOOP WAS AUTO-PARALLELIZED.
sample.f(6): (col. 8) remark: loop was not parallelized: insufficient inner loop.

ifort 11.1.073 can parallelize that loop.
How do I auto-parallelize sample.f with ifort 12.1.1.256?

best regards,
amat

sample.f
      program sample
      parameter(N=600)
      real a(N,N),b(N,N),c(N,N)

      do j=1,N
        do i=1,N
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo

      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo

      write(6,*) c(N,N)
      stop
      end
9 Replies
jimdempseyatthecove
Honored Contributor III
CDEC$ PARALLEL ALWAYS
      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo


I suggest you consider using OpenMP. You have more control over parallelization.

Jim Dempsey
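
For reference, the OpenMP route suggested above could look roughly like the following. This is only a sketch, not code posted in the thread: the program name sample_omp and the explicit declarations are my own additions, and it assumes fixed-form source (a .f file) compiled with OpenMP enabled (-openmp with ifort 12.1, -fopenmp with gfortran).

```fortran
      program sample_omp
      implicit none
      integer, parameter :: n = 600
      integer :: i, j, k
      real :: a(n,n), b(n,n), c(n,n)

      do j = 1, n
         do i = 1, n
            a(i,j) = real(i)
            b(i,j) = real(i+j)
            c(i,j) = 0.0
         enddo
      enddo

c     Each thread owns whole columns j of c, so no two threads
c     write the same element; i and k must be private per thread.
!$omp parallel do default(shared) private(i, k)
      do j = 1, n
         do k = 1, n
            do i = 1, n
               c(i,j) = c(i,j) + a(i,k)*b(k,j)
            enddo
         enddo
      enddo
!$omp end parallel do

      write(6,*) c(n,n)
      end program sample_omp
```

Parallelizing the outer j loop keeps the inner i loop stride-1 and free for the vectorizer. Without the OpenMP flag the directive lines are treated as comments and the program runs serially.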
amat000
Beginner
Dempsey-san,

Thank you for your reply.

You are right. OpenMP and compiler directives are useful. But I believe the second loop should be auto-parallelized.
I would like to know why ifort 12.1.1.256 ignores the loop.

best regards,
amat
Steven_L_Intel1
Employee
It didn't ignore the loop. Because you asked for only parallelization reports you missed this:

Matmul Report:

<>
Loopnest at line: 13 replaced by matmul intrinsic

So what the compiler did was even better - it called an MKL-related MATMUL intrinsic which has its own parallelization code.
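
To make the substitution concrete: at the source level, the loop nest computes the same product as the standard Fortran MATMUL intrinsic. The sketch below is my own illustration, not compiler output. The program name sample_matmul, the extra array d, and the 1.0e-3 tolerance are assumptions, and the library routine the compiler actually calls is not shown in this thread.

```fortran
      program sample_matmul
      implicit none
      integer, parameter :: n = 600
      integer :: i, j, k
      real :: a(n,n), b(n,n), c(n,n), d(n,n)
      real :: relerr

      do j = 1, n
         do i = 1, n
            a(i,j) = real(i)
            b(i,j) = real(i+j)
            c(i,j) = 0.0
         enddo
      enddo

c     Hand-written loop nest, as in sample.f.
      do j = 1, n
         do k = 1, n
            do i = 1, n
               c(i,j) = c(i,j) + a(i,k)*b(k,j)
            enddo
         enddo
      enddo

c     Source-level equivalent: the standard MATMUL intrinsic.
      d = matmul(a, b)

c     Summation order may differ, so compare with a tolerance
c     (every element of c is positive for this input).
      relerr = maxval(abs(c - d)/abs(c))
      write(6,*) 'max relative difference:', relerr
      if (relerr .gt. 1.0e-3) write(6,*) 'results differ'
      end program sample_matmul
```

Because single-precision sums are accumulated in a different order, the two results need not match bit for bit, which is why the comparison is relative rather than exact.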
jimdempseyatthecove
Honored Contributor III
>>it called an MKL-related MATMUL intrinsic which has its own parallelization code.

Then shouldn't Matmul Report(s) be included in Parallel Report(s)?
(The same goes for other compiler substitutions that result in parallelization.)

Jim Dempsey
Steven_L_Intel1
Employee
It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel. It is listed in general optimization reports. The MKL MATMUL is very efficient even in serial form for modest sized matrices.
jimdempseyatthecove
Honored Contributor III
>>It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel.

While the above statement is correct, the substitution also affects the code via something (MKL) that uses parallelization (indirectly). As such, its report should also be listed in the parallelization report (otherwise you receive questions about the compiler not parallelizing some loops when auto-parallelization is enabled). Including this report in the auto-parallelization report would have eliminated this thread and others (I imagine) posted elsewhere.

There would also be a similar situation where a user wants vectorization reports and the compiler converts a loop into a call to an internal function that uses vectorization. This call should be listed in the vectorization report.

Jim
amat000
Beginner
Lionel-san and Dempsey-san,

Thank you for the reply.

I understand what Lionel-san said. The auto-parallelizer recognizes the second loop as the MATMUL intrinsic function, so we cannot see any message about that loop (function) in the parallel report.
I would be happy if I could see some comment about the second loop in the report, but that seems to be difficult.

thanks a lot.
amat000
Steven_L_Intel1
Employee
Not difficult at all. Use

-opt-report 3

amat000
Beginner
Lionel-san,

I see. Thanks.

amat000