Intel® Fortran Compiler

Question about auto-parallelization with ifort 12.1

amat000
Beginner
Hello,

I have attached some sample code.
With ifort 12.1.1.256 for Linux, the second loop nest is not auto-parallelized by the -parallel flag.
> ifort -O3 -parallel -par-report2 sample.f
procedure: sample
sample.f(5): (col. 8) remark: LOOP WAS AUTO-PARALLELIZED.
sample.f(6): (col. 8) remark: loop was not parallelized: insufficient inner loop.

ifort 11.1.073 can parallelize that loop.
How do I auto-parallelize sample.f with ifort 12.1.1.256?

best regards,
amat

sample.f
      program sample
      parameter(N=600)
      real a(N,N),b(N,N),c(N,N)

      do j=1,N
        do i=1,N
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo

      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo

      write(6,*) c(N,N)
      stop
      end
jimdempseyatthecove
Honored Contributor III
CDEC$ PARALLEL ALWAYS
      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo


I suggest you consider using OpenMP. You have more control over parallelization.
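
As a minimal sketch of what the OpenMP route could look like for sample.f: the version below assumes free-form source (saved as, say, sample_omp.f90, a hypothetical file name) and that OpenMP is enabled at compile time, e.g. with -openmp for ifort releases of this era or -fopenmp for gfortran. The program name and the explicit private/shared clauses are illustrative choices, not something taken from the original post.

```fortran
program sample_omp
  implicit none
  integer, parameter :: n = 600
  real :: a(n,n), b(n,n), c(n,n)
  integer :: i, j, k

  ! Same initialization as sample.f
  do j = 1, n
    do i = 1, n
      a(i,j) = real(i)
      b(i,j) = real(i+j)
      c(i,j) = 0.0
    end do
  end do

  ! Iterations of the outer j loop are divided among the threads.
  ! Each thread updates only its own columns c(:,j), so there are no
  ! conflicting writes. The inner loop indices must be private.
  !$omp parallel do private(i, k) shared(a, b, c)
  do j = 1, n
    do k = 1, n
      do i = 1, n
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do
  !$omp end parallel do

  write(6,*) c(n,n)
end program sample_omp
```

Without the OpenMP option the directive lines are treated as ordinary comments, so the same source still builds and runs serially.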

Jim Dempsey
amat000
Beginner
Dempsey-san,

Thank you for your reply.

You are right. OpenMP and compiler directives are useful. But I believe the second loop should be auto-parallelized.
I would like to know why ifort 12.1.1.256 ignores the loop.

best regards,
amat
Steven_L_Intel1
Employee
It didn't ignore the loop. Because you asked for only parallelization reports, you missed this:

Matmul Report:

<>
Loopnest at line: 13 replaced by matmul intrinsic

So what the compiler did was even better - it called an MKL-related MATMUL intrinsic which has its own parallelization code.
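
To illustrate the substitution: because c is initialized to zero, the second loop nest in sample.f computes the same result as c = matmul(a, b) (in general it would be c = c + matmul(a, b)). The following is only a sketch of that equivalence at the source level, not a picture of what the compiler generates internally. The program name, the allocatable arrays, and the 1.0e-3 tolerance are assumptions made for the example.

```fortran
program sample_matmul
  implicit none
  integer, parameter :: n = 600
  real, allocatable :: a(:,:), b(:,:), c_loop(:,:), c_mat(:,:)
  integer :: i, j, k
  real :: rel_diff

  allocate(a(n,n), b(n,n), c_loop(n,n), c_mat(n,n))

  ! Same initialization as sample.f
  do j = 1, n
    do i = 1, n
      a(i,j) = real(i)
      b(i,j) = real(i+j)
      c_loop(i,j) = 0.0
    end do
  end do

  ! Hand-written triple loop from sample.f
  do j = 1, n
    do k = 1, n
      do i = 1, n
        c_loop(i,j) = c_loop(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do

  ! Because c starts at zero, the loop nest is a plain matrix product
  c_mat = matmul(a, b)

  ! The summation order may differ, so compare with a tolerance
  rel_diff = maxval(abs(c_loop - c_mat)) / maxval(abs(c_mat))
  if (rel_diff > 1.0e-3) error stop 'loop nest and MATMUL disagree'
  print *, 'loop nest and MATMUL agree within tolerance'
end program sample_matmul
```

The comparison uses a relative tolerance rather than exact equality because single-precision sums taken in a different order can differ in the last few bits.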
jimdempseyatthecove
Honored Contributor III
>>it called an MKL-related MATMUL intrinsic which has its own parallelization code.

Then shouldn't Matmul Report(s) be included in Parallel Report(s)?
(The same goes for other compiler substitutions that result in parallelization.)

Jim Dempsey
Steven_L_Intel1
Employee
It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel. It is listed in the general optimization reports. The MKL MATMUL is very efficient even in serial form for modest-sized matrices.
jimdempseyatthecove
Honored Contributor III
>>It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel.

While the above statement is correct, the substitution also affects the code via something (MKL) that uses parallelization indirectly. As such, its report should also be listed in the parallelization report; otherwise you receive questions about loops not being parallelized when auto-parallelization is enabled. Including this report in the auto-parallelization report would have eliminated this thread and, I imagine, others posted elsewhere.

There would also be a similar situation where a user wants vectorization reports and your code converts a loop to call an internal function that uses vectorization. This call should be listed in the Vectorization Report.

Jim
amat000
Beginner
Lionel-san and Dempsey-san,

Thank you for the reply.

I understand what Lionel-san said. The compiler recognizes the second loop nest as a MATMUL intrinsic call, so no message about that loop appears in the parallel report.
I would be happy if the report included some comment about the second loop, but that seems to be difficult.

thanks a lot.
amat000
Steven_L_Intel1
Employee
Not difficult at all. Use

-opt-report 3

amat000
Beginner
Lionel-san,

I see. Thanks.

amat000