Question about auto-parallelization with ifort 12.1

amat000 · 11-30-2011

Hello,

I attached a sample code.
With ifort 12.1.1.256 for Linux, the second loop nest is not auto-parallelized by the -parallel flag.
> ifort -O3 -parallel -par-report2 sample.f
procedure: sample
sample.f(5): (col. 8) remark: LOOP WAS AUTO-PARALLELIZED.
sample.f(6): (col. 8) remark: loop was not parallelized: insufficient inner loop.

ifort 11.1.073 can parallelize that loop.
How do I auto-parallelize sample.f with ifort 12.1.1.256?

best regards,
amat

sample.f
      program sample
      parameter(N=600)
      real a(N,N),b(N,N),c(N,N)

      do j=1,N
        do i=1,N
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo

      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo

      write(6,*) c(N,N)
      stop
      end

jimdempseyatthecove · 11-30-2011

CDEC$ PARALLEL ALWAYS
      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo

I suggest you consider using OpenMP. It gives you more control over parallelization.

Jim Dempsey
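
For readers following along, here is a minimal sketch of what the OpenMP suggestion might look like for the second loop nest of sample.f. It is not code from this thread: the program name sample_omp, the explicit declarations, and the choice to parallelize the outer j loop are assumptions made for illustration. Each thread works on its own columns of c, so no two threads write the same element. With ifort 12.x this would presumably be built with the -openmp flag.

```fortran
      program sample_omp
      implicit none
      integer, parameter :: N = 600
      integer i, j, k
      real a(N,N), b(N,N), c(N,N)

! Same initialization as sample.f
      do j = 1, N
        do i = 1, N
          a(i,j) = real(i)
          b(i,j) = real(i+j)
          c(i,j) = 0.0
        enddo
      enddo

! Split the outer j loop across threads. j is private by default
! as the parallel loop index; i and k must be made private.
!$omp parallel do private(i,k) shared(a,b,c)
      do j = 1, N
        do k = 1, N
          do i = 1, N
            c(i,j) = c(i,j) + a(i,k)*b(k,j)
          enddo
        enddo
      enddo
!$omp end parallel do

      write(6,*) c(N,N)
      end program sample_omp
```

Compiled without OpenMP support, the directive lines are treated as ordinary comments and the program runs serially.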

amat000 · 11-30-2011

Dempsey-san,

Thank you for your reply.

You are right. OpenMP and compiler directives are useful. But I believe the second loop should be auto-parallelized.
I would like to know why ifort 12.1.1.256 ignores the loop.

best regards,
amat

Steven_L_Intel1 · 12-01-2011

It didn't ignore the loop. Because you asked for only parallelization reports you missed this:

Matmul Report:

<>
Loopnest at line: 13 replaced by matmul intrinsic

So what the compiler did was even better - it called an MKL-related MATMUL intrinsic which has its own parallelization code.

jimdempseyatthecove · 12-01-2011

>>it called an MKL-related MATMUL intrinsic which has its own parallelization code.

Then shouldn't Matmul Report(s) be included in Parallel Report(s)?
(The same goes for other compiler substitutions that result in parallelization.)

Jim Dempsey

Steven_L_Intel1 · 12-01-2011

It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel. It is listed in general optimization reports. The MKL MATMUL is very efficient even in serial form for modest sized matrices.
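
Since the compiler recognizes the loop nest as a matrix multiplication, the same computation can also be written with the standard Fortran MATMUL intrinsic. The sketch below is an illustration only, not code from this thread: the program name sample_matmul and the explicit declarations are assumptions, and whether an explicit MATMUL ends up in the same library routine as the replaced loop nest is not something this thread confirms.

```fortran
      program sample_matmul
      implicit none
      integer, parameter :: N = 600
      integer i, j
      real a(N,N), b(N,N), c(N,N)

! Same initialization of a and b as sample.f
      do j = 1, N
        do i = 1, N
          a(i,j) = real(i)
          b(i,j) = real(i+j)
        enddo
      enddo

! Equivalent to the j/k/i loop nest starting from c = 0:
! c(i,j) = sum over k of a(i,k)*b(k,j)
      c = matmul(a, b)

      write(6,*) c(N,N)
      end program sample_matmul
```

Writing the operation this way states the intent directly instead of relying on the compiler to recognize the loop pattern.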

jimdempseyatthecove · 12-02-2011

>>It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel.

While that statement is correct, the substitution also affects the code through something (MKL) that is itself parallelized, if only indirectly. As such, it should also be listed in the parallelization report; otherwise you receive questions about the compiler not parallelizing some loops when auto-parallelization is enabled. Including it in the auto-parallelization report would have eliminated this thread, and others (I imagine) posted elsewhere.

A similar situation would arise where a user wants vectorization reports and the compiler converts a loop into a call to an internal function that uses vectorization. That call should be listed in the vectorization report.

Jim

amat000 · 12-05-2011

Lionel-san and Dempsey-san,

Thank you for the reply.

I understand what Lionel-san said. The compiler recognizes the second loop nest as a MATMUL intrinsic call, so no message about that loop (function) appears in the parallelization report.
I would be happy if I could see some comment about the second loop in the report, but it seems to be difficult.

thanks a lot.
amat000

Steven_L_Intel1 · 12-06-2011

Not difficult at all. Use

-opt-report 3

amat000 · 12-07-2011

Lionel-san,

I see. Thanks.

amat000