Hello,
I have attached some sample code.
With ifort 12.1.1.256 for Linux, the second loop nest is not auto-parallelized by the -parallel flag.
> ifort -O3 -parallel -par-report2 sample.f
procedure: sample
sample.f(5): (col. 8) remark: LOOP WAS AUTO-PARALLELIZED.
sample.f(6): (col. 8) remark: loop was not parallelized: insufficient inner loop.
ifort 11.1.073 can parallelize that loop.
How do I auto-parallelize sample.f with ifort 12.1.1.256?
best regards,
amat
sample.f
      program sample
      parameter(N=600)
      real a(N,N),b(N,N),c(N,N)
      do j=1,N
        do i=1,N
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo
      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo
      write(6,*) c(N,N)
      stop
      end
9 Replies
CDEC$ PARALLEL ALWAYS
      do j=1,N
        do k=1,N
          do i=1,N
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo
I suggest you consider using OpenMP. It gives you more control over parallelization.
Jim Dempsey
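As a rough illustration of the OpenMP suggestion above, the second loop nest could be parallelized explicitly along these lines. This is only a sketch: the program name sample_omp is made up for the example, and the build flag is an assumption about your toolchain (-openmp with ifort of this vintage, -fopenmp with gfortran).

```fortran
      program sample_omp
      implicit none
      integer n
      parameter (n=600)
      real a(n,n), b(n,n), c(n,n)
      integer i, j, k
c     Initialize the operands exactly as in the original sample.f
      do j=1,n
        do i=1,n
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo
c     The outer j loop is split across threads. Each thread updates
c     only its own columns of c, so there are no write conflicts.
c     The inner loop indices i and k are declared private.
!$omp parallel do private(i,k)
      do j=1,n
        do k=1,n
          do i=1,n
            c(i,j)=c(i,j)+a(i,k)*b(k,j)
          enddo
        enddo
      enddo
!$omp end parallel do
      write(6,*) c(n,n)
      end
```

Without the OpenMP flag the `!$omp` lines are treated as ordinary comments, so the same source still builds and runs serially.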
Dempsey-san,
Thank you for your reply.
You are right. OpenMP and compiler directives are useful. But I believe the second loop should be auto-parallelized.
I would like to know why ifort 12.1.1.256 ignores the loop.
best regards,
amat
It didn't ignore the loop. Because you asked only for parallelization reports, you missed this:
Matmul Report:
<>
Loopnest at line: 13 replaced by matmul intrinsic
So what the compiler did was even better - it called an MKL-related MATMUL intrinsic which has its own parallelization code.
>>it called an MKL-related MATMUL intrinsic which has its own parallelization code.
Then shouldn't Matmul Report(s) be included in Parallel Report(s)?
(The same goes for other compiler substitutions that result in parallelization.)
Jim Dempsey
It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel. It is listed in general optimization reports. The MKL MATMUL is very efficient even in serial form for modest sized matrices.
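For readers unfamiliar with the substitution, the triple loop in sample.f is equivalent at the source level to a call to the standard MATMUL intrinsic. The sketch below writes that out by hand; the program name sample_matmul is made up for the example, and the library routine the compiler actually calls internally is not shown in this thread.

```fortran
      program sample_matmul
      implicit none
      integer n
      parameter (n=600)
      real a(n,n), b(n,n), c(n,n)
      integer i, j
c     Initialize the operands exactly as in the original sample.f
      do j=1,n
        do i=1,n
          a(i,j)=real(i)
          b(i,j)=real(i+j)
          c(i,j)=0.0
        enddo
      enddo
c     The loop nest accumulates the product of a and b into c,
c     which is what this single array statement expresses.
      c = c + matmul(a,b)
      write(6,*) c(n,n)
      end
```

Because single-precision sums may be accumulated in a different order, the printed value can differ slightly from the hand-written loop.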
>>It has nothing to do directly with parallelization. It is an optimization we do even if you don't ask for autoparallel.
While the above statement is correct, the substitution also affects the code through something (MKL) that uses parallelization indirectly. As such, it should also be listed in the parallelization report; otherwise you will receive questions about the compiler not parallelizing some loops when auto-parallelization is enabled. Including this report in the auto-parallelization report would have eliminated this thread and, I imagine, others posted elsewhere.
A similar situation arises when a user wants vectorization reports and the compiler converts a loop into a call to an internal function that uses vectorization. That call should be listed in the vectorization report.
Jim
Lionel-san and Dempsey-san,
Thank you for the reply.
I understand what Lionel-san said. The compiler recognizes the second loop nest as the MATMUL intrinsic function, so no message about that loop (function) appears in the parallel report.
I would be happy to see some comment about the second loop in the report, but that seems to be difficult.
Thanks a lot.
amat000
Not difficult at all. Use
-opt-report 3
Lionel-san,
I see. Thanks.
amat000