[fortran]
! This program should write sin(sin(1)) on the screen which is 0.745624
!
! But, when compiled with
! ifort -ipo -c module.f90 -o module.o
! ifort -ipo -c main.f90 -o main.o
! ifort -parallel module.o main.o -o main
! with ifort (IFORT) 12.0.3 20110309 on Linux, and Mac, it answers 1.
!
! It runs nicely with ifort (IFORT) 11.1 20091130 on Linux.
program main
  use shared
  implicit none
  integer, parameter :: dp = 8
  ! If you replace it with n = 22, it's going to give the good result
  integer, parameter :: n = 23
  real(dp), dimension(n) :: tab
  integer :: j
  ! If you comment this line, it is going to give the good result
  a = 3
  tab = 1.0_dp
  do j = 1,2
    tab = sin(tab)
  end do
  write (*,*) tab(n)
end program main
[/fortran]
[fortran]
module shared
  implicit none
  integer :: a
end module shared
[/fortran]
Everything is compiled with
[bash]
ifort -ipo -c module.f90 -o module.o
ifort -ipo -c main.f90 -o main.o
ifort -parallel module.o main.o -o main
[/bash]
It seems (judging from -par-report2) that the compiler parallelizes the loop "do j=1,2", which is rather surprising.
Let me see if I can isolate this a bit more, find workarounds, and get a bug report going.
ron
Workaround: use option -nolib-inline when compiling main.f90. Please try this on your real code as well and let me know whether it fixes the real application too.
Like you, I was surprised that ANY change to the code would make the problem go away. Thank you for a very compact reproducer!
ron
[fortran]
module shared
  implicit none
  integer, parameter :: n = 1000
  integer, dimension(n) :: tab
  integer :: a
contains
  function f()
    integer :: f
    a = a+1
    f = a
  end function f
end module shared

program main
  use shared
  implicit none
  integer :: i,j
  do j = 1,2
    a = 0
    do i = 1,n
      tab(i) = f()
    end do
  end do
  write (*,*) tab(n),a
end program main
[/fortran]
When I compile it with
[bash]ifort -parallel -par-report2 test.f90 -o test[/bash]
[bash]ifort -ipo -parallel -par-report2 test.f90 -o test[/bash]
Also, the inner loop should not be auto-parallelized (assuming no conflict with a and tab(i)) because it is nested one level inside the outer loop. As you nest deeper, the parallelization threshold increases dramatically.
I would suggest that you stop using auto-parallelization and start using explicit parallelization (OpenMP). OpenMP integrates quite nicely with the compiler. Note: do not parallelize your last example without understanding what you will get from the parallelization.
Jim Dempsey
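As a rough illustration of the explicit approach suggested above, the array assignment from the first reproducer could be written as an element loop with an OpenMP directive. This is only a sketch: the module name `omp_sketch` and the subroutine name `apply_sin_twice` are made up for illustration, and the directive only takes effect when OpenMP is enabled at compile time (for example `-openmp` in ifort of that era, `-qopenmp` in later versions, `-fopenmp` in gfortran). Otherwise it is treated as a comment and the loop runs serially.

```fortran
module omp_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Apply sin twice to every element of tab.
  ! Each element is independent, so the element loop is safe to parallelize.
  subroutine apply_sin_twice(tab)
    real(dp), intent(inout) :: tab(:)
    integer :: i, j
    ! The short outer loop (2 trips) stays serial.
    do j = 1, 2
      !$omp parallel do default(none) shared(tab) private(i)
      do i = 1, size(tab)
        tab(i) = sin(tab(i))
      end do
      !$omp end parallel do
    end do
  end subroutine apply_sin_twice
end module omp_sketch
```

Called with every element of `tab` set to `1.0_dp`, each element should end up as sin(sin(1)), about 0.745624, which is the value the original program was expected to print.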
[fortran]
do i = 1,n
  a(i) = f(i)
end do
[/fortran]
I suppose one of the reasons for ifort being the first major compiler to introduce do concurrent (the f2008 alternative to forall) is the prospect of better auto-parallel support. Unfortunately, at this time, it puts us in the situation of supporting multiple compilers with conditional compilation:
#if defined __INTEL_COMPILER
      do concurrent( i= 1:n, a(i) > b(i))
        a(i)= a(i)-b(i)*d(i)
        c(i)= a(i)+c(i)
      enddo
#else
      forall( i= 1:n, a(i) > b(i))
        a(i)= a(i)-b(i)*d(i)
        c(i)= a(i)+c(i)
      endforall
#endif
In practice, f77-style DO loops are still likely to perform best; besides, ifort OpenMP requires the f77 loop form to support parallelization.
Generally speaking, ifort has less f2008 support than others:
#ifndef __GFORTRAN__
junk=system("uname -ps > uname.txt")
#else
call execute_command_line("uname -ps > uname.txt")
#endif
The above seems to work with Open64 as well as ifort and gfortran, although I couldn't find documentation on the correct way for Open64.
As for pure, I suppose ifort -parallel wants to analyze all the source code, with opportunity for interprocedural optimization, rather than taking a chance on your pure assertion. If the code does comply with pure, that should improve prospects for parallel.
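A minimal sketch of what complying with pure could look like for a loop of the form `a(i) = f(i)`. The module name `pure_sketch`, the body of `f`, and the subroutine `fill` are hypothetical placeholders, and whether a given compiler actually parallelizes the loop is not guaranteed.

```fortran
module pure_sketch
  implicit none
contains
  ! pure: no side effects, so the result depends only on the argument
  ! (placeholder body, chosen only for illustration)
  pure function f(i) result(r)
    integer, intent(in) :: i
    integer :: r
    r = 2*i + 1
  end function f

  ! Iterations are independent because f touches no shared state.
  subroutine fill(a)
    integer, intent(out) :: a(:)
    integer :: i
    do concurrent (i = 1:size(a))
      a(i) = f(i)
    end do
  end subroutine fill
end module pure_sketch
```

Contrast this with the earlier `f()` that increments a module variable: that function cannot be declared pure, and the compiler would reject it inside `do concurrent`.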
Consider something like
module foo
  integer(LONG) :: counter ! shared counter, distinct from the array a filled below
contains
  function f()
    use kernel32 ! or equivalent for Linux
    integer(LONG) :: f
    f = InterlockedIncrement(counter) ! or equivalent for Linux
  end function f
end module foo
...
!$omp parallel do shared(a), private(i)
do i=1,n
  a(i) = f()
end do
!$omp end parallel do
...
The above is a perfectly valid parallelization.
Each element of a receives a unique number.
*** However, the values are not necessarily sequential.
*** The values will be ascending per thread, with each thread filling a slice of a.
Although the output in array a will (may) differ between serial and parallel, this may be perfectly acceptable if your interest is in unique numbers. If this is not your interest, then do not parallelize the loop in this manner.
Jim Dempsey
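On platforms without `InterlockedIncrement`, one possible equivalent is OpenMP's `atomic capture` construct (OpenMP 3.1 and later), sketched below. The names `counter_sketch`, `counter`, `next_id`, and `fill_ids` are made up for illustration and do not come from the posts above.

```fortran
module counter_sketch
  implicit none
  integer :: counter = 0   ! shared counter, updated atomically
contains
  ! Increment the shared counter and return the new value as one atomic step.
  function next_id() result(id)
    integer :: id
    !$omp atomic capture
    counter = counter + 1
    id = counter
    !$omp end atomic
  end function next_id

  ! Fill ids with unique numbers; under OpenMP their order across
  ! elements is not guaranteed to match the serial order.
  subroutine fill_ids(ids)
    integer, intent(out) :: ids(:)
    integer :: i
    !$omp parallel do default(none) shared(ids) private(i)
    do i = 1, size(ids)
      ids(i) = next_id()
    end do
    !$omp end parallel do
  end subroutine fill_ids
end module counter_sketch
```

Compiled without OpenMP, the directives are ignored and the array is simply filled with 1, 2, ..., n in order. With OpenMP enabled, the same set of values appears, but which element gets which value can vary from run to run.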
closing.
ron
