The following program segfaults when compiled with ifort -openmp (v. 11.1) on both Mac and Linux:
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5000))
itot=0
!$omp parallel do default(none) private(ivar) shared(itot)
do i=1,5000
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
The same program works when compiled with gfortran and PGI.
Any ideas?
Best regards,
Pier
In fact, the example is an extreme simplification of a much more complex routine (specifically, a hierarchical quadrature routine that is part of a partial electrical equivalent circuit code).
I think, although I'm no OpenMP expert, that ifort is incorrectly handling something that the OpenMP 3.0 specification allows.
The code is compiled with a minimal command line (ifort -openmp test.f90), and the same minimal options are used for gfortran (gfortran -fopenmp test.f90) and for PGI. The same code also runs without problems with the AIX/xlf combination.
I would be extremely grateful if somebody could tell me whether the code shown is "wrong" (in the sense that it is not standard-compliant) and the other compilers work just by luck, or whether there is indeed a problem with ifort 11.1.
Thanks a lot!
atomic is defined as updating a single memory location in a thread-safe manner, while under vectorization you may be updating 4 or 8 memory locations.
I don't know if that is the answer, but it seems you are assuming that atomic will work the same as critical or reduction.
(conversely "gfortran -O3 -ftree-vectorize -fopenmp test.f90" works)
Furthermore, this even more simplified program also crashes:
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5000))
itot=0
!$omp parallel do default(none) private(ivar) shared(itot)
do i=1,5000
ivar(i)=i
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
So the "atomic" part is not the cause. What is causing the problem is the "allocatable" ivar. If it is replaced with a statically allocated array, everything works.
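For comparison, the statically allocated variant described above can be sketched as follows. This is a minimal illustration, assuming the only change is the declaration of ivar; the program name test_static is made up for this example.

```fortran
program test_static
  implicit none
  ! Fixed-size array instead of an allocatable one (the only change
  ! relative to the failing program).
  integer, dimension(5000) :: ivar
  integer :: i, itot

  itot = 0
  !$omp parallel do default(none) private(ivar) shared(itot)
  do i = 1, 5000
     ivar(i) = i            ! each thread writes into its own private copy
     !$omp atomic
     itot = itot + ivar(i)  ! shared accumulator, updated atomically
  end do
  !$omp end parallel do

  write(6,*) itot           ! sum of 1..5000 = 12502500
end program test_static
```

Because ivar is no longer allocatable, no allocation status has to be carried over into the private copies.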
1) The problem occurs even if 5000 is changed to 5 and OMP_NUM_THREADS=1.
2) The sequential version (without -openmp) runs OK (I guess these two cases would allocate the same amount of memory through the same mechanism).
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5))
itot=0
!$omp parallel do default(none) private(ivar) shared(itot)
do i=1,5
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5000))
itot=0
!$omp parallel do default(none) shared(ivar,itot) reduction(+:itot)
do i=1,5000
ivar(i)=i
! remove !$omp atomic
itot=itot+ivar(i)
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
Jim Dempsey
test2.f90(7): error #6761: An entity cannot appear explicitly in more than one clause per directive except that an entity can be specified in both a FIRSTPRIVATE and LASTPRIVATE clause. [ITOT]
!$omp parallel do default(none) shared(ivar,itot) reduction(+:itot)
Most importantly: other compilers (gfortran, PGI, mpxlf) work on the same code, so either there is a bug in ifort or the code is not standard-compliant and the other compilers work just by chance.
1) The array in the loop was referenced slice-wise by the parallel do loop. Therefore the array was made shared. This saved stack space and, more importantly, avoided the implicit array copy.
2) A reduction variable was used to cut the number of atomic operations from the number of iterations down to the number of threads. Lacking other information about your code, a reduction variable was appropriate.
Including the reduction variable in the PRIVATE/SHARED clause was an error on my part.
RE: 2)
Unless the code in the loop is very large compared to the overhead of an ATOMIC, you should try to avoid ATOMIC by using reduction or code equivalent to the reduction clause (thread-local storage or a mailbox collated at the end of, or after, the loop).
Jim
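A corrected form of the reduction variant might look like the sketch below. It assumes the only fix needed is to drop itot from the shared clause, since listing it in both shared and reduction is what error #6761 complains about; the program name test_reduction is made up for this example.

```fortran
program test_reduction
  implicit none
  integer, allocatable, dimension(:) :: ivar
  integer :: i, itot

  allocate(ivar(5000))
  itot = 0
  ! itot appears only in the reduction clause. Each thread accumulates
  ! into its own private copy, and the copies are combined at the end
  ! of the loop, so no atomic is needed inside the loop body.
  !$omp parallel do default(none) shared(ivar) reduction(+:itot)
  do i = 1, 5000
     ivar(i) = i            ! each iteration touches a distinct element
     itot = itot + ivar(i)
  end do
  !$omp end parallel do

  deallocate(ivar)
  write(6,*) itot           ! sum of 1..5000 = 12502500
end program test_reduction
```

Note that this changes the semantics of ivar from private scratch space to a shared array, which is fine here only because every iteration writes a different element.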
Let me first thank you sincerely for helping me with this issue. I think that, maybe due to problems in expressing myself precisely in English, my question might not have been clear.
Let me try to formulate my question again:
I would like to know whether the code shown below is legal, not whether it is an efficient way of doing things or whether there are other ways of obtaining the same result. Only whether the code is legal.
If the code is legal, ifort should run it, and if it doesn't, that means ifort has a bug.
If, on the other hand, the code is not legal, then I wonder why other compilers handle it as I would expect, but that is another matter.
Thanks again,
Pier
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5))
itot=0
!$omp parallel do default(none) private(ivar) shared(itot)
do i=1,5
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
All threads are receiving a new, unallocated array descriptor for private(ivar).
When using firstprivate(ivar), this appears to be (for arrays) somewhat equivalent to shared.
In other words, different descriptors appear to be used, pointing to the same memory locations.
[fortran]program test
use omp_lib
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot, k
k=123
allocate(ivar(5))
write(*,*) loc(ivar(1)), loc(ivar), loc(itot), loc(k)
itot=0
!$omp parallel do default(none) firstprivate(ivar, k) shared(itot)
do i=1,5
!$omp critical
if(allocated(ivar)) then
write(*,*) "allocated"
write(*,*) omp_get_thread_num(), loc(ivar(1)), loc(ivar), loc(itot), loc(k)
deallocate(ivar)
else
write(*,*) "not allocated"
write(*,*) omp_get_thread_num(), loc(ivar), loc(itot), loc(k)
endif
!$omp end critical
if(allocated(ivar)) then
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
endif
enddo
!$omp end parallel do
if(allocated(ivar)) deallocate(ivar)
write(6,*) itot
end[/fortran]
Output:
3635664 3635664 1244676 1244672
allocated
0 3635664 3635664 1244676 1242080
not allocated
0 0 1244676 1242080
not allocated
1 3635664 1244676 10418784
not allocated
0 0 1244676 1242080
not allocated
1 3635664 1244676 10418784
Look at the 2nd argument in the write statements (3635664/0).
Jim Dempsey
I would like somebody from Intel to comment on this issue, which I continue to feel is a bug (comments on the openmp.org forum also indicate that the code is correct).
Running the following code shows, as you mentioned above, that ivar is never allocated!
gfortran and other compilers happily allocate it and execute the code correctly...
program test
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(5))
itot=0
!$omp parallel do default(none) private(ivar) shared(itot)
do i=1,5
if(allocated(ivar))then
write(6,*) 'allocated'
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
else
write(6,*) 'not allocated'
endif
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
This does not work with ifort but works with all other compilers I use:
integer,allocatable,dimension(:)::ivar
allocate(ivar(5))
!$omp parallel do default(none) private(ivar) shared(itot)
...
!$omp end parallel do
deallocate(ivar)
This works with all compilers I use:
integer,allocatable,dimension(:)::ivar
allocate(ivar(5))
!$omp parallel do default(none) firstprivate(ivar) shared(itot)
...
!$omp end parallel do
deallocate(ivar)
So the workaround is firstprivate instead of private for allocatable variables. Without firstprivate, ivar does not get allocated by ifort. If my understanding of the standard is correct, firstprivate should not be required if I am not interested in the contents of ivar when entering the parallel do.
I hope Intel will correct this!
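Another way to sidestep the question of what allocation status a private allocatable inherits is to allocate the scratch array inside the parallel region, so that each thread allocates its own copy explicitly. The sketch below is an alternative not proposed in this thread, and it has not been verified against the affected ifort builds; the program name test_alloc_inside is made up for this example.

```fortran
program test_alloc_inside
  implicit none
  integer, allocatable, dimension(:) :: ivar
  integer :: i, itot

  itot = 0
  ! ivar is deliberately NOT allocated before the region, so every
  ! private copy starts out "not currently allocated".
  !$omp parallel default(none) private(ivar) shared(itot)
  allocate(ivar(5))         ! each thread allocates its own scratch array
  !$omp do
  do i = 1, 5
     ivar(i) = i
     !$omp atomic
     itot = itot + ivar(i)
  end do
  !$omp end do
  deallocate(ivar)          ! each thread frees its own copy
  !$omp end parallel

  write(6,*) itot           ! 1+2+3+4+5 = 15
end program test_alloc_inside
```

The cost is one allocate/deallocate pair per thread per region, which is usually negligible next to the loop body.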
Your description of firstprivate is not correct.
[fortran]program test
use omp_lib
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot, k
k=123
allocate(ivar(5))
ivar = -1
write(*,*) loc(ivar(1)), loc(ivar), loc(itot), loc(k)
itot=0
!$omp parallel do default(none) firstprivate(ivar, k) shared(itot)
do i=1,5
!$omp critical
if(allocated(ivar)) then
write(*,*) "allocated"
write(*,*) omp_get_thread_num(), ivar(1)
ivar(1) = omp_get_thread_num()
else
write(*,*) "not allocated"
write(*,*) omp_get_thread_num(), loc(ivar), loc(itot), loc(k)
endif
!$omp end critical
if(allocated(ivar)) then
! *** ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
endif
enddo
!$omp end parallel do
if(allocated(ivar)) deallocate(ivar)
write(6,*) itot
end[/fortran]
Output:
3635664 3635664 1244672 1244668
allocated
0 -1
allocated
0 0
allocated
1 0
allocated
0 1
allocated
1 0
You likely have separate array descriptors pointing to the same memory.
Following the lines after "allocated":
The first line (from thread 0) shows the copied-in value of -1.
The second line (from thread 0) shows the rewritten value, the thread number.
The third line (from thread 1) shows the value written by thread 0, not the -1 that a separate copy made from outside (copy-in) would hold.
Jim Dempsey
(a) PRIVATE Clause (Section 2.9.3.3, p. 90):
A new list item of the same type is allocated once for each implicit
task in the parallel region, or for each task generated by a task
construct, if the construct references the list item in any statement.
The initial value of the new list item is undefined. Within a parallel,
worksharing, or task region, the initial status of a private pointer is
undefined.
For a list item with the ALLOCATABLE attribute:
- if the list item is "not currently allocated", the new list item will have an initial state of "not currently allocated";
- if the list item is allocated, the new list item will have an initial state of allocated with the same array bounds.
(b) FIRSTPRIVATE Clause (Section 2.9.3.4, p. 92):
A list item that appears in a FIRSTPRIVATE clause is subject to the
private clause semantics described in Section 2.9.3.3 on page 89. In
addition, the new list item is initialized from the original list item
existing before the construct.
program test
use omp_lib
implicit none
integer,allocatable,dimension(:)::ivar
integer i,itot
allocate(ivar(6))
ivar(:)=-1
itot=0
!$omp parallel do default(none) firstprivate(ivar) shared(itot)
do i=1,5
if(allocated(ivar))then
write(6,*) 'allocated ',omp_get_thread_num(),ivar(6)
ivar(6)=omp_get_thread_num()
ivar(i)=i
!$omp atomic
itot=itot+ivar(i)
else
write(6,*) 'not allocated'
endif
enddo
!$omp end parallel do
deallocate(ivar)
write(6,*) itot
end
output from ifort:
allocated 0 -1
allocated 0 0
allocated 0 0
allocated 1 0
allocated 1 1
output from other compilers:
allocated 0 -1
allocated 0 0
allocated 0 0
allocated 1 -1
allocated 1 1
If you change the allocation from dynamic to static, the output from ifort becomes correct.
I keep thinking that ifort is buggy and/or not specification-compliant with respect to allocatable arrays.
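The PRIVATE wording quoted above (an allocated list item yields a private copy that is allocated with the same array bounds) can be probed directly with a small test. The sketch below is an assumption-laden illustration rather than an official conformance test: the program name test_private_bounds and the 0:4 bounds are chosen only for this example.

```fortran
program test_private_bounds
  implicit none
  integer, allocatable, dimension(:) :: ivar

  ! A non-default lower bound makes the "same array bounds" check meaningful.
  allocate(ivar(0:4))

  !$omp parallel default(none) private(ivar)
  !$omp critical
  ! The critical section only keeps the output of different threads
  ! from interleaving.
  if (allocated(ivar)) then
     write(6,*) 'allocated, bounds:', lbound(ivar, 1), ubound(ivar, 1)
  else
     write(6,*) 'not allocated'
  end if
  !$omp end critical
  !$omp end parallel

  deallocate(ivar)
end program test_private_bounds
```

Under the quoted text, every thread should report the array as allocated with bounds 0 and 4; a "not allocated" line would match the behavior reported for ifort in this thread.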
Piergiorgio,
Perhaps you have an older version of ifort. I did not see any indication of your specific version in the thread. If your version is older than the fixed versions noted below, could you upgrade your 11.1 compiler to at least the last 11.1 update?
It appears there is a defect in ifort, but it also appears that it has been fixed in the last 11.1 update (11.1.073 - 11.1 Update 7) and in our current Fortran Compiler XE 2011 release.
The program cited in Piergiorgio's previous post produces the incorrect results noted with 11.1 (Intel 64) Linux compilers beginning with Version 11.1 Build 20100414 Package ID: l_cprof_p_11.1.072 and going back as far as Version 11.1 Build 20091130 Package ID: l_cprof_p_11.1.064.
The program produces correct results with the following compilers:
Intel Fortran Intel 64 Compiler XE for applications running on Intel 64, Version 12.0.0.084 Build 20101006(l_fcompxe_2011.0.084)
Intel Fortran Intel 64 Compiler Professional for applications running on Intel 64, Version 11.1 Build 20100806 Package ID: l_cprof_p_11.1.073 (a.k.a. 11.1 Update 7)
Confirmed incorrect results:
$ ifort -V -openmp u78716.f90
Intel Fortran Intel 64 Compiler Professional for applications running on Intel 64, Version 11.1 Build 20100414 Package ID: l_cprof_p_11.1.072
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
Intel Fortran 11.1-2739
GNU ld version 2.17.50.0.6-5.el5 20061020
$ export OMP_NUM_THREADS=2
$ OMP_NUM_THREADS=2
$ ./a.out
allocated 0 -1
allocated 0 0
allocated 0 0
allocated 1 0
allocated 1 1
15
Correct results:
$ ifort -V -openmp u78716.f90
Intel Fortran Intel 64 Compiler XE for applications running on Intel 64, Version 12.0.0.084 Build 20101006
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
Intel Fortran 12.0-1176
GNU ld version 2.17.50.0.6-5.el5 20061020
$ export OMP_NUM_THREADS=2
$ OMP_NUM_THREADS=2
$ ./a.out
allocated 0 -1
allocated 0 0
allocated 0 0
allocated 1 -1
allocated 1 1
15
$ ifort -V -openmp u78716.f90
Intel Fortran Intel 64 Compiler Professional for applications running on Intel 64, Version 11.1 Build 20100806 Package ID: l_cprof_p_11.1.073
Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
Intel Fortran 11.1-2755
GNU ld version 2.17.50.0.6-5.el5 20061020
$ export OMP_NUM_THREADS=2
$ OMP_NUM_THREADS=2
$ ./a.out
allocated 0 -1
allocated 0 0
allocated 0 0
allocated 1 -1
allocated 1 1
15