integer :: p, n, i
integer, parameter :: maxnth=100
integer, volatile :: prog(0:maxnth-1)
!$omp parallel private(p) shared(prog) num_threads(nth)
p = omp_get_thread_num()
do while(prog(p) .ge. prog(p+1))   ! spin until prog(p+1) exceeds prog(p)
end do
prog(p) = prog(p)+1
!$omp end parallel
end subroutine pca[/fortran]
Assume this is part of a bigger code (i.e., there is much more code in the parallel region) and that the "do while" loop is used to synchronize all threads (no, I don't want to use a barrier :-) ). Is it possible that prog(p+1) is kept inside a register and its update is not seen by thread p? Because of the "volatile" I would say no, and in fact the code appears to be working. Can anybody confirm?
The question may seem stupid, but if prog is declared with size nth instead of maxnth, the code does not work anymore. Can anybody explain to me the difference between these two cases? I guess the difference lies in the fact that nth is an argument of the routine, but this still does not help me understand.
You can't expect one thread to see updates from other threads while still in the same parallel region; I doubt the volatile will affect that, so you have a race condition (the result from the neighboring thread could be from before or after it updates). It's possible that the false sharing situation you have created causes enough delay that the updates occur before your program gets there, but you would need a critical region to set any expectations. It seems strange that changing the array to automatic before you enter the parallel region would cause threads to have their own copies of a shared array, but all bets are off when you have an incorrect program. It reminds me of reported problems with private automatic arrays, but that's a different situation.
Thanks for your answer. Maybe I don't fully understand what you wrote, but I believe you're wrong on a couple of points.
- "You can't expect one thread to see updates from other threads while still in the same parallel region" : one thread can obviously see updates done by another thread on shared data (as the "prog" array is) in the parallel region. This is the basis of shared-memory parallelism.
- there is no race condition in the code: note that it is not possible that two threads update the prog array at the same time; each thread p spins inside the do-while loop until thread p+1 updates prog. Or maybe you are talking about the fact that the read on line 12 and the write on line 16 may happen at the same time? That doesn't really represent a problem because the expected behavior of the code is still preserved.
- critical sections do not make the code any better.
You are right when you say that the code is wrong. In fact it is wrong according to the OpenMP 3.0 and 3.1 standards, which completely ignore the volatile attribute. The code would instead be correct under the OpenMP 2.5 standard, which is consistent (as far as my understanding goes) with the language specification of the volatile attribute.
The difference between the automatic array case and the static array one is, maybe, that the compiler does different optimizations. In both cases the resulting behavior is "legal" according to the OpenMP standard but in one case it is different from what I expected. Note that with the GNU, IBM and CRAY compilers the code behaves as I expected in both cases.
The code becomes correct if !$omp flush directives are inserted in the while loop and after the write on prog but I hoped I could avoid this because flushes are expensive system calls whereas I wanted to have a fast synchronization between threads.
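For reference, the flush-based variant described above could look roughly like the following. This is only a minimal sketch, not the original routine: the program name, the fixed `nth = 4`, and the `nteam` bookkeeping variable are assumptions added so the example is self-contained, and the guard uses the actual team size in case the runtime delivers fewer threads than requested. It assumes an OpenMP-enabled compile (for example `-fopenmp` with gfortran). OpenMP 3.1 and later also offer `!$omp atomic read` and `!$omp atomic write`, which may be the safer way to access `prog`.

```fortran
program spin_wait_sketch
  use omp_lib
  implicit none
  integer, parameter :: nth = 4        ! hypothetical requested thread count
  integer, volatile  :: prog(0:nth-1)  ! one progress counter per thread
  integer :: p, n, nteam

  prog  = -1
  nteam = 0
  !$omp parallel private(p, n) shared(prog, nteam) num_threads(nth)
  p = omp_get_thread_num()
  n = omp_get_num_threads()            ! actual team size, may be smaller than nth
  if (p == 0) nteam = n                ! written by one thread only
  if (p /= n-1) then
     ! wait until the neighbour p+1 has advanced past this thread
     do while (prog(p) >= prog(p+1))
        !$omp flush
     end do
  end if
  prog(p) = prog(p) + 1
  !$omp flush                          ! make the update visible to thread p-1
  !$omp end parallel

  print *, 'team size:', nteam, ' prog:', prog
end program spin_wait_sketch
```

The last thread of the team never waits, so the chain of updates runs from the highest thread number down to thread 0.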
The same technique works like a charm if written with pthreads instead of OpenMP.
>>if prog is instead declared of size nth instead of maxnth, the code does not work anymore
Consider:
integer, volatile :: prog(0:nth-1) ! was maxnth-1
...
do while(prog(p) .ge. prog(p+1))
Then for the last thread (assuming nth == # threads), p+1 is beyond the end of the array. (i.e. will observe uninitialized/junk data instead of the initialized value of -1)
TimP - writes to an array of volatiles should drill through to memory, always. The compiler also should not reorder reads or writes to volatiles with respect to reads and writes to other volatiles. This is the whole point of volatiles (always read from or write to memory/IO bus here and now). Therefore the reads and writes from/to volatiles will be visible to other threads, not only in the parallel region but outside as well (assuming they have access to the memory address).
>>writes to an array of volatiles should drill through to memory, always.
Let me qualify this:
When the OpenMP directive declares (implicitly or explicitly) that the volatile variable/array is SHARED (and no private copy is made therefrom), then the above holds true.
On the other hand, when the variable/array is PRIVATE (a private copy is made therefrom), the above need not be true, except when the PRIVATE copy is a pointer/reference to a location that is itself attributed as volatile, in which case the R/W drills through to the memory/IO bus.
VOLATILE initially came in to the programming language to resolve compiler optimization issues with respect to reading and writing to control registers/IO ports. The original purpose should not be violated by redefinition.
The do-while loop is inside an if(p.ne.nth-1) statement, therefore all the threads executing the loop (and thus checking the condition) have 0 <= p <= nth-2. The index (p+1) always falls within the bounds of the prog array when prog is declared of size (0:nth-1).
As for what you tell TimP, this is exactly what I thought, but reading the OpenMP standard it does not appear to be true.
I started a discussion on the OpenMP forum on this topic, but it is not leading any further for the moment. Here's the link in case you want to check:
From the oldest reference book I have on my shelf:
Borland Turbo C Version 2.0 Copyright (c) 1988
"The volatile modifier, also defined by the ANSI standard, is almost the opposite of const. It indicates that the object may be modified; not only by you, but also by something outside of your program, such as an interrupt routine or an I/O port. Declaring an object to be volatile warns the compiler not to make assumptions concerning the value of the object while evaluating expressions containing it, since the value could (in theory) change at any moment. It also prevents the compiler from making the variable a register variable."
Granted this is C, not FORTRAN. I cannot conceive that the two ANSI standards committees would create conflicting definitions for volatile.
From Intel's man_for_lin.pdf
"Specifies that the value of an object is entirely unpredictable, based on information local to the current program unit. It prevents objects from being optimized during compilation."
The Intel definition seems rather relaxed and incomplete compared to the Borland definition.
From openWatcom f77lr.pdf
"The VOLATILE statement is used to indicate that a variable or an element of an array may be updated concurrently by other code. A volatile variable or array element will not be cached (in a register) by the code generator. Each time a volatile variable or array element is updated, it is stored back into memory."
*** However this document also goes on to state in Note 2
"Dummy arguments, procedure names, and common block names are not permitted in a VOLATILE statement."
I take issue with the exclusion of "Dummy arguments", as this would rule out passing a reference to a volatile variable and using it as volatile in the callee. I can accept this exclusion for a dummy argument passed by value; however, I can also see a valid purpose for attributing such a value with volatile, e.g. calling a subroutine with a volatile initial value to be used as a shared variable in a parallel region instantiated within the called subroutine.
From IBM XL Fortran V11.1
"The VOLATILE attribute is used to designate a data object as being mapped to memory that can be accessed by independent input/output processes and independent, asynchronously interrupting processes. Code that manipulates volatile data objects is not optimized."... "If an array name is declared volatile, each element of the array is considered volatile. ..." "If a derived type name is declared volatile, all variables declared within that type are considered volatile." "If an object of derived type is declared volatile, all of its components are considered volatile." "If a component of a derived type is itself derived, the component does not inherit the volatile attribute from its type." "A derived type name that is declared volatile must have had the VOLATILE attribute prior to any use of the type name in a type declaration statement." "If a pointer is declared volatile, the storage of the pointer itself is considered volatile. The VOLATILE attribute has no effect on any associated pointer targets." ... "Any data object that is shared across threads and is stored and read by multiple threads must be declared as VOLATILE. If, however, your program uses the automatic or directive-based parallelization facilities of the compiler, variables that have the SHARED attribute need not be declared VOLATILE."
*** I think this answers your question ***
"If the actual argument associated with a dummy argument is a variable that is declared volatile, you must declare the dummy argument to be considered volatile." "If a dummy argument is declared volatile, and you require the associated actual argument to be considered volatile, you must declare the actual argument as volatile"
*** This answers my objections to statements on this forum regarding dummy arguments and volatile
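To make the IBM rule about dummy arguments concrete, here is a small sketch in which both the actual argument and the dummy argument carry the VOLATILE attribute. The module, routine, and variable names (`volatile_dummy_sketch`, `bump`, `shared_counter`) are made up for illustration. The routine sits in a module because, as far as I know, Fortran 2003 requires an explicit interface when a dummy argument is VOLATILE.

```fortran
module volatile_dummy_sketch
  implicit none
contains
  ! The dummy argument is declared VOLATILE so that every reference to it
  ! inside this routine is treated as volatile, matching the actual argument.
  subroutine bump(counter)
    integer, volatile :: counter
    counter = counter + 1
  end subroutine bump
end module volatile_dummy_sketch

program volatile_dummy_demo
  use volatile_dummy_sketch
  implicit none
  integer, volatile :: shared_counter   ! the actual argument is volatile too

  shared_counter = 0
  call bump(shared_counter)
  print *, shared_counter
end program volatile_dummy_demo
```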
They also have a section on using -qxlf2003=volatile:
"Using -qxlf2003=volatile
If an actual argument is an array section or an assumed-shape array, and the corresponding dummy argument has the VOLATILE attribute, that dummy argument shall be an assumed-shape array.
If an actual argument is a pointer array, and the corresponding dummy argument has the VOLATILE attribute, that dummy argument shall be an assumed-shape array or a pointer array.
If the actual argument is an array section having a vector subscript, the dummy argument is not definable and shall not have the VOLATILE attribute.
Host associated entities are known by the same name and have the same attributes as in the host, except that an accessed entity may have the VOLATILE attribute even if the host entity does not.
In an internal or module procedure, if a variable that is accessible via host association is specified in a VOLATILE statement, that host variable is given the VOLATILE attribute in the local scope.
A use associated entity may have the VOLATILE attribute in the local scoping unit even if the associated module entity does not. "
*** So excepting for vector subscripts, volatile is definable
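The host association rule quoted above (a VOLATILE statement in an internal procedure naming a host variable) can be illustrated with a short sketch. The names `host_volatile_sketch`, `poll`, and `flag` are hypothetical, and this assumes a compiler that implements the Fortran 2003 behaviour.

```fortran
program host_volatile_sketch
  implicit none
  integer :: flag            ! not volatile in the host scope

  flag = 0
  call poll()
  print *, flag
contains
  subroutine poll()
    ! Naming the host variable in a VOLATILE statement gives it the
    ! VOLATILE attribute inside this internal procedure only.
    volatile :: flag
    flag = flag + 1
  end subroutine poll
end program host_volatile_sketch
```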