Intel® Fortran Compiler

OMP allocatable Data Race

hentall_maccuish__ja

Hello,

Intel Inspector detects a data race while writing to different elements of a derived-type array with allocatable components. I'm guessing that there is something wrong with what I am doing, maybe related to the unspecified size of the components, but it is hard to work out what the design should be when I'm only guessing at what might be problematic.

I have a simple reproducing example below. The data race is detected on the line modelObjects(typesim)%policy = modelObjectsLoc%policy. Naively, I wouldn't expect this to be a data race, as typesim is thread-private at this level, so each thread is writing to a different element of modelObjects(:). Is this a data race? And if so, what design should I be using?

I know that in the simple example I could avoid using allocatable components, but in the actual program the size of modelObjectsLoc%policy(ixa)%coo(:) is determined inside the inner loop, and the size of modelObjectsLoc%policy(:) varies between steps of the outer loop. What I need to know is how to avoid data races while each thread in a team writes variable-sized objects to different parts of a larger shared object.

Thanks,

    program Console1

    implicit none

    type sparseCOOType
        integer (kind=1) :: col
        integer (kind=1) :: row
        real (kind=4):: val
    end type sparseCOOType

    type policyType
        type(sparseCOOType), allocatable :: COO(:)
    end type policyType

    type modelObjectsType
        type(policyType), allocatable :: policy(:)
    end type modelObjectsType

    type (modelObjectsType) :: modelObjects(4)
    type (modelObjectsType) :: modelObjectsloc

    integer :: ixa, typesim
#IFDEF _OPENMP
    call omp_set_max_active_levels( 2 )
#ENDIF
    !$OMP PARALLEL DEFAULT( NONE ), SHARED(modelObjects ), PRIVATE(modelObjectsLoc, typeSim ), NUM_THREADS(4)
    !$OMP DO
    do typeSim =  1, 4

        allocate(modelObjectsLoc%policy(100))

        !$OMP PARALLEL DEFAULT( NONE ), SHARED(modelObjectsloc, typeSim ), PRIVATE(ixA), NUM_THREADS(8)
        !$OMP DO SCHEDULE(DYNAMIC)
        do ixA = 1, 100
            allocate(modelObjectsLoc%policy(ixa)%coo(5))
            modelObjectsLoc%policy(ixa)%coo(5)%row = 1
            modelObjectsLoc%policy(ixa)%coo(5)%col = 1
            modelObjectsLoc%policy(ixa)%coo(5)%val = 1.0
        end do
        !$OMP END DO NOWAIT
        !$OMP END PARALLEL

        allocate(modelObjects(typesim)%policy(100))
        modelObjects(typesim)%policy = modelObjectsLoc%policy
        deallocate(modelObjectsLoc%policy)

    end do
    !$OMP END DO NOWAIT
    !$OMP END PARALLEL

    end program Console1

 

Ulrich_M_

I agree with you, there shouldn't be a data race.

Having said that, in my experience it doesn't take much to trigger problems in OMP sections of otherwise fine code. For instance, parameterized derived types don't work correctly, nor do BLOCK constructs (or, more precisely, I ran into issues with them, which might or might not be fixed by now).

For that reason, I try to keep everything in an OMP section as simple as possible. Do you still have a data race if you work on modelObjects directly, rather than working on modelObjectsLoc and then copying the results as you currently do?

hentall_maccuish__ja

Thanks. This was my original approach in the program that this example is reduced from. I switched to modelObjectsLoc to try to sort out another data race I didn't understand, but I have now gone back to working directly on modelObjects in both this example program (updated code below) and the main program, and I no longer get a data race in either. A lot has changed in the main program, and since I can't reproduce the old data races it's probably best to forget about them, but I would still like to understand why I get a data race here with the local version, modelObjectsLoc.

    program Console1

    implicit none

    type sparseCOOType
        integer (kind=1) :: col
        integer (kind=1) :: row
        real (kind=4):: val
    end type sparseCOOType

    type policyType
        type(sparseCOOType), allocatable :: COO(:)
    end type policyType

    type modelObjectsType
        type(policyType), allocatable :: policy(:)
    end type modelObjectsType

    type (modelObjectsType) :: modelObjects(4)
    type (modelObjectsType) :: modelObjectsloc

    integer :: ixa, typesim
#IFDEF _OPENMP
    call omp_set_max_active_levels( 2 )
    !call OMP_set_nested(.true.)
#ENDIF
    !$OMP PARALLEL DEFAULT( NONE ), SHARED(modelObjects ), PRIVATE( typeSim ), NUM_THREADS(2)
    !$OMP DO
    do typeSim =  1, 4
        !write (*,*) typesim
        !allocate(modelObjectsLoc%policy(100))
        allocate(modelObjects(typesim)%policy(1000))

        !$OMP PARALLEL DEFAULT( NONE ), SHARED(modelObjects, typeSim ), PRIVATE(ixA), NUM_THREADS(4)
        !$OMP DO SCHEDULE(DYNAMIC)
        do ixA = 1, 1000
            allocate(modelObjects(typesim)%policy(ixa)%coo(5))
            modelObjects(typesim)%policy(ixa)%coo(5)%row = 1
            modelObjects(typesim)%policy(ixa)%coo(5)%col = 1
            modelObjects(typesim)%policy(ixa)%coo(5)%val = 1.0
        end do
        !$OMP END DO NOWAIT
        !$OMP END PARALLEL
        !write (*,*) typesim
        
        !modelObjects(typesim)%policy = modelObjectsLoc%policy
        !deallocate(modelObjectsLoc%policy)

    end do
    !$OMP END DO NOWAIT
    !$OMP END PARALLEL

    end program Console1
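
For the local-buffer variant from the first post, one possible variation (not proposed or tested by anyone in this thread) is to hand the finished local array over with the MOVE_ALLOC intrinsic instead of allocating the target and copying into it. The sketch below assumes a single parallel level for brevity; the module name move_alloc_demo, the subroutine build_all, and the local buffer policyLoc are hypothetical. Whether Intel Inspector still reports a race with this pattern has not been checked.

```fortran
module move_alloc_demo
    implicit none

    type sparseCOOType
        integer(kind=1) :: col
        integer(kind=1) :: row
        real(kind=4)    :: val
    end type sparseCOOType

    type policyType
        type(sparseCOOType), allocatable :: coo(:)
    end type policyType

    type modelObjectsType
        type(policyType), allocatable :: policy(:)
    end type modelObjectsType

contains

    ! Hypothetical helper: each thread builds a private buffer, then transfers it.
    subroutine build_all(modelObjects, npolicy, ncoo)
        type(modelObjectsType), intent(inout) :: modelObjects(:)
        integer, intent(in) :: npolicy, ncoo
        type(policyType), allocatable :: policyLoc(:)
        integer :: typeSim, ixA

        !$omp parallel do default(none) shared(modelObjects, npolicy, ncoo) &
        !$omp private(typeSim, ixA, policyLoc)
        do typeSim = 1, size(modelObjects)
            allocate(policyLoc(npolicy))
            do ixA = 1, npolicy
                allocate(policyLoc(ixA)%coo(ncoo))
                policyLoc(ixA)%coo%row = 1
                policyLoc(ixA)%coo%col = 1
                policyLoc(ixA)%coo%val = 1.0
            end do
            ! Transfer the allocation; policyLoc is unallocated afterwards,
            ! so no element-by-element copy and no explicit deallocate is needed.
            call move_alloc(from=policyLoc, to=modelObjects(typeSim)%policy)
        end do
        !$omp end parallel do
    end subroutine build_all

end module move_alloc_demo

program move_alloc_main
    use move_alloc_demo
    implicit none
    type(modelObjectsType) :: modelObjects(4)

    call build_all(modelObjects, 100, 5)
    print *, 'policies per object:', size(modelObjects(1)%policy)
    print *, 'sample val:', modelObjects(4)%policy(100)%coo(5)%val
end program move_alloc_main
```

The design choice here is only that each thread touches its private buffer and exactly one element of the shared array; the inner loop could be put back into a nested parallel region as in the original example.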

 

jimdempseyatthecove

What happens when you remove the NOWAIT?

In other words, test whether the compiler applied the NOWAIT to the END PARALLEL of the nested region, in which case the assignment

modelObjects(typesim)%policy = modelObjectsLoc%policy

could experience a race condition.

Additional note: compiler optimization is now quite smart at removing code whose results are never used. You have to construct your test code carefully so that it actually does what it appears to do.
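
One way to act on that note, sketched here under assumptions, is to have the test program consume what it computes, for example by printing a checksum, so that the stores cannot be discarded as unused. The module checksum_demo and the procedures fill_policies and checksum are hypothetical names, simplified from the reproducer in this thread to a single parallel level.

```fortran
module checksum_demo
    implicit none

    type sparseCOOType
        integer(kind=1) :: col
        integer(kind=1) :: row
        real(kind=4)    :: val
    end type sparseCOOType

    type policyType
        type(sparseCOOType), allocatable :: coo(:)
    end type policyType

contains

    ! Fill every policy(i)%coo in parallel; each iteration touches only its own element.
    subroutine fill_policies(policy, ncoo)
        type(policyType), intent(inout) :: policy(:)
        integer, intent(in) :: ncoo
        integer :: i, j

        !$omp parallel do default(none) shared(policy, ncoo) private(i, j) schedule(dynamic)
        do i = 1, size(policy)
            allocate(policy(i)%coo(ncoo))
            do j = 1, ncoo
                policy(i)%coo(j)%row = 1
                policy(i)%coo(j)%col = 1
                policy(i)%coo(j)%val = 1.0
            end do
        end do
        !$omp end parallel do
    end subroutine fill_policies

    ! Serial sum over all val entries, so the result depends on every store above.
    function checksum(policy) result(total)
        type(policyType), intent(in) :: policy(:)
        double precision :: total
        integer :: i

        total = 0.0d0
        do i = 1, size(policy)
            total = total + sum(dble(policy(i)%coo%val))
        end do
    end function checksum

end module checksum_demo

program checksum_main
    use checksum_demo
    implicit none
    type(policyType), allocatable :: policy(:)

    allocate(policy(100))
    call fill_policies(policy, 5)
    ! Printing the checksum makes the computed values observable.
    print *, 'checksum =', checksum(policy)
end program checksum_main
```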

Jim Dempsey

hentall_maccuish__ja

Hello Jim,

But why should the NOWAIT generate a data race? Even if one thread moves on to its next assigned value of the index typesim, that value will still be different from the ones the other threads have, and so:

modelObjects(typesim)%policy = modelObjectsLoc%policy

should be writing to different parts of the memory, shouldn't it?

Thanks, 

jimdempseyatthecove

Your code sample contained

!$OMP END DO NOWAIT
!$OMP END PARALLEL

The above ought to have an implicit barrier at the END PARALLEL.

*** However, because no code follows the END DO NOWAIT, the idea behind the test (removing the NOWAIT) was this: if the compiler had applied the NOWAIT to the PARALLEL region, you would then have a situation where

modelObjects(typesim)%policy = modelObjectsLoc%policy

could run concurrently with threads in the parallel region that have not finished modifying their slice of the modelObjectsLoc%policy array (while you are copying the entire array).

So, did you try the test?

Alternatively, as silly as this looks, you could try

!$OMP END DO NOWAIT
!$OMP BARRIER
!$OMP END PARALLEL

*** Note: if and only if that fixes the race condition, it would indicate that the NOWAIT is causing threads not to wait at the implicit barrier of END PARALLEL (which would be a compiler bug).

Sometimes one must assert that the compiler is doing what it is assumed to be doing.
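
A minimal sketch of such an assertion, under assumptions: each thread of a team marks its iterations in a shared array, and the encountering thread checks the array immediately after END PARALLEL. By the OpenMP rules, the implied barrier at the end of a parallel region is not removed by a NOWAIT on the enclosed DO, so the check should never fail; a failure would point to the kind of compiler or runtime problem described above. The names barrier_check and marks_complete are hypothetical, and the sketch uses a single parallel level rather than the nested regions of the original example.

```fortran
module barrier_check
    implicit none
contains

    ! Returns .true. if all n marks are present right after END PARALLEL.
    function marks_complete(n) result(ok)
        integer, intent(in) :: n
        logical :: ok
        integer :: done(n)
        integer :: i

        done = 0
        !$omp parallel default(none) shared(done, n) private(i) num_threads(4)
        !$omp do schedule(dynamic)
        do i = 1, n
            done(i) = 1
        end do
        !$omp end do nowait
        !$omp end parallel
        ! The implied barrier of END PARALLEL should make every store visible here.
        ok = all(done == 1)
    end function marks_complete

end module barrier_check

program barrier_main
    use barrier_check
    implicit none
    integer :: trial

    ! Repeat the check, since a missing barrier would only show up intermittently.
    do trial = 1, 1000
        if (.not. marks_complete(1000)) error stop 'mark missing after END PARALLEL'
    end do
    print *, 'all marks present after END PARALLEL in every trial'
end program barrier_main
```

A passing run does not prove the barrier is always honored, but a single failure would be strong evidence that it is not.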

Jim Dempsey
