Offload compilation problem with -openmp option.

Zoltan_P_
Beginner
688 Views

Hi all!

I have problems using OpenMP and offload directives. The following (reduced) code gives the right result (1  2  3  4  5  0  0  0  0  0) when it is compiled without OpenMP ("ifort test.f -o test"), and the wrong result (1  2  3  4  5  6  7  8  9 10) when it is compiled with OpenMP ("ifort -openmp test.f -o test").

 

      PROGRAM test
      integer c(10)

      c=0

!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & nocopy(c: length(10) alloc_if(.true.) free_if(.false.))

!DIR$ OFFLOAD begin target(mic:0) nocopy(c)
      do i=1,10
      c(i)=i
      enddo
!DIR$ end OFFLOAD

!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & out(c(1:5): alloc_if(.false.) free_if(.false.) into(c(1:5)))

!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & nocopy(c: alloc_if(.false.) free_if(.true.))

      WRITE(*,'(10(1X,I2))'), c

      END PROGRAM

 

"OFFLOAD_REPORT" output when the result is wrong (partial):

[Offload] [MIC 0] [File]            test.f
[Offload] [MIC 0] [Line]            9
[Offload] [MIC 0] [Tag]             Tag 1
[Offload] [HOST]  [Tag 1] [State]   Start Offload
[Offload] [HOST]  [Tag 1] [State]   Initialize function __offload_entry_test_f_9MAIN__ifort1104196052Lk2WXB
[Offload] [HOST]  [Tag 1] [State]   Send pointer data
[Offload] [HOST]  [Tag 1] [State]   CPU->MIC pointer data 0
[Offload] [HOST]  [Tag 1] [State]   Gather copyin data
[Offload] [HOST]  [Tag 1] [State]   CPU->MIC copyin data 0 
[Offload] [HOST]  [Tag 1] [State]   Compute task on MIC
[Offload] [HOST]  [Tag 1] [State]   Receive pointer data
[Offload] [HOST]  [Tag 1] [State]   MIC->CPU pointer data 0
[Offload] [MIC 0] [Tag 1] [State]   Start target function __offload_entry_test_f_9MAIN__ifort1104196052Lk2WXB
[Offload] [HOST]  [Tag 1] [State]   Scatter copyout data
[Offload] [HOST]  [Tag 1] [CPU Time]        0.001025(seconds)
[Offload] [MIC 0] [Tag 1] [CPU->MIC Data]   0 (bytes)
[Offload] [MIC 0] [Tag 1] [MIC Time]        0.000210(seconds)
[Offload] [MIC 0] [Tag 1] [MIC->CPU Data]   40 (bytes)

 

The offload region at line 9 transfers data even though nocopy is set...

(Composer XE 2013 SP1.2.144; ifort version 14.0.2)

(This is reduced code that reproduces the error. My full code contains OpenMP directives.)

6 Replies
jimdempseyatthecove
Honored Contributor III

Can you include the OpenMP directives in your simple example?

In particular, is your parallel region between lines 5 and 20 or within the 2nd offload as a parallel do?

Jim Dempsey

Kevin_D_Intel
Employee
Thank you for the convenient reproducer; I confirmed the failure. This appears to be fixed in the next major release, planned for later this year (with the beta starting soon); however, it still fails with the upcoming update (later this month) to the CXE 2013 SP1 you currently have. I am also unable to find a workaround, but I will ask our Developers whether one exists.
Kevin_D_Intel
Employee

I submitted the issue to Development (see internal tracking id below) to determine whether a fix is possible for the CXE 2013 SP1 (14.0 compiler) release.

I do not know whether any of these is usable in your original code, but for the test case, using either an allocatable array, or adding SAVE, or using other means to force array C into static storage with -openmp avoids the incorrect results.

If you are interested, the Beta program has been announced here: Invitation to join the Intel® Software Development Tools 2015 Beta program

(Internal tracking id: DPD200255733)
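
For reference, the SAVE workaround described above can be tried on the original reproducer with a one-line change. This is only a sketch, assuming the same ifort 14.0 offload toolchain as in the original post; the program name test_save is made up for illustration, and the offload directives are copied unchanged from the reproducer.

```fortran
      PROGRAM test_save
      integer c(10)
!     Workaround under test: SAVE forces c into static storage
      save c

      c=0

!     Allocate c on the coprocessor without copying any data
!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & nocopy(c: length(10) alloc_if(.true.) free_if(.false.))

!     Fill all ten elements on the coprocessor; nocopy should keep
!     the host copy untouched
!DIR$ OFFLOAD begin target(mic:0) nocopy(c)
      do i=1,10
      c(i)=i
      enddo
!DIR$ end OFFLOAD

!     Copy only the first five elements back to the host
!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & out(c(1:5): alloc_if(.false.) free_if(.false.) into(c(1:5)))

!     Release the coprocessor copy of c
!DIR$ OFFLOAD_TRANSFER target(mic:0)
     & nocopy(c: alloc_if(.false.) free_if(.true.))

      WRITE(*,'(10(1X,I2))') c

      END PROGRAM
```

Compiled the same way as the failing case ("ifort -openmp test_save.f -o test_save"), the output can be compared against the expected 1  2  3  4  5  0  0  0  0  0 from the original post.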

Zoltan_P_
Beginner

Thanks for the replies!

I would like to use multiple MIC devices with OpenMP, so the MIC regions are inside an OpenMP region. Originally these were in the same subroutine, but after separating them, I could compile the MIC code without the -openmp compiler option. It works, but I would like to compile all the code with the same options, if possible.

jimdempseyatthecove
Honored Contributor III

Zoltan,

RE Kevin's: I do not know whether any of these is usable in your original code, but for the test case, using either an allocatable array, or adding SAVE, or using other means to force array C into static storage with -openmp avoids the incorrect results.

You can also use AUTOMATIC, an Intel-specific attribute. With -openmp, local arrays should (will) already be placed on the stack, so AUTOMATIC may be redundant as far as allocation goes, but it may have the side effect of working around the compiler bug. Note that for large single-instance arrays you can also use ALLOCATABLE, SAVE (which places the descriptor in the save area).

Be aware that when multiple host OpenMP threads enter an offload to the same MIC, this is somewhat equivalent to nested parallel regions. There is nothing wrong with doing this, provided each host-to-MIC entry is programmed to use a subset of the available threads on the MIC. Failure to do so may yield unacceptable results.

Jim Dempsey

Kevin_D_Intel
Employee

Zoltan - It sounds like you have been able to program around this issue. Our Developers confirmed there is a fix in the release scheduled for later this year and have asked whether a fix is required in a future update for the current Composer XE 2013 SP1 release.
Please let me know if your workaround is sustainable until our release later this year.
Thank you.
