All,
I'm trying to port some code from the Intel Fortran MIC directives to OpenMP 4, and it just hasn't been working for me. So I decided to step back and try out some of the examples from OpenMP themselves. I figure that if I can't understand and run those, I'm stuck anyway. From the examples (PDF) I decided to start with Example 49.5f, as it comes closest to looking like real-life code.
So, I transcribed it and added a couple extra routines (init and output):
[fortran]module utils
   implicit none
contains
   subroutine init(v1, v2, N)
      implicit none
      real, dimension(:) :: v1, v2
      integer :: N, i
      v1 = 2.0
      v2 = 4.0
   end subroutine init
   subroutine output(p, N)
      implicit none
      real, dimension(:) :: p
      integer :: N, i
      write (*,*) "p(1): ", p(1)
   end subroutine output
end module utils
module my_mult
   use utils
   implicit none
contains
   subroutine foo(p0,v1,v2,N)
      implicit none
      real, dimension(:) :: p0, v1, v2
      integer :: N, i
      call init(v1, v2, N)
      !$omp target data map(to: v1, v2) map(from: p0)
      call vec_mult(p0,v1,v2,N)
      !$omp end target data
      call output(p0, N)
   end subroutine foo
   subroutine vec_mult(p1,v3,v4,N)
      implicit none
      real, dimension(:) :: p1, v3, v4
      integer :: N, i
      !$omp target map(to: v3, v4) map(from: p1)
      !$omp parallel do
      do i = 1, N
         p1(i) = v3(i) * v4(i)
      end do
      !$omp end target
   end subroutine vec_mult
end module my_mult
program main
   use my_mult
   implicit none
   !integer, parameter :: N = 1024*1024*1024
   integer, parameter :: N = 1024*1024
   real, allocatable, dimension(:) :: p, v1, v2
   allocate( p(N), v1(N), v2(N) )
   call foo(p, v1, v2, N)
   deallocate( p, v1, v2 )
end program main[/fortran]
When I compile this without OpenMP it works okay, but when I add -openmp the program stalls and I have to Ctrl-C:
(1002) $ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 14.0.1.106 Build 20131008
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
(1003) $ ifort 49.5f.F90
(1004) $ ./a.out
p(1): 8.000000
(1005) $ ifort -openmp 49.5f.F90
(1006) $ ./a.out
[Offload] [MIC 0] [File] 49.5f.F90
[Offload] [MIC 0] [Line] 33
[Offload] [MIC 0] [Tag] Tag 0
[Offload] [HOST] [Tag 0] [CPU Time] 0.048183(seconds)
[Offload] [MIC 0] [Tag 0] [MIC Time] 0.000270(seconds)
[Offload] [MIC 0] [File] 49.5f.F90
[Offload] [MIC 0] [Line] 44
[Offload] [MIC 0] [Tag] Tag 1
Now, my guess is the first set of offload notifications are due to the target data. The second would be the target...and I guess it doesn't work?
I suppose my question now is: did I do something wrong? As I've never actually gotten OpenMP 4 + MIC to work, I don't have a baseline to work from.
Thanks,
Matt
From a quick first test I am seeing the same behavior. Let me investigate and reply again after learning more.
Glad to hear it's not just me!
And, I suppose, this is fair warning that many an OpenMP 4+MIC question might be incoming. (Preview: is there a way with OpenMP to have a single code that can run on the host or on the MIC given compiler options or preprocessor directives? I'm not sure there is... ETA: Ahhh...the if clause!)
As I wait for Kevin, here's another question. I thought I'd try out this simple code (based on Example 55.2f) on a machine that is Westmeres and nothing else. No GPUs, no MICs, just a plain ol' compute node:
[fortran]program test
   use omp_lib
   implicit none
   logical :: do_offload
   integer :: num_devices
   num_devices = omp_get_num_devices()
   write (*,*) 'num_devices: ', num_devices
   do_offload = num_devices > 0
   write (*,*) 'do_offload: ', do_offload
end program test[/fortran]
My thought was let's test if I can use something like the OMP if clause to ignore target statements on a non-MIC platform. However:
$ ifort test.F90
/gpfsm/dnb31/tdirs/pbs/slurm.434493.mathomp4/ifortYcM7wm.o: In function `MAIN__':
test.F90:(.text+0x3b): undefined reference to `omp_get_num_devices'
$ ifort -openmp test.F90
ifort: warning #10362: Environment configuration problem encountered. Please check for proper MPSS installation and environment setup.
x86_64-k1om-linux-ld: No such file or directory
Is this expected behaviour? Perhaps it's due to how Intel 14 is loaded on our cluster (via modules and it has some MIC stuff setup in the environment)?
Thanks,
Matt
Calling omp_get_num_devices() invokes the offload compilation, so the warning message about MPSS is expected.
With the Intel offload feature, __MIC__ is defined when the offload compilation occurs. I do not see an equivalent for OpenMP 4.0 offhand. Let me check on this. I'm also looking into your earlier inquiry about features for having single code for host/co-processor. I'm more familiar with our own offload than the new OpenMP 4.0 so hopefully you can bear with me.
Kevin,
No worries. I'm learning this myself. Your comment gave me a thought:
[fortran]#ifdef __INTEL_OFFLOAD
num_devices = omp_get_num_devices()
#else
num_devices = 0
#endif
write (*,*) 'num_devices: ', num_devices[/fortran]
Now I can have some control over that section with -no-offload:
(mic node) $ ifort -openmp test.F90
(mic node) $ ./a.out
num_devices: 1
do_offload: T
(no mic node) $ ./a.out
num_devices: 0
do_offload: F
(any node) $ ifort -openmp -no-offload test.F90
(mic node) $ ./a.out
num_devices: 0
do_offload: F
(no mic node) $ ./a.out
num_devices: 0
do_offload: F
It's not perfect, but it's a step in the right direction. You are just required to compile on a MIC-enabled node if you think you'll need offloading. If not, -no-offload could allow for more expansive compiling. Kind of what I have to do for CUDA as well, though it's probably time to overload all these preproc macros to __ACCEL__ or the like.
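To tie the two ideas together (the `__INTEL_OFFLOAD` guard and the OpenMP `if` clause), here is a minimal sketch of how the guarded device count might drive a `target` region. The program name `offload_if` and the arrays `a`, `b`, `c` are made up for illustration and are not from the thread. It assumes the file has a `.F90` suffix (or is built with `-fpp`) so the preprocessor runs, and that the compiler honors `if()` on `target`, which per the OpenMP 4.0 specification makes the region run on the host when the expression is false.

```fortran
program offload_if
   use omp_lib
   implicit none
   integer, parameter :: n = 1024      ! illustrative size, not from the thread
   real :: a(n), b(n), c(n)
   integer :: i, num_devices
   logical :: do_offload

   ! Same guard as above: only ask the runtime when offload is compiled in.
#ifdef __INTEL_OFFLOAD
   num_devices = omp_get_num_devices()
#else
   num_devices = 0
#endif
   do_offload = num_devices > 0

   a = 2.0
   b = 4.0

   ! When do_offload is false, the if clause asks for host execution.
   !$omp target if(do_offload) map(to: a, b) map(from: c)
   !$omp parallel do
   do i = 1, n
      c(i) = a(i) * b(i)
   end do
   !$omp end parallel do
   !$omp end target

   write (*,*) 'do_offload: ', do_offload
   write (*,*) 'c(1): ', c(1)
end program offload_if
```

Built with `ifort -openmp -no-offload`, the guard takes the `#else` branch and the loop should simply run on the host; whether a given compiler release handles the `if` clause correctly is something to verify rather than assume.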
For what it's worth, I wrote a series of small test cases (with no use of modules) using OpenMP 4 for target offload, both with separate target update directives for data transfer and with target map. Each case produces correct results in at least one of the two versions, but there are some incorrect results, including a case which shouldn't offload at all because of an unsatisfied if() on the omp target directive. I expected performance differences between the two approaches, but didn't see them.
This was on relatively new hardware, to which I will soon lose access. I haven't checked omp target on the older hardware (Westmere, KNC B0) to which I expect to retain access.
I'm also in a learning stage not knowing whether I made mistakes or why it doesn't work as I expected. I was intending to try C after Fortran; maybe I should go ahead when time permits.
Most discussions relating to MIC are undertaken on the MIC specific forum, but I haven't seen any discussions on this subject there.
My test cases with ifort omp target began to work correctly (but not efficiently) with the ifort 15.0 release. Intel C and C++ are still a problem for me.
I've run lots of tests of OpenMP 4 for host and MIC native, linux and windows, Intel and gnu compilers, examples at
https://github.com/tprince/lcd
and discussion at https://sites.google.com/site/tprincesite/parallel-optimization
>>>My thought was let's test if I can use something like the OMP if clause to ignore target statements on a non-MIC platform.
omp_is_initial_device() should function to have the host execute statements inside !$omp target when there is no attached device. Unfortunately it's not yet implemented in icc/ifort 15.0. I tested the following on a non-MIC platform.
$ cat get_host_tgt.f90
program hosttarget
use omp_lib, ONLY: omp_is_initial_device
implicit none
!$omp target
   if( omp_is_initial_device() ) then
      print *,' running on host without attached device'
   else
      print *,' running on device attached to host'
   endif
!$omp end target
end program hosttarget
$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0 Build 20141028
Copyright (C) 1985-2014 Intel Corporation. All rights reserved.
$ ifort -fopenmp get_host_tgt.f90
get_host_tgt.f90(2): error #6580: Name in only-list does not exist. [OMP_IS_INITIAL_DEVICE]
use omp_lib, ONLY: omp_is_initial_device
-------------------^
get_host_tgt.f90(6): error #6404: This name does not have a type, and must have an explicit type. [OMP_IS_INITIAL_DEVICE]
if( omp_is_initial_device() ) then
-------^
get_host_tgt.f90(6): error #6341: A logical data type is required in this context. [OMP_IS_INITIAL_DEVICE]
if( omp_is_initial_device() ) then
-------^
compilation aborted for get_host_tgt.f90 (code 1)
$
We need to catch up with gcc/gfortran here, and certainly this is needed for OpenMP 4.0 completeness, so I'll report this to the developers.
$ gfortran --version
GNU Fortran (GCC) 4.9.1
$ gfortran -fopenmp get_host_tgt.f90 && ./a.out
running on host without attached device
$
Patrick
>>> I'll report this to the developers
Internal tracking # DPD200362637
Patrick
Thank you for your note, Tim. I am delinquent in updating the thread regarding Matt's original case. My apologies, Matt. Matt's initial case now works with the newest IPS XE 2015 (15.0) initial release; we fixed the underlying issue only in this newer release.
omp_is_initial_device() is now implemented in the Composer XE 2015 update 2 compilers, so I am closing this thread now.
The following block will execute on the target device if ONTGT is defined for the compilation; otherwise it executes on the host:
#ifdef __MIC__
   num_thr = omp_get_num_threads()
   whatdev = omp_is_initial_device()
!$omp single
   print *,' Compiled with OFFLOAD compiler...'
   print *,' Running on DEVICE with',num_thr,' threads and...'
   print *,' ...omp_is_initial_device() is ',whatdev
!$omp end single
#else
   num_thr = omp_get_num_threads()
   whatdev = omp_is_initial_device()
!$omp single
   print *,' Compiled with OFFLOAD compiler...'
   print *,' Running on HOST with',num_thr,' threads and...'
   print *,' ...omp_is_initial_device() is ',whatdev
!$omp end single
#endif
[DPD200362637]$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.2.164 Build 20150121
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
[DPD200362637]$ ifort -qopenmp -fpp only_host-dcltgt.f90 -o only_host-dcltgt.f90-ifort.x
[DPD200362637]$ ./only_host-dcltgt.f90-ifort.x
Compiled with OFFLOAD compiler...
Running on HOST with 32 threads and...
...omp_is_initial_device() is T
[DPD200362637]$ ifort -qopenmp -fpp only_host-dcltgt.f90 -o only_host-dcltgt.f90-ifort.x -DONTGT
[DPD200362637]$ ./only_host-dcltgt.f90-ifort.x
Compiled with OFFLOAD compiler...
Running on DEVICE with 224 threads and...
...omp_is_initial_device() is F
[DPD200362637]$
Patrick
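Since only a fragment of the test is shown above, here is a rough, self-contained sketch of how a macro like ONTGT might switch a block between host and device. This is a guess at the overall shape, not Patrick's actual only_host-dcltgt.f90: the program name `host_or_target` is hypothetical, and it relies on omp_is_initial_device() at run time rather than on `__MIC__`. It needs the preprocessor, for example `-fpp` as in the commands above.

```fortran
program host_or_target
   use omp_lib
   implicit none
   integer :: num_thr
   logical :: whatdev

   ! Assumption: ONTGT only decides whether the block is wrapped in a
   ! target region. Without it, the block is an ordinary host region.
#ifdef ONTGT
   !$omp target
#endif
   !$omp parallel private(num_thr, whatdev)
   num_thr = omp_get_num_threads()
   whatdev = omp_is_initial_device()
   !$omp single
   if (whatdev) then
      print *,' Running on HOST with',num_thr,' threads'
   else
      print *,' Running on DEVICE with',num_thr,' threads'
   end if
   !$omp end single
   !$omp end parallel
#ifdef ONTGT
   !$omp end target
#endif
end program host_or_target
```

Compiling once with and once without `-DONTGT` should show which side the block ran on. If no device is attached, a target region falls back to the host, so the HOST message can appear even with ONTGT defined.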
