Hi Everyone,
I'm trying to offload some computations to a GPU using the OpenMP 4.5 offload feature, but the code fails to compile:
1>ifort: error #10036: unable to run 'C:\PROGRA~2\INTELS~1\COMPIL~4\linux\bin\intel64\ifort.exe'
1>ifort: error #10340: problem encountered when performing target compilation
I used the options /Qopenmp and /Qopenmp-offload. The compiler version is Intel(R) Visual Fortran Compiler 19.0.5.281 [Intel(R) 64] and the platform is Windows 10.
I also tried the /Qnextgen option, following the article here, but I get new errors:
1>ifort: error #10408: The Intel LLVM Based compiler cannot be found in the expected location. Please check your installation or documentation for more information.
1>ifort: error #10036: unable to run 'C:\PROGRA~2\INTELS~1\COMPIL~4\windows\bin\ifx.exe'
Any suggestions would be appreciated. Thanks!
As far as I know, the regular Intel compiler doesn't support offload to GPUs, only the (now discontinued) Intel Xeon Phi coprocessors.
/Qnextgen requires that you have the new beta HPC compiler installed. I don't think it supports GPU offload either.
Hi, out of curiosity: can you show us a minimal (non-)working example? I would be interested to see which OpenMP directives you use.
With PSXE2020u2 (19.1.2.254), /Qopenmp /Qopenmp-offload:host works fine for simple OpenMP directives:
program omp_test
use omp_lib
implicit none
integer :: i
!$omp parallel do default(none) private(i)
do i = 1, 8
write(*,*) omp_get_thread_num()
end do
!$omp end parallel do
end program omp_test
ifort /Qopenmp-offload:host /Qopenmp omp_test.f90
As far as I understand, /Qopenmp-offload:mic requires a Xeon Phi.
Hi, thanks for your time. The code is actually pretty simple:
real, allocatable :: a(:), b(:), c(:)
allocate(a(10), b(10), c(10))
a = 1.0
b = 2.0
call omp_set_num_threads(nthread)
!$omp target map(to: a, b) map(from:c)
!$omp parallel do private(i)
do i=1,10
c(i) = a(i) + b(i)
enddo
!$omp end target
The main thing is to relocate the computations to the device. My Fortran compiler builds a regular OpenMP program fine; it only gives me an error when I use the 'target' directive. I also tried the C program in the link I provided, and my C compiler (icl) compiles that code successfully even with 'target'. So the problem seems to come only from the Fortran compiler.
Do you mean that the Intel compiler's /Qopenmp-offload only supports the MIC architecture?
Thanks,
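For reference, the same vector addition can be written as a self-contained program using the combined `target teams distribute parallel do` construct, which OpenMP 4.5 provides for spreading a loop over the teams and threads of a device. This is only a sketch: the program name `vec_add_offload` and the final self-check are illustrative and not from the post above, and whether the loop is actually offloaded depends on the compiler and on an offload device being present. Without one, the region typically runs on the host, and without OpenMP enabled the directives are treated as comments and the loop runs serially.

```fortran
program vec_add_offload
  implicit none
  integer, parameter :: n = 10
  integer :: i
  real, allocatable :: a(:), b(:), c(:)

  allocate(a(n), b(n), c(n))
  a = 1.0
  b = 2.0

  ! Copy a and b to the device, run the loop there, copy c back.
  ! The loop index i is private to each thread by default.
  !$omp target teams distribute parallel do map(to: a, b) map(from: c)
  do i = 1, n
    c(i) = a(i) + b(i)
  end do
  !$omp end target teams distribute parallel do

  ! Illustrative self-check: every element should be 1.0 + 2.0.
  if (any(abs(c - 3.0) > 1.0e-6)) error stop 'unexpected result'
  print *, 'c(1) =', c(1)
end program vec_add_offload
```

The explicit `map` clauses make the data movement visible, which is usually the first thing to verify when a target region compiles but gives wrong results.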
Hi, I can reproduce the error with your code on 19.1.2 if I choose /Qopenmp-offload. With /Qopenmp-offload:host it compiles and runs, although it is unclear what, if anything, is offloaded, and where.
Maybe you can open a ticket for that. I assume that the target directive is ignored if host is given as the offload target. The path in the error message looks strange: it contains "linux" even though this is the Windows compiler, and it points to an .exe.
The complete code:
program main
use omp_lib
implicit none
integer :: i
integer, parameter :: nthread = 8
real, allocatable :: a(:), b(:), c(:)
allocate(a(10), b(10), c(10))
a = 1.0
b = 2.0
call omp_set_num_threads(nthread)
!$omp target map(to: a, b) map(from:c)
!$omp parallel do private(i)
do i=1,10
c(i) = a(i) + b(i)
write(*,*) omp_get_thread_num()
enddo
!$omp end target
end program main
The error (ifort /Qopenmp /Qopenmp-offload omp_test.f90):
1>------ Build started: Project: omp_test, Configuration: Debug x64 ------
1>Compiling with Intel(R) Visual Fortran Compiler 19.1.2.254 [Intel(R) 64]...
1>omp_test.f90
1>ifort: error #10037: could not find 'C:\PROGRA~2\INTELS~1\CO1815~1\linux\bin\intel64\ifort.exe'
1>ifort: error #10340: problem encountered when performing target compilation
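One way to find out where a target region actually ran is to call the OpenMP runtime routine `omp_is_initial_device()` inside it and map the answer back to the host. The sketch below assumes a compiler with OpenMP 4.0 or later enabled (so that `omp_lib` provides `omp_is_initial_device` and `omp_get_num_devices`); the program and variable names are illustrative and were not part of the test above.

```fortran
program where_did_it_run
  use omp_lib
  implicit none
  logical :: on_host

  on_host = .true.

  ! Number of offload devices the OpenMP runtime can see (0 if none).
  print *, 'offload devices visible:', omp_get_num_devices()

  ! Scalars are firstprivate in a target region by default in OpenMP 4.5,
  ! so the result has to be mapped back explicitly.
  !$omp target map(from: on_host)
  on_host = omp_is_initial_device()
  !$omp end target

  if (on_host) then
    print *, 'target region ran on the host'
  else
    print *, 'target region ran on an offload device'
  end if
end program where_did_it_run
```

If this reports the host when built with /Qopenmp-offload:host, that would be consistent with the assumption that no real offload takes place with that setting.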
Offloading with the Intel compiler is only supported for the Intel Xeon Phi Knights Corner (51xx and 71xx series) coprocessors. These coprocessors run a version of Linux and perform the compilation of the offloaded section of code using a version of the compiler inside the coprocessor. The error message indicates that the Linux version of the compiler (the one that is to be injected into the (missing) Xeon Phi) was not found on your system.
While this subject of Offloading is presented here...
I have the following suggestion for Intel that should be relatively easy to implement, and which I think will gain popularity from the users.
Make a derivative of your KNC OpenMP offload, that offloads NOT to an installed coprocessor, but rather offloads to a fabric attached host using the MPI API (hidden in the OpenMP directives).
While the programmer can convert an application from single-process to multi-process, it is non-trivial.
Additionally, converting an application to make use of coarrays is also non-trivial.
IMHO, converting a single-process OpenMP application into a multi-process OpenMP application would be near trivial.
Jim Dempsey
Jim, Intel did already implement what you suggest, some years ago. They called it "Cluster OpenMP". Nobody used it and it was quietly retired.
Just to make it clear for @yzh15, Intel compilers do not support OpenMP offload to GPUs (NVIDIA, AMD or anyone else).
The /Qopenmp-offload option requires that a separate toolkit for Xeon Phi development be installed. It included a completely separate compiler that is invoked by the ifort driver, along with supporting software. If you don't have that, then the option will not work. I don't think this is a bug.
Just speculating here, given that Intel is developing a new series of coprocessors under the Xe-HPC name, it's possible that offloading to one of these could be in the future. I have zero inside knowledge of this, but it would make sense to me.
>>Intel did already implement what you suggest, some years ago. They called it "Cluster OpenMP". Nobody used it...
I am going to assume that Intel marketing misguidedly targeted the HPC users who already had their applications written as MPI applications. IOW, more work to port to "Cluster OpenMP".
My (resurrected) suggestion is targeted at the users whose applications are written for the desktop/workstation and who may have additional desktops, workstations, and/or server(s) available .AND. would like not to incur a large development effort to make use of the additional processing capacity.
Perhaps it is time for Intel marketing to survey their software vendors that produce OpenMP applications for use on desktop and workstations.
Jim Dempsey
No, it wasn't targeted at MPI users, but at OpenMP applications that wanted to distribute across more processors than were in the local system without recoding. Given that OpenMP assumes a shared address space, it is not clear to me this is worth the effort. I would rather see people use coarrays.
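To make the coarray alternative concrete, here is a rough sketch of the thread's vector addition distributed over images, each of which has its own address space. It is illustrative only: the block distribution, the variable names and the gather onto image 1 are assumptions made for this example, not anything proposed in the posts above. It needs coarray support switched on in the compiler (for example /Qcoarray with ifort on Windows, or -fcoarray=single or -fcoarray=lib with gfortran).

```fortran
program vec_add_coarray
  implicit none
  integer, parameter :: n = 10
  real :: a(n), b(n)
  real :: c(n)[*]              ! one copy of c per image
  integer :: i, img, me, np, chunk, lo, hi

  me = this_image()
  np = num_images()
  a = 1.0
  b = 2.0
  c = 0.0

  ! Block-distribute the iterations: each image computes its own slice.
  chunk = (n + np - 1) / np
  lo = (me - 1) * chunk + 1
  hi = min(me * chunk, n)
  do i = lo, hi
    c(i) = a(i) + b(i)
  end do

  ! Make sure every image has finished before remote data is read.
  sync all

  ! Image 1 gathers the slices computed by the other images.
  if (me == 1) then
    do img = 2, np
      lo = (img - 1) * chunk + 1
      hi = min(img * chunk, n)
      if (lo <= hi) c(lo:hi) = c(lo:hi)[img]
    end do
    if (any(abs(c - 3.0) > 1.0e-6)) error stop 'unexpected result'
    print *, 'sum(c) =', sum(c)
  end if
end program vec_add_coarray
```

Unlike the OpenMP version, the data distribution and the communication (the `[img]` references) are explicit here, which is the extra porting effort discussed earlier in the thread.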
I like the C# method of wrapping up a block of code and offloading it to another thread; with some timing it is easy to balance.
Using the GPU should be automatic. Our problem here is that competition gets in the way of real advancement.
I blame Borland myself for its cheap compilers.
Hi,
So you mean that none of the Intel compilers actually supports offloading to a real GPU, e.g. an NVIDIA or AMD GPU? The /Qopenmp option only supports offloading to an Intel Xeon Phi device?
Thanks very much!
