Hi!
I tried to use three threads for CPU and MIC cooperation.
But I get the error "offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)".
The compile command is "ifort mic.f90 -mkl -openmp". The code is as follows:
program mic
use mic_lib
use omp_lib
implicit none
integer::mics,idx
DOUBLE PRECISION,allocatable::A(:)
DOUBLE PRECISION,allocatable::B(:)
DOUBLE PRECISION,allocatable::C(:)
allocate(A(256*256))
allocate(B(256*256))
allocate(C(256*256))
mics = offload_number_of_devices()
!dir$ attributes offload:mic :: DGEMM
!$OMP PARALLEL PRIVATE(idx) NUM_THREADS(mics+1)
!$OMP DO SCHEDULE (static)
do idx=0,mics
if(idx==mics) then
CALL DGEMM('N','N',256,256,256,1.d0,A,256,B,256,0.d0,C,256)
else
!dir$ offload target(mic:idx) in(A,B:length(256*256)) out(C:length(256*256))
CALL DGEMM('N','N',256,256,256,1.d0,A,256,B,256,0.d0,C,256)
end if
end do
!$OMP END DO
!$OMP END PARALLEL
deallocate(A)
deallocate(B)
deallocate(C)
end program mic
zhou
Hi,
One of our experts looked at your problem.
He believes that this may be a bug and has submitted a bug report.
As experts are wont to do, he also made some recommendations: the threads may overwrite each other when they write back to the host, and it is a good idea to initialize your data when you allocate it. Neither of these causes the issue you observe, however.
Regards
--
Taylor
I have exactly the same error. How do I fix it?
In the case cited in the original post, the compiler mishandles the constants (256, 0.d0, 1.d0) in the offload region, treating them as read-write variables instead of read-only. That leads to the segfault on exit from the offload region, when it tries to write their values back to the host.
The issue is expected to be fixed in the coming Update 3 later this month. As a workaround, you would unfortunately need to define variables to pass in the constants. Do not declare them with the PARAMETER attribute, because parameters are disallowed in IN/OUT/INOUT clauses. Something like this enables the case to run successfully:
integer :: size=256
double precision:: zero_Dbl=0.d0
double precision:: one_Dbl=1.d0
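For illustration, here is a rough sketch of how those variables might be substituted into the program from the original post. It is only a sketch under some assumptions: it needs the Intel compiler with MIC offload support and MKL (the same "ifort mic.f90 -mkl -openmp" command), the ATTRIBUTES directive has been moved into the declaration section, and the array initialization follows the earlier recommendation rather than being part of the workaround itself. As in the original, all threads still write to the same C.

```fortran
program mic
use mic_lib
use omp_lib
implicit none
integer::mics,idx
! Workaround: plain variables (not PARAMETER) replace the literal constants.
integer :: size=256
double precision:: zero_Dbl=0.d0
double precision:: one_Dbl=1.d0
DOUBLE PRECISION,allocatable::A(:)
DOUBLE PRECISION,allocatable::B(:)
DOUBLE PRECISION,allocatable::C(:)
!dir$ attributes offload:mic :: DGEMM
allocate(A(size*size))
allocate(B(size*size))
allocate(C(size*size))
! Initialize the data right after allocation.
A = 1.d0
B = 1.d0
C = 0.d0
mics = offload_number_of_devices()
!$OMP PARALLEL PRIVATE(idx) NUM_THREADS(mics+1)
!$OMP DO SCHEDULE (static)
do idx=0,mics
if(idx==mics) then
! The last iteration runs DGEMM on the host CPU.
CALL DGEMM('N','N',size,size,size,one_Dbl,A,size,B,size,zero_Dbl,C,size)
else
! Every other iteration offloads DGEMM to coprocessor number idx.
!dir$ offload target(mic:idx) in(A,B:length(size*size)) out(C:length(size*size))
CALL DGEMM('N','N',size,size,size,one_Dbl,A,size,B,size,zero_Dbl,C,size)
end if
end do
!$OMP END DO
!$OMP END PARALLEL
deallocate(A)
deallocate(B)
deallocate(C)
end program mic
```

Note that a variable named size shadows the SIZE intrinsic inside this program; a different name (say, n) would work equally well.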
The 14.0 update 3 compiler (14.0.3.174) containing the fix has now been posted and is available for download from the Intel registration center.
Thanks very much for all of your replies.
The code now runs well on our system.