I'm currently trying to use intrinsics to speed up my application. However I'm having problems with some intrinsics that won't compile. This is some test code:
#pragma offload_attribute(push, target(mic))
#include "immintrin.h"
void foo()
{
    //_mm_clevict(0, _MM_HINT_T0);
    //_mm_prefetch(0, _MM_HINT_ET1);
}
#pragma offload_attribute(pop)
int main()
{
    #pragma offload target(mic:0)
    {
        foo();
    }
}
If I compile this with the command line "icc test.cpp" and uncomment the _mm_clevict line, I get an "undefined reference to `_mm_clevict'" error. If I uncomment the _mm_prefetch line instead, I get an "internal error: 04010002_1809". However, if I compile this as a native application using "icc test.cpp -mmic", everything compiles fine. It also compiles fine if I move the call to _mm_clevict or _mm_prefetch into the offload region in the main function. How do I use these intrinsics correctly? My compiler version is: icc version 15.0.0 (gcc version 4.4.7 compatibility)
I reproduced the internal error and will report that to Development (see internal tracking id below) since that should not happen regardless of the program correctness.
These intrinsics must be guarded with the predefined __MIC__ macro so that they participate only in the target-side compilation of the offloaded code. Without the guard, they are compiled for both the host and the target, and they cause problems in the host compilation. So consider something like:
#ifdef __MIC__
_mm_clevict(0, _MM_HINT_T0);
_mm_prefetch(0, _MM_HINT_ET1);
#endif
(Internal tracking id: DPD200365512)
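One way to keep the guard suggested above out of every call site is to wrap the MIC-only intrinsics in a small helper that compiles to a no-op on the host. This is only a minimal sketch: the helper name `evict_and_prefetch` and its boolean return value are hypothetical and introduced here purely for illustration, and guarding the `immintrin.h` include is an extra portability choice, not something the compiler requires.

```cpp
#ifdef __MIC__
#include <immintrin.h>   // only pulled in where the intrinsics are actually used
#endif

// Hypothetical helper (not part of any Intel header).
// Returns true if the MIC intrinsics were compiled in, false in the host pass.
inline bool evict_and_prefetch(const void* p)
{
#ifdef __MIC__
    // Target-side compilation: the coprocessor intrinsics are available.
    _mm_clevict(p, _MM_HINT_T0);
    _mm_prefetch(static_cast<const char*>(p), _MM_HINT_ET1);
    return true;
#else
    // Host-side compilation: nothing to do, avoid an unused-parameter warning.
    (void)p;
    return false;
#endif
}
```

In an offload build the helper would presumably need to sit inside the `#pragma offload_attribute(push, target(mic))` region, just like `foo()` in the original test code.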
Actually, it looks like the ICE is triggered when the intrinsics are not guarded by __MIC__, so adding the guard may also serve as a workaround.
I made some slight modifications -- this works for offload or native:
[U538762]$ cat U538762-MIC.cpp
#include <omp.h>
#pragma offload_attribute(push, target(mic))
int global_nt;
#include "immintrin.h"
#include <stdio.h>
void foo()
{
#ifdef __MIC__
    _mm_clevict(0, _MM_HINT_T0);
    _mm_prefetch(0, _MM_HINT_ET1);
    #pragma omp parallel
    global_nt = omp_get_num_threads();
    printf("\n Number of threads = % d \n",global_nt);
#endif
}
#pragma offload_attribute(pop)
int main()
{
    #pragma offload target(mic:0)
    {
        foo();
    }
}
[U538762]$
***OFFLOAD MODE****
[U538762]$ icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.1.133 Build 20141023
Copyright (C) 1985-2014 Intel Corporation. All rights reserved.
[U538762]$ icc -openmp U538762-MIC.cpp -o U538762-MIC-host.x
[U538762]$ ./U538762-MIC-host.x
Number of threads = 224
[U538762]$
***NATIVE MODE****
[U538762]$ icc -openmp -mmic -wd161 U538762-MIC.cpp -o U538762-MIC-mic.x
[U538762]$
[root@dpdmic09-mic0 pbkenned]# ./U538762-MIC-mic.x
Number of threads = 228
[root@dpdmic09-mic0 pbkenned]#
Patrick
Hi,
thanks for your replies. Soon after posting my question I also realized that the errors were caused by the host compiler, so using #ifdef __MIC__ solves the problem. However, I'm still wondering why this wasn't a problem with other intrinsics. Also, the internal compiler error only occurs with the _MM_HINT_ET1 hint; using _MM_HINT_ET0, _MM_HINT_T0, or _MM_HINT_T1 as the hint for _mm_prefetch compiles fine. In any case, I'll use #ifdef __MIC__ in the future when working with MIC-specific intrinsics.
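For anyone who wants to avoid repeating the #ifdef around every MIC-specific call, a small macro could do the job. The following is a rough sketch under stated assumptions: `MIC_ONLY` and `touch_cache_line` are hypothetical names made up for this example, not Intel APIs, and the macro relies on variadic macro support in the compiler.

```cpp
// Hypothetical convenience macro (an assumption for illustration only).
#ifdef __MIC__
#include <immintrin.h>
// Target-side pass: emit the wrapped statements.
#define MIC_ONLY(...) do { __VA_ARGS__; } while (0)
#else
// Host-side pass: the wrapped statements disappear entirely.
#define MIC_ONLY(...) do { } while (0)
#endif

// Hypothetical example: returns how many MIC-only statement groups ran
// (1 when compiled for the coprocessor, 0 when compiled for the host).
inline int touch_cache_line(const char* p)
{
    int executed = 0;
    MIC_ONLY(_mm_prefetch(p, _MM_HINT_T0); ++executed);
    (void)p;  // silences the unused-parameter warning in the host pass
    return executed;
}
```

Because the host pass never sees the intrinsic names, it should neither fail to link nor reach the code path that triggered the internal error.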
