Hey all,
I'm currently trying to compile the following native code, which uses Intel intrinsics:
for ( std::size_t k = 0; k < n; k += 16 ) {
    x_ = _mm512_load_ps( x + k );
    u_ = _mm512_load_ps( u + k );
    xhd_ = _mm512_movehdup_ps( x_ );
    upm_ = _mm512_permute_ps( u_, 177 );
    hdpm_ = _mm512_mul_ps( xhd_, upm_ );
    xld_ = _mm512_moveldup_ps( x_ );
    cmplx_ = _mm512_fmaddsub_ps( xld_, u_, hdpm_ );
    _mm512_store_ps( x + k, cmplx_ );
}
and I get the following linker errors:
mic_test.cpp:(.text+0x1097): undefined reference to `_mm512_movehdup_ps'
mic_test.cpp:(.text+0x10b1): undefined reference to `_mm512_permute_ps'
mic_test.cpp:(.text+0x10ce): undefined reference to `_mm512_moveldup_ps'
mic_test.cpp:(.text+0x10e8): undefined reference to `_mm512_fmaddsub_round_ps'
I have included immintrin.h, the memory is 64-byte aligned, and I checked
http://software.intel.com/en-us/comment/1726413#comment-1726413
But the problem is that my program runs in native mode; I don't use pragmas to offload code to the MIC.
The strange thing is that similar code:
for ( std::size_t k = 0; k < n; k += 16 ) {
    x_ = _mm512_load_ps( x + k );
    z_ = _mm512_load_ps( z + k );
    u_ = _mm512_load_ps( u + k );
    v_ = _mm512_load_ps( v + k );
    zv_ = _mm512_mul_ps( z_, v_ );
    real_ = _mm512_fmsub_ps( x_, u_, zv_ );
    zu_ = _mm512_mul_ps( z_, u_ );
    imag_ = _mm512_fmadd_ps( x_, v_, zu_ );
    _mm512_store_ps( x + k, real_ );
    _mm512_store_ps( z + k, imag_ );
}
compiles and runs without problems on the MIC.
Thanks,
Patrick
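For readers following along: the first loop in the post above appears to multiply interleaved complex numbers (re, im, re, im, ...) element-wise, writing the product back into x. The plain C++ sketch below emulates each intrinsic step with scalar code, so it can serve as a reference on any target. The function name complex_mul_interleaved and the interleaved layout are assumptions made for illustration; they are not stated in the post.

```cpp
#include <cassert>
#include <cstddef>

// Scalar emulation of the first loop: x[k] = x[k] * u[k] for complex values.
// Assumption: x and u each hold n floats laid out as (re, im) pairs, so n is even.
void complex_mul_interleaved(float* x, const float* u, std::size_t n) {
    assert(n % 2 == 0);
    for (std::size_t k = 0; k < n; k += 2) {
        // movehdup: the odd (imaginary) element of x is duplicated into both lanes
        const float xhd = x[k + 1];
        // permute with immediate 177 (0b10110001): neighbours within each pair swap
        const float upm_even = u[k + 1];
        const float upm_odd  = u[k];
        // mul: xhd * upm, lane by lane
        const float hdpm_even = xhd * upm_even;  // x_im * u_im
        const float hdpm_odd  = xhd * upm_odd;   // x_im * u_re
        // moveldup: the even (real) element of x is duplicated into both lanes
        const float xld = x[k];
        // fmaddsub: a*b - c in even lanes, a*b + c in odd lanes
        const float re = xld * u[k]     - hdpm_even;
        const float im = xld * u[k + 1] + hdpm_odd;
        x[k]     = re;
        x[k + 1] = im;
    }
}
```

Calling it with x = {1, 2, 3, 4} and u = {5, 6, 7, 8} multiplies (1+2i) by (5+6i) and (3+4i) by (7+8i) in place. Comparing a vectorized version against a reference like this is one way to check a port once the missing intrinsics have been replaced.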
The AVX-512 instructions and those particular unresolved intrinsics (discussed in the C++ User Guide here: http://software.intel.com/en-us/node/485150) are not currently available for Intel Xeon Phi™. They will be available with Knights Landing. There is a nice discussion in James' blog: http://software.intel.com/en-us/blogs/2013/avx-512-instructions.
Is it documented which AVX-512 intrinsics work on the current Xeon Phi, or do I have to test each intrinsic function?
Can I use an inline assembly instruction, e.g. vmovshdup, instead of _mm512_movehdup_ps? Or did you mean that the AVX instruction itself is not available?
My understanding is that no AVX-512 intrinsics or instructions work on the current Xeon Phi coprocessor ("Knights Corner"). Yes, Knights Corner has 512-bit operations, but they are not the same as the AVX-512 operations. Sorry.
A lot of AVX-512 intrinsics work!
For example, this code:
x_ = _mm512_load_ps( x + k );
z_ = _mm512_load_ps( z + k );
u_ = _mm512_load_ps( u + k );
v_ = _mm512_load_ps( v + k );
zv_ = _mm512_mul_ps( z_, v_ );
real_ = _mm512_fmsub_ps( x_, u_, zv_);
zu_ = _mm512_mul_ps( z_, u_ );
imag_ = _mm512_fmadd_ps( x_, v_, zu_);
_mm512_store_ps( x + k , real_ );
_mm512_store_ps( z + k , imag_ );
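Judging by the variable names real_ and imag_, the snippet above seems to be a complex multiply in split layout, with x and u holding real parts and z and v holding imaginary parts. A scalar sketch of the same arithmetic, under that assumption, could look like this. The function name complex_mul_split is hypothetical.

```cpp
#include <cstddef>

// Scalar reference for the split-layout loop.
// Assumption: (x[k], z[k]) and (u[k], v[k]) are the (re, im) parts of two
// complex numbers; the product overwrites x and z.
void complex_mul_split(float* x, float* z, const float* u, const float* v,
                       std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        const float zv   = z[k] * v[k];
        const float real = x[k] * u[k] - zv;  // fmsub: a*b - c
        const float zu   = z[k] * u[k];
        const float imag = x[k] * v[k] + zu;  // fmadd: a*b + c
        x[k] = real;
        z[k] = imag;
    }
}
```

Only load, store, multiply, and fused multiply-add/subtract are involved here, with no lane shuffles.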
Ah, OK. Am I right that this link
http://software.intel.com/en-us/node/460862
is the list of all intrinsic functions that are available for Knights Corner?
Yes, you are right.
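Since the two instruction sets share the _mm512_ prefix, one way to keep them apart in a single source file is to branch on predefined compiler macros. The sketch below assumes the Intel compiler's __MIC__ macro (defined when targeting Knights Corner) and the __AVX512F__ macro (defined when AVX-512 Foundation is enabled). The function name simd_target is hypothetical.

```cpp
#include <string>

// Reports which 512-bit instruction set, if any, this translation unit
// was compiled for. The same #if chain can wrap intrinsic code paths.
std::string simd_target() {
#if defined(__MIC__)
    return "knc";      // Knights Corner: use only the intrinsics listed for KNC
#elif defined(__AVX512F__)
    return "avx512f";  // AVX-512 Foundation: _mm512_movehdup_ps etc. are available
#else
    return "scalar";   // neither: fall back to plain C++
#endif
}
```

Placing the AVX-512-only intrinsics inside the __AVX512F__ branch keeps them out of a Knights Corner build altogether, so they cannot end up as unresolved symbols at link time.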
ok, thanks :)