- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
Ravi,
Download:
https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
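and search for the intrinsic (Ctrl-F3).
_mm256_shuffle_epi8() shuffles each 128-bit lane independently: the high-order 128 bits of the result are built from the high-order 128 bits of both parameters (source and index).
The method is the same as for the low-order 128 bits, so bytes never cross from one lane to the other.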
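
To make the per-lane behavior concrete, here is a minimal scalar sketch in plain C of what the 256-bit byte shuffle does. The function name shuffle_epi8_256_model is hypothetical (it is not an Intel intrinsic), and the model assumes dst does not overlap src or idx.

```c
#include <stdint.h>
#include <stddef.h>

/* Scalar model of _mm256_shuffle_epi8 (illustrative helper, not an intrinsic).
   Each 16-byte lane is shuffled on its own, using only the source bytes and
   index bytes that belong to that lane. dst must not alias src or idx. */
static void shuffle_epi8_256_model(const uint8_t src[32],
                                   const uint8_t idx[32],
                                   uint8_t dst[32])
{
    for (size_t lane = 0; lane < 2; ++lane) {
        size_t base = lane * 16;            /* 0 for low lane, 16 for high lane */
        for (size_t i = 0; i < 16; ++i) {
            uint8_t sel = idx[base + i];
            /* Bit 7 of the index byte zeroes the output byte; otherwise the
               low 4 bits select a byte from the same lane of src. */
            dst[base + i] = (sel & 0x80) ? 0 : src[base + (sel & 0x0F)];
        }
    }
}
```

With src holding the bytes 0..31 and an index of 15, 14, ..., 0 repeated in both lanes, the model yields 15, 14, ..., 0 followed by 31, 30, ..., 16: each lane is reversed in place and the two lanes are not exchanged.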
Ravi,
I'm using code like this:
// XOR with "sign" flips the top bit of each element, so the signed
// _mm256_cmpgt_epi32 orders the values as if they were unsigned
// after _mm256_shuffle_epi8 the byte order is 15, 14,...0, 31, 30,...16
// after _mm256_permute2f128_si256 (lane swap) it is 31, 30,...16, 15, 14,...0
__m256i ff = _mm256_set1_epi32(-1);
__m256i idx = _mm256_setr_epi8(
15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
__m256i sign = _mm256_set1_epi32(0x80000000);
__m256i v0, v1;
__m256i eq, gt0, gt1;
v0 = _mm256_loadu_si256((__m256i *)a);
v1 = _mm256_loadu_si256((__m256i *)b);
eq = _mm256_cmpeq_epi32(v0, v1);
if (!_mm256_testc_si256(eq, ff)) //not equal
{
v0 = _mm256_shuffle_epi8(v0, idx);
v1 = _mm256_shuffle_epi8(v1, idx);
v0 = _mm256_xor_si256(v0, sign);
v1 = _mm256_xor_si256(v1, sign);
v0 = _mm256_permute2f128_si256(v0, v0, 0x01);
v1 = _mm256_permute2f128_si256(v1, v1, 0x01);
gt0 = _mm256_cmpgt_epi32(v0, v1);
gt1 = _mm256_cmpgt_epi32(v1, v0);
return _mm256_movemask_ps(_mm256_castsi256_ps(gt0)) - _mm256_movemask_ps(_mm256_castsi256_ps(gt1));
}
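
For checking the vector code, a scalar reference can help. The sketch below is my reading of what the not-equal branch above computes, not part of the original code: after the full 32-byte reversal, 32-bit element k holds the big-endian word found at byte offset 28 - 4*k, so mask bit 7 corresponds to the first four bytes in memory. The names load_be32 and compare32_model are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: read 4 bytes as a big-endian unsigned 32-bit value. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Scalar model of the not-equal branch: builds the two 8-bit "greater than"
   masks that _mm256_movemask_ps would produce and returns their difference.
   The sign of the result is decided by the first differing 4-byte word. */
static int compare32_model(const uint8_t a[32], const uint8_t b[32])
{
    int gt0 = 0, gt1 = 0;
    for (int k = 0; k < 8; ++k) {
        /* Element k of the reversed vector is the word at offset 28 - 4*k. */
        uint32_t x = load_be32(a + 28 - 4 * k);
        uint32_t y = load_be32(b + 28 - 4 * k);
        if (x > y) gt0 |= 1 << k;   /* unsigned compare, as after the XOR */
        if (y > x) gt1 |= 1 << k;
    }
    return gt0 - gt1;
}
```

Because at most one of the two masks has a given bit set, the highest differing bit decides the sign of the difference, which gives memcmp-style ordering for the 32 bytes.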
