- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @ all,
I am currently working with AVX-512 on a KNL. I wanted to test one the new conflict detection functions (_mm512_conflict_epi32). The result of this call is a zero-extended bitvector. To use this for further computations, I want to use a vectorized trailing zeros count but I only found a vectorized leading zeros count. _tzcnt_u32 (AVX2) and _mm_tzcnt_32 (AVX-512) are working with scalar types. Is it wright, that I have to perform the trailing zero count on a scalar level or does anyone know a vectorized way (maybe by swapping the endianess and perform a lzc afterwards)? Thanks for your effort!
Sincerely yours
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Trailing zero count can be computed from the leading zero count like this:
uint32_t tzcnt(uint32_t x) { if (x == 0u) return 32u; return 31u - _lzcnt_u32((x - 1u) ^ x); }
which you could convert to vector instructions.
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Trailing zero count can be computed from the leading zero count like this:
uint32_t tzcnt(uint32_t x) { if (x == 0u) return 32u; return 31u - _lzcnt_u32((x - 1u) ^ x); }
which you could convert to vector instructions.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page