Johannes_P_

Beginner

08-20-2017
07:22 AM

Hi @ all,

I am currently working with AVX-512 on a KNL. I wanted to test one of the new conflict detection functions (_mm512_conflict_epi32). The result of this call is a zero-extended bit vector. To use this for further computations, I want to use a vectorized trailing zero count, but I only found a vectorized leading zero count. _tzcnt_u32 (AVX2) and _mm_tzcnt_32 (AVX-512) work on scalar types. Is it right that I have to perform the trailing zero count at the scalar level, or does anyone know a vectorized way (maybe by swapping the endianness and performing a leading zero count afterwards)? Thanks for your effort!

Sincerely yours

Accepted Solutions

andysem

New Contributor III

08-22-2017
09:43 AM

Trailing zero count can be computed from the leading zero count like this:

uint32_t tzcnt(uint32_t x)
{
    if (x == 0u)
        return 32u;
    return 31u - _lzcnt_u32((x - 1u) ^ x);
}

which you could convert to vector instructions.
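A possible conversion to vector instructions might look like the sketch below. It is only an illustration, not tested on a KNL: the function name tzcnt_epi32_x16 and the helpers lzcnt32_ref and tzcnt32_ref are made up for this example, and the zero lanes are handled with a compare mask and a blend, which is just one way to replace the scalar branch. The vector path is compiled only when the compiler reports AVX-512F and AVX-512CD (for _mm512_lzcnt_epi32); otherwise a plain scalar loop with the same formula is used.

```c
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512CD__)
#include <immintrin.h>
#endif

/* Hypothetical portable helper: leading zero count of one 32-bit value. */
uint32_t lzcnt32_ref(uint32_t x)
{
    uint32_t n = 0u;
    if (x == 0u)
        return 32u;
    while ((x & 0x80000000u) == 0u) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Scalar formula from the post, using the portable helper above. */
uint32_t tzcnt32_ref(uint32_t x)
{
    if (x == 0u)
        return 32u;
    return 31u - lzcnt32_ref((x - 1u) ^ x);
}

/* Hypothetical name: trailing zero count of 16 packed 32-bit lanes. */
void tzcnt_epi32_x16(const uint32_t in[16], uint32_t out[16])
{
#if defined(__AVX512F__) && defined(__AVX512CD__)
    const __m512i x = _mm512_loadu_si512((const void *)in);
    const __m512i one = _mm512_set1_epi32(1);
    /* (x - 1) ^ x: lowest set bit of each lane and every bit below it */
    const __m512i m = _mm512_xor_si512(_mm512_sub_epi32(x, one), x);
    /* 31 - lzcnt(m), per lane */
    const __m512i tz = _mm512_sub_epi32(_mm512_set1_epi32(31),
                                        _mm512_lzcnt_epi32(m));
    /* lanes with x == 0 get 32 instead, replacing the scalar branch */
    const __mmask16 is_zero =
        _mm512_cmpeq_epi32_mask(x, _mm512_setzero_si512());
    const __m512i r =
        _mm512_mask_blend_epi32(is_zero, tz, _mm512_set1_epi32(32));
    _mm512_storeu_si512((void *)out, r);
#else
    /* Fallback when AVX-512F/CD is not enabled at compile time. */
    for (int i = 0; i < 16; ++i)
        out[i] = tzcnt32_ref(in[i]);
#endif
}
```

If the input is already in a register (for example the result of _mm512_conflict_epi32), the load and store can be dropped and the middle part applied directly to that vector.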
