- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I read the IA-32/64 developer's handbook, so far there isnt anything even at the machine code level, that can count the total numbers of setted bits for a given bytes or word without using loops, am I right about this?
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can try using popcnt instruction, if supported by your target CPU.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sergey Kostrov wrote:
>>I read the IA-32/64 developer's handbook, so far there isnt anything even at the machine code level, that can count the total
>>numbers of setted bits for a given bytes or word without using loops, am I right about this?If you need a very fast algorithm to count the total number of bits which are set to 1 in a byte or in a word two look-up tables have to be used:
- 256 elemets for bytes
- 65536 elements for words ( I assume that sizeof( word ) is 2 bytes )
Very interesting way to do bit-wise operation, I will definitly give it a try, btw, have you benchmark this algorthim against traditional loop test?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If you do this for words, you might also use a Look-up Table for one byte with 256 entries.
Then you go in the table with every byte of the word and sum the values up.
I am not sure, but in my opinion it might be faster than a larger table. The small table might be kept in the changes very near to the CPU. And selecting bytes of a word is a simple AND operation.
But this is only an assumption.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Some more ways: en.wikipedia.org/wiki/Hamming_weight
It is easy to modify code shown there for 8,16,32 bit variables in a traditional c language and for 128, 256 bits using sse/avx registers/operations.
Which way (_mm_popcnt_uXX, byte/word-table lookup, simple arithmetics, arithmetics with multiple, etc) is faster (and, in general, is available) depends on your application and target system(s).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page