topic Re: Optimizing this bit manipulation code in Software Tuning, Performance Optimization & Platform Monitoring

Optimizing this bit manipulation code

CommanderLake — Sat, 03 Sep 2022 17:04:15 GMT

So far I have only figured out how to do this with the .NET class BitArray, its virtually impossible to explain with words so here's the C# code:

for(int i = 0, j = 0, k = 0; i < bitArrayIn.Length; ++i, j += 16){ if(j >= bitArrayIn.Length) j = ++k; bitArrayOut[i] = bitArrayIn[j]; }

and to reverse that operation:

var stride = bitArrayIn.Length/16; for(int i = 0, j = 0, k = 0; i < bitArrayIn.Length; ++i, j += stride){ if(j >= bitArrayIn.Length) j = ++k; bitArrayOut[i] = bitArrayIn[j]; }

If this makes sense to anyone my goal is to convert this to native code and optimize it is that's even possible but I can't get my head around how to do it or how to optimize it and make it faster.

Re: Optimizing this bit manipulation code

CommanderLake — Mon, 05 Sep 2022 23:33:08 GMT

Here it is in C++, can this be optimized? Are there intrinsics or any other methods of doing this that can help speed it up?:

typedef unsigned char T; constexpr int TS = sizeof(T)*8; bool Get(const T* a, const long long index){ return (a[index/TS] & 1 << static_cast<long>(index % TS)) != 0; } void Set(T* a, const long long index, const bool value){ if(value) a[index/TS] |= static_cast<unsigned char>(1 << static_cast<long>(index % TS)); else a[index/TS] &= static_cast<unsigned char>(~(1 << static_cast<long>(index % TS))); } extern "C" _declspec(dllexport) void a(const T* a, T* b, const long long size, const long step){ auto ii = 0LL; const auto bitn = size*8; for(long long i = 0, j = 0, k = 0; i < bitn; ++i, j += step){ if(j >= bitn){ j = ++k; } Set(b, ii++, Get(a, j)); } }

Re: Optimizing this bit manipulation code

CommanderLake — Tue, 06 Sep 2022 20:53:54 GMT

At least tell my why I'm not getting any replies I thought this forum was full of experts with this kind of stuff?

I'll try explaining further...

I want to separate the bits of a 16 bit integer array so the most significant bit of each int16 are arranged in one contiguous block followed by the next bit from each int16 in a block after that and so on, you might call this unpacking the bits? This is a sample of how the layout of the data would change:

0101010101010101 0101010101010101
would become
0011001100110011 0011001100110011
or with 3 it would look like
0101010101010101 0101010101010101 0101010101010101
which would become
0001110001110001 1100011100011100 0111000111000111
and so on

Getting and setting individual bits is incredibly inefficient so I'm looking for ways to speed up this particular rearrangement of the bits.

Re: Optimizing this bit manipulation code

CommanderLake — Wed, 07 Sep 2022 20:38:42 GMT

Is anyone there? is this forum dead?