Solved: How is the sign bit represented in memory and in the CPU?

Marcus_J_ · 10-04-2016

I'm trying to use bitfields in my software, and I heard that Intel stores its sign bit in bit 7 (aka the 8th bit) regardless of the size of the integer (so uint8_t and uint64_t would both store their sign bit in bit 7). Is that true, and if so, is it the case for x86 in general, or for Intel specifically, or what?

Also, I've heard that data is stored in memory as little endian but is flipped to big endian inside the CPU. Is that true, and do I need to do anything in particular to make sure that works?

andysem · 10-04-2016

This question doesn't really belong on this forum; you should ask things like this on StackOverflow or somewhere similar.

Like many other modern architectures, x86 uses two's complement (https://en.wikipedia.org/wiki/Two%27s_complement) representation of signed integers. This makes the most significant bit of the integer represent its sign (0 for non-negative numbers, 1 for negative). Note that this means the 8th bit (bit 7) in an 8-bit integer, the 16th bit (bit 15) in a 16-bit integer, and so on; the sign bit is not fixed at bit 7. Also note that uint8_t and uint64_t are unsigned integers, i.e. they are never negative, and their most significant bit contributes to the integer value rather than indicating a sign. But you can memcpy a signed integer onto an unsigned integer of the same size and observe its bits; in that case the sign bit shows up as the most significant bit of the unsigned integer.
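
A minimal sketch of that memcpy trick in C, assuming the fixed-width types from <stdint.h> are available. The helper names sign_bit8 and bit64 are made up for this illustration and are not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the most significant bit (bit 7)
   of an 8-bit signed integer, as 0 or 1. */
unsigned sign_bit8(int8_t value)
{
    uint8_t bits;
    memcpy(&bits, &value, sizeof bits);  /* copy the bit pattern unchanged */
    return (unsigned)((bits >> 7) & 1u);
}

/* Hypothetical helper: returns bit number `index` (0 = least significant,
   63 = most significant) of a 64-bit signed integer, as 0 or 1.
   `index` must be in the range 0..63. */
unsigned bit64(int64_t value, unsigned index)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);  /* copy the bit pattern unchanged */
    return (unsigned)((bits >> index) & 1u);
}
```

For example, bit64(-256, 63) is 1 while bit64(-256, 7) is 0, because -256 has the 64-bit pattern 0xFFFFFFFFFFFFFF00: the sign lives in bit 63 of a 64-bit integer, not in bit 7.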

Endianness describes the order in which the bytes of a number are stored in memory. You have to take measures if a number can (or must) be stored in a different byte order than your CPU expects; this means you have to perform endianness conversion when importing or exporting data, for example when reading a file format or network protocol that specifies big-endian fields.
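
One common way to do such a conversion, sketched here under the assumption that the external format uses 32-bit big-endian fields, is to assemble the value from individual bytes with shifts. The names load_be32 and store_be32 are hypothetical. Because shifts work on values rather than on memory layout, the same code gives the same result on little-endian and big-endian CPUs.

```c
#include <stdint.h>

/* Hypothetical helper: reads a 32-bit big-endian value from 4 bytes at p. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: writes v to 4 bytes at p in big-endian order. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFFu);  /* most significant byte first */
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);          /* least significant byte last */
}
```

Given the bytes 0x01 0x02 0x03 0x04, load_be32 returns 0x01020304 regardless of the host's native byte order.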

However, the program cannot observe how the CPU stores integers in its registers, because you cannot address the individual bytes that make up a register. You can access portions of some registers (e.g. al, ah, ax and eax are all parts of rax), but that doesn't really indicate the relative positions of these portions with respect to each other. For instance, if rax contains the value 0x0001020304050607, you have no way to extract its 1st byte and see whether it's 0x00 or 0x07 or something else, or whether it is equal to al or ah. So for all purposes, when you talk about registers and the values they store, you mean either numbers (in the mathematical sense) or bit fields, regardless of endianness. When you talk about bit fields, a register is a contiguous array of bits, from the least to the most significant one. You can't tell how these bits are physically located, because instructions such as shifts, logical operations and moves all interpret registers as such arrays.
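
A small C sketch of the distinction, with hypothetical helper names: shifts and masks work on the value, so they give the same answer on any CPU, whereas the byte order only becomes visible once the value is written to memory and inspected byte by byte.

```c
#include <stdint.h>
#include <string.h>

/* Value view: the least significant 8 bits, independent of endianness. */
unsigned low_byte_by_value(uint64_t v)
{
    return (unsigned)(v & 0xFFu);
}

/* Value view: the most significant 8 bits, independent of endianness. */
unsigned high_byte_by_value(uint64_t v)
{
    return (unsigned)((v >> 56) & 0xFFu);
}

/* Memory view: the byte at the lowest address once v is stored in memory.
   This is the only one of the three whose result depends on endianness. */
unsigned first_byte_in_memory(uint64_t v)
{
    unsigned char bytes[sizeof v];
    memcpy(bytes, &v, sizeof v);
    return bytes[0];
}
```

For 0x0001020304050607, low_byte_by_value always gives 0x07 and high_byte_by_value always gives 0x00. first_byte_in_memory gives 0x07 on a little-endian machine such as x86 and would give 0x00 on a big-endian one.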

Marcus_J_ · 10-05-2016

Thanks, fantastic answer.

And yeah, I messed up on the uintX part, haha. I normally use unsigned ints.