Multiplication Rounding

Altera_Forum · ‎01-14-2012

Hey guys,

I am multiplying two 32-bit numbers using LPM_MULT. Each number is in Q16.16 fixed-point format. The product is a 64-bit Q32.32 number. I then shift the product to the right by 16 bits to get a Q32.16 number. Now, I would like to round the number to get a Q16.16 number. Large inputs are possible, so the product needs to be rounded.

Does anyone know a simple and effective algorithm that could be implemented in Verilog? In my old rounding system I would simply divide by 65536. Since I am using Q16.16 format, I don't think it will work anymore.

Altera_Forum · ‎01-14-2012

Most basic rounding is:

If your result is 64 bits, then add result(31) to result(63:32).

Alternatively, add '1' to result(63:31), then take result(63:32).

That's all.
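A minimal sketch of the two formulations above, written in Python purely as a bit-accurate model (it is not the Verilog itself). It assumes the 64-bit product is treated as an unsigned integer and that the 32 LSBs are the ones being discarded; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def round_add_bit(p: int) -> int:
    """Add bit 31 of the product to the upper 32 bits."""
    p &= MASK64
    return (p >> 32) + ((p >> 31) & 1)

def round_add_half(p: int) -> int:
    """Add 1 at bit 31 (half an output LSB), then drop the low 32 bits."""
    p &= MASK64
    return (p + (1 << 31)) >> 32

# Both formulations agree just below, at, and above the halfway point.
for p in (0x0000_0001_7FFF_FFFF, 0x0000_0001_8000_0000, 0x0000_0002_0000_0000):
    assert round_add_bit(p) == round_add_half(p)
    print(hex(p), "->", round_add_bit(p))
```

Note that the sum can need one extra bit when the upper 32 bits are all ones, so a hardware version would have to widen the adder or handle that overflow.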

Altera_Forum · ‎01-14-2012

Thanks for the reply Kaz. My concern was that the fractional part would cause the product to be huge. For example, if you have 111.010=7.25, it can be interpreted as 111010=58. Performing a rounding on a rather small number like 7.25 with a fraction might be the same as performing the rounding on a large number. Basically, I'm afraid that such a rounding algorithm would cause small numbers to become rounded as large numbers.

Another concern is whether I should even shift at all after multiplication. Any time you multiply two numbers you get 2*N fractional places, where N is the number of fractional places in the inputs. Since I only want N places of precision, I would shift the product to the right by N bits to discard the extra ones. With the method you gave, do I even need to do this at all? Would it be better just to round only?

I know this question may seem kind of stupid, but I'm learning about fixed point for the first time.

Altera_Forum · ‎01-14-2012

It is much simpler than your thoughts.

Rounding is done when you divide a value, which in your case is when you discard 32 bits (= /2^32).

You can either discard the bits directly, without anything else, since your data width is 32 bits, or you can round to the nearest value.

The issue of the decimal point is a matter of how the value is interpreted, nothing else.

Altera_Forum · ‎01-14-2012

Thanks for your help Kaz. Added to your reputation.

Altera_Forum · ‎01-14-2012

Regarding the shift idea: you don't need to think that way, just discard the 32 LSBs. A shift is applicable if you keep all 64 bits and want to divide or multiply by shifting.

Altera_Forum · ‎01-14-2012

Just some clarification,

Let's say you have Q2.2 notation. You have two numbers:

01.11=1.75

x 10.00=2.00

The product should be 3.50

If I do the binary multiplication in my calculator I get a Q4.4 result:

00111000

Now I perform the rounding by adding P(3) to P(7:4) I get:

0011+0001=0100

If I'm interpreting this correctly, it appears that my 3.5 rounded to a 4. But, there isn't any fractional part of the number. The rounding forced the result to Q4.0. Instead of getting 11.10=3.5, I got 0100=4.

Now if I didn't do any rounding at all and just performed a shift to the right to align the decimal point, I would get:

00001110 = 000011.10 = 3.5

Maybe I'm confusing the concept of rounding. I want to get a rounded value, but I still want to have a fractional value. If I perform this rounding without the shift, then the output 0100 will go throughout the rest of the system and get interpreted as 01.00=1, since the decimal point is aligned throughout the rest of the system.

Altera_Forum · ‎01-14-2012

Firstly, the interpretation of a 64-bit value is only a mental perspective. In the past, most FPGA engineers used the integer interpretation. Today, as a result of the software mindset invading through the tools, we at times have to switch between both concepts (integer or fractional).

If you discard the 32 LSBs of a 64-bit value, then in effect you are dividing the whole result by 2^32. Rounding will give the nearest value. You don't round in the middle of the bits at the imagined fractional point. In short, you had better think of the 64 bits here as an integer only (no decimal point). You round to 32 bits, and it is the duty of the receiving module to interpret the decimal point.

If your value is so small (relatively) that it only occupies the 32 LSBs, then you are going to get either 0 or 1 in the 32 MSBs. You then interpret that as 0.0 or 1/2^16 in Q16.16 format, if the next module is going to look at it that way.

Altera_Forum · ‎01-14-2012

The receiving circuitry, in this case an adder and MAC, considers the 16 bits to the right as part of the fraction.

Going back to the Q2.2 example, you are saying that the result 3.5 is so small that it gets rounded to 1.0?

Altera_Forum · ‎01-14-2012

Indeed, your value 0100 is not 4 in Q2.2 format but 1.0 = 1, and it is the result of 3.5/4 or of 56 (111000)/2^4, depending on the interpretation.

Altera_Forum · ‎01-14-2012

--- Quote Start ---

Let's say you have Q2.2 notation. You have two numbers:

01.11=1.75

x 10.00=2.00

The product should be 3.50

If I do the binary multiplication in my calculator I get a Q4.4 result:

00111000

--- Quote End ---

So far, so good.

--- Quote Start ---

Now I perform the rounding by adding P(3) to P(7:4) I get:

0011+0001=0100

If I'm interpreting this correctly, it appears that my 3.5 rounded to a 4. But, there isn't any fractional part of the number. The rounding forced the result to Q4.0. Instead of getting 11.10=3.5, I got 0100=4.

--- Quote End ---

Your mistake is not in the rounding part, it's in the cutting part.

You had a Q4.4 value and trimmed its 4 LSBs. Thus, you got a Q4.0 value. Q4.0 can only represent 3 or 4, not 3.5.

The rounding is actually correct for a conversion to Q4.0: it rounded 3.5 to 4.
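To make the point concrete, here is a small Python model (a sketch only; `round_drop` is a hypothetical helper, and the values are treated as unsigned). The number of LSBs you drop decides the output format: dropping 4 bits from the Q4.4 product gives Q4.0, dropping 2 bits gives Q4.2 and keeps the fraction.

```python
def round_drop(p: int, k: int) -> int:
    """Drop the k LSBs of an unsigned value p, rounding half up."""
    if k == 0:
        return p
    return (p + (1 << (k - 1))) >> k

p = 0b0111 * 0b1000          # 1.75 * 2.00 as Q2.2 integers: 7 * 8 = 56 (Q4.4)
q4_0 = round_drop(p, 4)      # drop all 4 fractional bits -> Q4.0
q4_2 = round_drop(p, 2)      # drop 2 fractional bits     -> Q4.2
print(q4_0, q4_0 / 2**0)     # 4 4.0
print(q4_2, q4_2 / 2**2)     # 14 3.5
```

The same integer operation is used in both cases; only the choice of k and the way the receiver reads the result differ.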

Altera_Forum · ‎01-14-2012

This thread interestingly raised a sensible issue regarding meaning of words we use.

Indeed, rounding immediately implies rounding a fraction to an integer.

But with fractional notation of data buses it is quite misleading. One has to use the integer view here and only use the fractional view in tools like IP generators and DSP Builder. It is really an invasion by the software mindset, and very unfair...

Altera_Forum · ‎01-14-2012

My question probably did involve some confusing use of terminology. I'm just a novice, so I should be more careful from now on.

Anyway, let's say you have two Q2.2 numbers. One is 11.11 and the other is 11.11. These are the largest values that can be represented in Q2.2. Now, if we multiply them, we get:

11100001 = 14.0625.

Now, this is Q4.4. I need Q2.2. You can't represent 14 with two bits. So there will be some wrap around. That is why I need rounding. Essentially, the desired product would be some smaller wrap around number, and a fractional portion of two bits.

Altera_Forum · ‎01-14-2012

I'm afraid you can't have your cake and eat it too.

If you need to represent numbers into the range of 14, you need to either

a) use Q4.2, 6 bits total

b) use Q4.0, 4 bits total but no fractional part

Altera_Forum · ‎01-14-2012

--- Quote Start ---

Now, this is Q4.4. I need Q2.2. You can't represent 14 with two bits. So there will be some wrap around. That is why I need rounding. Essentially, the desired product would be some smaller wrap around number, and a fractional portion of two bits.

--- Quote End ---

It's unclear what you want to achieve.

As a first point, the term rounding isn't commonly used for the handling of overflow problems. Basically there are two options:

- omit/cut the high bits, which causes wrap around, in case of signed numbers often resulting in a wrong sign

- apply saturation logic, limiting the output to the highest respectively lowest value that can be represented in the number system. This technique is mostly applied in digital signal processing
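The two options can be compared with a short Python model (a sketch, assuming two's complement signed words; the function names are made up for illustration):

```python
def wrap_signed(x: int, bits: int) -> int:
    """Keep the low `bits` bits and reinterpret them as two's complement."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x

def saturate_signed(x: int, bits: int) -> int:
    """Clamp x to the range representable in `bits` signed bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, x))

x = 9  # does not fit in a signed 4-bit word (range -8..7)
print(wrap_signed(x, 4))      # -7  (wrong sign after wrap-around)
print(saturate_signed(x, 4))  # 7   (clamped to the largest value)
```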

Altera_Forum · ‎01-14-2012

Some engineers use the following notation for bit growth issues:


             
         ==1==>
         |
   ==32======16==>
         |
         ==15==>

i.e. of a 32-bit data bus: discard 15 LSBs, discard 1 MSB, pass 16 bits.

The top part needs saturation, as FvM indicated (a must unless you don't expect values to get there). The saturation may be symmetric or not.

The bottom part means divide by 2^15, and it may be truncated directly or rounded. Rounding can be to nearest integer, basic, unbiased, towards positive infinity, towards negative infinity, towards roof, etc. There are plenty of methods here, but the differences are trivial.

The rule applies to any data signal irrespective whether you view it as integer(decimal point at start!!) or fractional(decimal point anywhere you imagine)
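As a rough Python model of the 32 -> 16 diagram above (a sketch under assumptions: signed two's complement input, round half up on the bottom part, asymmetric saturation on the top part; `slice_32_to_16` is a hypothetical name):

```python
def slice_32_to_16(x: int) -> int:
    """Round away 15 LSBs, then saturate away 1 MSB, passing 16 bits."""
    assert -(1 << 31) <= x < (1 << 31)
    # Python's >> on negative ints is an arithmetic (floor) shift,
    # so adding half an LSB first gives round-half-up.
    y = (x + (1 << 14)) >> 15              # up to 17 significant bits
    return max(-(1 << 15), min((1 << 15) - 1, y))

print(slice_32_to_16(1 << 14))         # 1     (exactly half rounds up)
print(slice_32_to_16((1 << 14) - 1))   # 0     (just below half rounds down)
print(slice_32_to_16((1 << 31) - 1))   # 32767 (saturated)
```

Nothing in the function refers to a decimal point, which matches the remark that the rule holds whichever way the bus is interpreted.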

Altera_Forum · ‎01-14-2012

This is what I did, and it works great:

Drop the MSB (sign bit) of the product, and grab the next lower N bits.

Add the (N-2)th bit to these N bits. The result is the rounded number in the proper Q notation.

For example, let's say you have 011.01 (3.25) * 011.11 (3.75). Using a calculator you get 001100.0011. This can be thought of as 13x15=195.

If you do the rounding you get:

01100 + 00000 = 01100.

Now if you consider it as decimal Q2, it is 011.00= 3.00

3.25x3.75=12.1875

Divide this by 4 and you get 3.04, which is pretty close to the rounded result of 3.00.

Integers and fixed point are the same except being multiples of each other. 3.75 is the same as 15, except for a division by 4.

If you multiply integers together, you get a result which can then be rounded via the algorithm above, which is basically a division by 16.

Then, if you consider both the inputs and product as fixed point Q2, then all of them are divided by 4. If you divide both sides of the same equation by the same number, nothing changes.

The result of the integer multiplication is 195. The rounding divides it by 16 to yield 12.1875. The division by 4 yields 3.04.

Essentially, as some people pointed out, the rounding works regardless of the number of fractional points.
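For anyone who wants to replay the example, here is a small Python model of the procedure as described in this post (a sketch only; it assumes non-negative inputs, and `round_keep_n` is a made-up name). The 2N-bit product loses its MSB, the next N bits are kept, and bit N-2 is added as the rounding bit.

```python
def round_keep_n(a: int, b: int, n: int) -> int:
    """Keep bits [2n-2 : n-1] of the product a*b and add bit n-2."""
    p = a * b                                 # 2n-bit product
    kept = (p >> (n - 1)) & ((1 << n) - 1)    # bit 2n-1 (MSB) is dropped
    rnd = (p >> (n - 2)) & 1                  # rounding bit
    return kept + rnd

r = round_keep_n(0b01101, 0b01111, 5)   # 13 * 15 = 195
print(format(r, "05b"), r / 4)          # 01100 3.0
```

With two fractional bits in the output the result reads as 3.00, i.e. the exact product 12.1875 divided by 4, as worked out above.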

Altera_Forum · ‎01-15-2012

But be careful with 2's complement. The MSB is the sign bit, and the bit next to it may differ. Moreover, apply saturation after you add the 1 for rounding, in case the rounding leads to overflow.
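Putting that advice together for the Q16.16 case from the first post, a bit-accurate Python model might look like the sketch below. It is only one possible arrangement, not what LPM_MULT does internally: it assumes signed two's complement operands, round half up, and saturation instead of dropping the MSB. All names are hypothetical.

```python
FRAC = 16
Q_MIN, Q_MAX = -(1 << 31), (1 << 31) - 1

def q16_16_mul(a: int, b: int) -> int:
    """Multiply two signed Q16.16 values held as Python ints."""
    p = a * b                                # exact signed Q32.32 product
    r = (p + (1 << (FRAC - 1))) >> FRAC      # round while dropping 16 fraction bits
    return max(Q_MIN, min(Q_MAX, r))         # saturate AFTER rounding

def to_q(x: float) -> int:
    """Convert a float to its Q16.16 integer representation."""
    return round(x * (1 << FRAC))

print(q16_16_mul(to_q(1.75), to_q(2.0)) / (1 << FRAC))    # 3.5
print(q16_16_mul(to_q(-1.5), to_q(2.5)) / (1 << FRAC))    # -3.75
print(q16_16_mul(to_q(300.0), to_q(300.0)) == Q_MAX)      # True (saturated)
```

Saturating after the rounding add means an overflow caused by the rounding itself is also caught.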