Altera_Forum

Honored Contributor I

05-07-2016
05:14 PM

1,690 Views

Difference in floating point and fixed point arithmetic in vhdl.

Hiii...

I want to know the

4 Replies

Altera_Forum

Honored Contributor I

05-07-2016
10:18 PM

219 Views

Hi,

Let me first recall integer arithmetic: you have a predefined number of bits, which defines the range in which you can do calculations; e.g. for an 8-bit signed integer, you can have numbers in the range -128 to 127. In other words: if you want to do maths where the number 500 might occur, better use 16-bit numbers.

In fixed-point arithmetic you do the same, but you also define that a few of those bits represent a fractional part. Let's say you have an 8-bit number, of which 4 bits represent the integer part and the other 4 represent the digits behind the binary point. You could represent numbers like 3.25, which is 0011b for the integer part and 0100b for the fraction. Why 0100b? 0.25 is not a half (first bit = 0), but a quarter (second bit = 1), and nothing else (remaining bits = 00). You can actually synthesize this stuff, e.g. using the sfixed and ufixed types from the fixed_pkg package in VHDL-2008. So all you need to do is define the range of your numbers, calculate how many bits you need, encode that, done.

I think it's easier if I explain this in decimal rather than in binary: if I want to do calculations where I expect numbers to go down to 0.001 and up to 1000, I must have at least 7 decimal digits (seven as in "0000.000"). That's somewhere around 21 to 24 bits if you convert digit by digit, but it turns out 20 bits are sufficient if you do the math in binary: 10 integer bits reach 1023, and 10 fractional bits give steps of 1/1024, just under 0.001. Plus one more bit for a sign, okay.

Sounds nice, right? Except when your mathematical problems are in a domain where both large and small numbers occur. In physics you often deal with ranges from 10^-15 up to 10^12 (I just made that up, but you know what I mean - there's nano-this, and giga-that). Does that mean you need around 90 bits just to be sure you can cover each of those cases? No. Because you never do maths with more than, let's say, 3 digits. What do I mean by that? I typically use numbers like 3.14 nano, or 3.14*10^-9. See what I did? I split that ugly number of 12 decimal digits (0.00000000314) into something I can represent with only 4 digits and a sign: 3.14 and -9. You can do the same in binary.

It's called floating point, and it means that you use a fixed-point number (the mantissa; in my example it would represent the 3.14) and an integer number (the exponent; -9 in my example). However, as you might imagine, working with floating point is more tricky (it's a fixed-point number plus an integer, and you need to deal with powers!). In other words: you need a lot more logic and registers to do calculations with floating point. In my example above, I said I use four decimal digits, that's roughly 12 bits (at about 3 bits per decimal digit). Plus one for the sign bit, plus another sign for the power of 10. That's 14 bits. In practice you normally use 32-bit numbers (as in the C type "float").

Oops, I forgot: in my example, how do you represent the number 8192? That's 8.192*10^3, right? But I said you only have 3 digits for the number - okay, round that to 8.19. Now your "floating-point number" represents the value 8190. What can you do about that? Well, you could increase the number of digits to, say, 5 (now we're at around 18 binary digits).

Still not clear? Here it is in a nutshell:

- Integer: very simple to implement, but, well, integers only
- Fixed point: you can have numbers which represent fractions and such, and they're relatively easy to implement in hardware (not much more complicated than integer), but you need to know your number range in advance
- Floating point: can represent a large range of numbers, but very complicated in hardware.
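
To make the two worked examples above concrete, here is a minimal sketch in Python (not VHDL, and not synthesizable; it only models the arithmetic). The helper names `to_fixed`, `from_fixed`, `to_decimal_float` and `decimal_float_value` are made up for this illustration, and the 4.4 bit split and 3-digit mantissa are simply the assumptions used in the text.

```python
from decimal import Decimal

FRAC_BITS = 4  # assumed layout from the text: 4 integer bits, 4 fractional bits


def to_fixed(value, frac_bits=FRAC_BITS):
    """Encode a real number as the integer round(value * 2**frac_bits)."""
    return round(value * (1 << frac_bits))


def from_fixed(raw, frac_bits=FRAC_BITS):
    """Decode the raw integer back into a real number."""
    return raw / (1 << frac_bits)


def to_decimal_float(value, digits=3):
    """Keep only `digits` significant decimal digits.

    Returns (mantissa, exponent); the stored value is mantissa * 10**exponent.
    """
    mantissa_text, exponent_text = f"{value:.{digits - 1}e}".split("e")
    return Decimal(mantissa_text), int(exponent_text)


def decimal_float_value(mantissa, exponent):
    """Value that the (mantissa, exponent) pair actually represents."""
    return mantissa.scaleb(exponent)


# Fixed point: 3.25 becomes 0011 (integer part) followed by 0100 (fraction).
raw = to_fixed(3.25)
print(format(raw, "08b"))  # 00110100
print(from_fixed(raw))     # 3.25

# Decimal "floating point" with a 3-digit mantissa: 8192 is stored as 8.19 * 10^3.
mantissa, exponent = to_decimal_float(8192)
print(mantissa, exponent)                            # 8.19 3
print(int(decimal_float_value(mantissa, exponent)))  # 8190
```

The second half shows the rounding loss mentioned above: with only 3 mantissa digits, 8192 and 8190 become the same stored number.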

Altera_Forum

Honored Contributor I

05-07-2016
10:31 PM

219 Views

Fixed point is really just integer arithmetic with the binary point at a fixed offset (it's not just similar - it is completely identical). It uses two's complement for signed arithmetic. https://en.wikipedia.org/wiki/two%27s_complement

Floating point uses a completely different number format. It uses a sign bit, an exponent and a mantissa. https://en.wikipedia.org/wiki/floating_point Fixed point has a fixed range with fixed precision based on the number of bits. Floating point also has a fixed range, but its precision floats with the magnitude of the value; the bit width is always fixed. Floating point is computationally expensive, uses a lot of resources and has a high latency. Fixed point is cheap: few resources and low latency.
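
The "fixed point is just integer arithmetic" point can be sketched like this. It is a rough Python model, not VHDL; the 4 fractional bits, the 8-bit width and the helper names are assumptions chosen for the illustration.

```python
FRAC_BITS = 4  # assumed: 4 fractional bits, i.e. every value is scaled by 16


def to_fixed(value):
    """Real number -> raw integer (value * 2**FRAC_BITS, rounded)."""
    return round(value * (1 << FRAC_BITS))


def from_fixed(raw):
    """Raw integer -> real number."""
    return raw / (1 << FRAC_BITS)


def fixed_add(a, b):
    """Addition is a plain integer add; the binary point stays where it is."""
    return a + b


def fixed_mul(a, b):
    """Multiplication is a plain integer multiply.

    The product carries 2 * FRAC_BITS fractional bits, so it is shifted
    back by FRAC_BITS (this truncates towards minus infinity).
    """
    return (a * b) >> FRAC_BITS


def to_twos_complement(raw, width=8):
    """Bit pattern of a signed raw value in `width` bits."""
    return raw & ((1 << width) - 1)


a, b = to_fixed(3.25), to_fixed(1.5)
print(from_fixed(fixed_add(a, b)))  # 4.75
print(from_fixed(fixed_mul(a, b)))  # 4.875
print(format(to_twos_complement(to_fixed(-1.5)), "08b"))  # 11101000
```

The only fixed-point-specific step is the shift after the multiply; everything else is ordinary two's complement integer arithmetic.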
Altera_Forum

Honored Contributor I

05-08-2016
06:38 AM

219 Views

To me the main point is that floating point is needed only in a few critical cases, especially where small values must be represented as fairly as large ones.

The drawback of fixed point is the unfair representation of small values, which occupy only a few LSBs and waste many MSBs. For example, with 8 bits unsigned the difference between 200 and 201 can be represented, i.e. a relative step of 1/200 can be resolved. But we can't get the same relative resolution for lower values: 1/200 of 3 is 0.015, far below the smallest step of 1.
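
A small Python sketch of this relative-resolution argument, under the assumption of an unsigned integer format with a step of 1 on the fixed-point side and Python's 64-bit float on the floating-point side. The function names are made up for the example, and `math.ulp` needs Python 3.9 or later.

```python
import math


def fixed_relative_step(value, step=1):
    """Smallest representable change relative to `value` for a fixed step size."""
    return step / value


def float_relative_step(value):
    """Smallest representable change relative to `value` for a 64-bit float."""
    return math.ulp(float(value)) / value


for value in (200, 3):
    print(value, fixed_relative_step(value), float_relative_step(value))
```

For the fixed step, the relative resolution is 0.5% at 200 but about 33% at 3. For the float, the relative step stays within a factor of two at both magnitudes, which is what "fair" treatment of small values means here.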
Altera_Forum

Honored Contributor I

05-08-2016
08:59 AM

219 Views
