tux456

Beginner

08-28-2008
08:10 AM

Quad precision ?

I am a bit lost. I am trying to find information on how "quad precision" (REAL*16) is implemented on recent Intel CPUs (such as the Xeon 54xx) with the Intel 10.x compiler.

Is it hardware-supported, software-only, or a mix of the two?

What accuracy (in decimal digits) can we expect?

What kind of performance can we expect compared with typical double precision (Linpack, for example)?

How do Intel CPUs compare in quad precision with the IBM POWER6 architecture?

For example, a quote from the POWER6 description:

"[On Power6 ... ]The unit is effectively quad precision, offering up to 36 digit accuracy in 144 bits, although results are compressed to 128 bits to fit in two floating point registers and then decompressed before consumption. Basic operations are somewhat slower than ALU operations, with single cycle throughput, but 2 cycle latency."

thanks,

tux456

2 Replies

TimP

Black Belt

08-28-2008
09:32 AM

Quad precision would be implemented on Xeon by combinations of x87 "REAL*10" operations, so at least 2 non-vectorizable instructions would be required for each floating-point operation. For most operations, you should get 48 additional bits beyond the x87 precision, thus about 33 decimal digits.
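
As a rough check on that digit count: the x87 extended format carries a 64-bit significand, so 48 additional bits give 112 bits, and multiplying a bit count by log10(2) converts it to decimal digits. The helper name `decimal_digits` is made up for this illustration.

```python
import math

def decimal_digits(significand_bits):
    """Approximate number of decimal digits carried by a binary significand."""
    return significand_bits * math.log10(2)

X87_SIGNIFICAND_BITS = 64   # x87 extended precision ("REAL*10")
EXTRA_BITS = 48             # additional bits quoted in the post

quad_bits = X87_SIGNIFICAND_BITS + EXTRA_BITS   # 112
print(f"{quad_bits} bits is about {decimal_digits(quad_bits):.1f} decimal digits")
# prints: 112 bits is about 33.7 decimal digits
```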

Comparing Linpack performance doesn't make much sense, except to emphasize that you need roughly 5 operations per floating-point add or multiply, plus packing and unpacking time, and that you lose a further factor of, say, 2 from the lack of vectorization.

I haven't seen any documentation indicating that Power6 changed the floating-point format from the one used by previous IBM and MIPS architectures, which supports approximately 107 bits, or 31 decimal digits, with a reduced exponent range compared with REAL*8. In effect, 11 bits are wasted by carrying 2 copies of the exponent, differing by a constant. Of course, those implementations should penalize performance by only a factor of 3 or so, compared with non-vector REAL*8.
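
A format that stores one value as a pair of REAL*8 numbers is usually called "double-double". The sketch below shows the general technique in Python, whose floats are IEEE doubles. It is an illustration only, not IBM's or any compiler's actual implementation, and the names `two_sum` and `dd_add` are chosen here for the example.

```python
from fractions import Fraction

def two_sum(a, b):
    """Error-free addition: returns (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def dd_add(x, y):
    """Add two double-double values given as (hi, lo) pairs (simple variant)."""
    s, e = two_sum(x[0], y[0])   # add the high parts, capturing the rounding error
    e += x[1] + y[1]             # fold in the low parts
    return two_sum(s, e)         # renormalize into a (hi, lo) pair

one = (1.0, 0.0)
tiny = (1e-20, 0.0)

# A plain double cannot hold 1 + 1e-20: the small term is rounded away.
print(1.0 + 1e-20 == 1.0)        # True

# The pair keeps the small term in its low part.
hi, lo = dd_add(one, tiny)
print(hi, lo)                    # 1.0 1e-20

# Exact check with rational arithmetic: nothing was lost.
print(Fraction(hi) + Fraction(lo) == Fraction(1.0) + Fraction(1e-20))   # True
```

Each double-double addition expands into several ordinary double operations, which is where the slowdown relative to plain REAL*8 comes from.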

One of the design parameters for Itanium is full instruction-level support for quad precision, possibly making it a superior platform for that purpose. Needless to say, that advantage hasn't proven decisive in the marketplace.

tux456

Beginner

08-28-2008
12:54 PM

It's unfortunate that we realise too late that the Itanium2 can be useful!
