A presentation by M. Baer (DKFZ) and M. Kachelrieß (FAU) mentions a 16-bit floating-point format on Intel Xeon Phi coprocessors. I could not find any reference to it in the System Software Developer's Guide or the Instruction Set Reference Manual. Where can I find information about the 16-bit floating-point instructions?
Interesting presentation, but I don't see anything in it about 16-bit floating-point formats. It concentrates on the 512-bit wide (16 x 32-bit floats) short vector format.
The Ivy Bridge host has load and store instructions that move data between 16-bit storage and the 32-bit register format.
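For example, on the host side the conversions are exposed through the F16C intrinsics that Ivy Bridge introduced. A minimal sketch (compile with -mf16c):

```c
#include <immintrin.h>

/* Sketch: convert 8 halfs in memory to 8 floats and back, using the
   F16C VCVTPH2PS/VCVTPS2PH instructions (Ivy Bridge and later). */
void halves_to_floats8(const unsigned short *src, float *dst)
{
    __m128i h = _mm_loadu_si128((const __m128i *)src); /* 8 x float16 */
    __m256  f = _mm256_cvtph_ps(h);                    /* widen to float32 */
    _mm256_storeu_ps(dst, f);
}

void floats_to_halves8(const float *src, unsigned short *dst)
{
    __m256  f = _mm256_loadu_ps(src);
    __m128i h = _mm256_cvtps_ph(f, _MM_FROUND_TO_NEAREST_INT |
                                   _MM_FROUND_NO_EXC);  /* narrow, round to nearest */
    _mm_storeu_si128((__m128i *)dst, h);
}
```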
On slide 6 they say "Xeon Phi supports the 16 bit floating point format (half)", and then on slides 15 and 16 they report benchmarks on Xeon Phi with "floats" (32-bit, I presume) and with "halfs", of which the latter is faster.
From their diagram on slide 6 and their earlier paper, it seems that they store data as 16-bit floats to reduce the data size, but do the vector arithmetic on Xeon Phi by converting the 16-bit numbers to 32-bit floats. It is understandable if they have their own routine for the type conversion; a sketch of such a routine follows below. However, I wonder if the statement "Xeon Phi supports the 16 bit floating point format" refers to some undocumented feature of the MIC architecture.
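For reference, a scalar conversion routine of the kind they might use is short. Here is a minimal sketch (my own illustration, not code from their paper) that widens an IEEE 754 half to a 32-bit float, covering zeros, subnormals, infinities, and NaNs:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of a scalar half -> float conversion.
   Half layout: 1 sign bit, 5 exponent bits (bias 15), 10 mantissa bits. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1F;
    uint32_t mant = h & 0x3FF;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                                /* +-0 */
        } else {
            /* subnormal half: renormalize into a float32 normal */
            int e = -1;
            do { mant <<= 1; e++; } while (!(mant & 0x400));
            mant &= 0x3FF;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (mant << 13);
        }
    } else if (exp == 0x1F) {
        bits = sign | 0x7F800000u | (mant << 13);       /* inf / NaN */
    } else {
        bits = sign | ((exp - 15 + 127) << 23) | (mant << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);                        /* safe type pun */
    return f;
}
```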
Actually it IS documented, though I've never tried to use it. You've probably already grabbed a copy of the instruction set reference manual, https://software.intel.com/sites/default/files/forum/278102/327364001en.pdf. If you look at Table 2.4: 32 bit Floating-point Load-op SwizzUpConv, you'll see that one of the modes provided, 011, converts float16 to float32. In fact, that section of the document begins with this:
> Data Conversions: Sources from memory can be converted to either 32 bit signed or unsigned integer or 32 bit floating-point before being used. Supported data types in memory are float16, sint8, uint8, sint16, and uint16 for load-op instructions.
Though I've never tried it, apparently the authors of the paper you cite have had some success, as you say, in reducing memory pressure where data volumes are high but resolution requirements are low.
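If you want to try it from C, the conversion shows up in the Intel compiler's extended-load intrinsics for KNC. A minimal sketch (assuming the KNC intrinsics from immintrin.h, compiled with icc -mmic; I haven't run this myself):

```c
#include <immintrin.h>

/* Sketch: load 16 float16 values from memory, upconverting them to
   16 float32 lanes in a single extended load, then operate on them
   at full 32-bit precision. KNC only. */
__m512 scale_halves(const void *src, float factor)
{
    __m512 v = _mm512_extload_ps(src, _MM_UPCONV_PS_FLOAT16,
                                 _MM_BROADCAST32_NONE, _MM_HINT_NONE);
    return _mm512_mul_ps(v, _mm512_set1_ps(factor));
}
```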
Ah, that's it! Thank you, Robert! After your comment I was able to locate the corresponding intrinsics (gather and scatter). With the parameter _MM_UPCONV_PS_FLOAT16 (or _MM_DOWNCONV_PS_FLOAT16) one can load (or store) float16 data in memory to (or from) vector registers as float32 data. This can be very handy!
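For example, a gather that reads float16 elements and delivers float32 lanes could look like this minimal sketch (KNC intrinsics assumed; the index vector and the 2-byte scale are illustrative):

```c
#include <immintrin.h>

/* Sketch: gather 16 float16 elements addressed by a 32-bit index
   vector, upconverting to float32 on the way into the register,
   then scatter them back down-converted to float16. KNC only. */
void gather_scatter_halves(const void *src, void *dst, __m512i idx)
{
    __m512 v = _mm512_i32extgather_ps(idx, src, _MM_UPCONV_PS_FLOAT16,
                                      2 /* scale: 2-byte elements */,
                                      _MM_HINT_NONE);
    _mm512_i32extscatter_ps(dst, idx, v, _MM_DOWNCONV_PS_FLOAT16,
                            2, _MM_HINT_NONE);
}
```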
It looks like the Intel C/C++ compiler has no 16-bit float type, so I do not expect automatic vectorization to handle 16-bit floats. Intrinsics or inline assembly look like the only way to go.