GHui

Novice

06-14-2017
03:24 AM

360 Views

What are int8 and FP16?

I heard someone mention int8 and FP16, but I don't know what they are.

4 Replies

TimP

Black Belt

06-16-2017
04:46 AM

GHui

Novice

06-19-2017
07:38 PM

TimP

Black Belt

06-20-2017
05:55 PM

McCalpinJohn

Black Belt

06-21-2017
08:01 AM

SIMD operations on int8 (byte) variables are supported by MMX, SSE2, AVX, AVX2, and AVX512BW (not shipping yet).

There is pretty good support for addition/subtraction on packed byte operands:

- add/subtract with wraparound (the same instructions serve signed and unsigned bytes),
- signed add/subtract with saturation, and
- unsigned add/subtract with saturation.
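
The three flavors of byte addition listed above can be compared side by side. This is a minimal sketch using SSE2 intrinsics from `<emmintrin.h>`; the function name `add_bytes_three_ways` and its parameter layout are made up for illustration and are not from the original post.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: adds 16 bytes of a and b in three ways.
   Each output array must have room for 16 bytes. */
void add_bytes_three_ways(const uint8_t *a, const uint8_t *b,
                          uint8_t *wrap, uint8_t *sat_signed,
                          uint8_t *sat_unsigned)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* PADDB: result modulo 256 */
    _mm_storeu_si128((__m128i *)wrap, _mm_add_epi8(va, vb));
    /* PADDSB: bytes treated as signed, clamped to [-128, 127] */
    _mm_storeu_si128((__m128i *)sat_signed, _mm_adds_epi8(va, vb));
    /* PADDUSB: bytes treated as unsigned, clamped to [0, 255] */
    _mm_storeu_si128((__m128i *)sat_unsigned, _mm_adds_epu8(va, vb));
}
```

For example, adding the bytes 100 and 100 gives 200 with wraparound or unsigned saturation, but 127 with signed saturation, because 200 does not fit in a signed byte.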

Bitwise logical operations don't require special versions for byte variables -- you just need to pick a SIMD boolean operation with the right register size. The same applies for loads and stores, of course.

Element-wise MIN/MAX operations are supported for vectors of byte variables by SSE (unsigned only), SSE4_1, AVX2, and AVX512BW, while the bytewise SIMD "compare" operations (e.g., compare for equal, compare for greater than) are supported by MMX, SSE2, AVX, AVX2, and AVX512BW. The compare instructions write their result as a byte mask in a SIMD register (all ones where the condition holds, all zeros elsewhere). AVX512BW adds instructions for converting compare results between bit mask and SIMD register formats.
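
As a rough illustration of the compare and MIN/MAX operations, the sketch below uses only SSE2 intrinsics. The helper names `equal_byte_mask` and `max_unsigned_bytes` are hypothetical. The SSE2 `PMOVMSKB` instruction is used here to turn the compare result into an ordinary integer bit mask; it is not one of the AVX512BW mask conversions.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: returns a 16-bit mask with bit i set
   when a[i] == b[i], for 16 input bytes. */
int equal_byte_mask(const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PCMPEQB: each byte becomes 0xFF where equal, 0x00 otherwise */
    __m128i eq = _mm_cmpeq_epi8(va, vb);
    /* PMOVMSKB: gathers the top bit of each byte into an int */
    return _mm_movemask_epi8(eq);
}

/* Hypothetical helper: element-wise unsigned maximum of 16 bytes. */
void max_unsigned_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PMAXUB */
    _mm_storeu_si128((__m128i *)out, _mm_max_epu8(va, vb));
}
```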

Shuffle operations on byte variables are supported by SSSE3, AVX, AVX2, and AVX512BW.

Blend operations on byte variables are supported by SSE4_1, AVX, and AVX2. The special cases of selecting the maximum or minimum byte values in each position of two SIMD values are supported by SSE (unsigned only), SSE4_1, AVX, AVX2, and AVX512BW.
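
When only SSE2 is available, a bytewise blend can be emulated by combining a compare mask with the bitwise logical operations. The sketch below builds a signed byte maximum that way, which is the job a single `PMAXSB` does on SSE4_1. The function name `max_signed_bytes_sse2` is invented for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: element-wise signed maximum of 16 bytes,
   built from a compare plus an AND/ANDNOT/OR select. */
void max_signed_bytes_sse2(const int8_t *a, const int8_t *b, int8_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PCMPGTB: 0xFF in each byte where a > b (signed), else 0x00 */
    __m128i gt = _mm_cmpgt_epi8(va, vb);
    /* Select a where the mask is set, b elsewhere:
       (gt & a) | (~gt & b) */
    __m128i r = _mm_or_si128(_mm_and_si128(gt, va),
                             _mm_andnot_si128(gt, vb));
    _mm_storeu_si128((__m128i *)out, r);
}
```

The same mask-and-select pattern works for any per-byte condition, not just "greater than".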

Support for multiplication is trickier, since multiplication of two 1-byte variables produces a 2-byte result. There is a general instruction (PMADDUBSW) that multiplies the unsigned bytes of one vector by the corresponding signed bytes of another, adds each adjacent pair of 16-bit products, and saturates each sum to a signed 16-bit word. This is supported in SSSE3, AVX, AVX2, and AVX512BW. There is also a specialized instruction to compute the (rounded) average of the corresponding unsigned bytes in two SIMD registers (SSE, SSE2, AVX, AVX2, AVX512BW).
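
To make the arithmetic concrete, here is a scalar model of what one 16-bit output lane of the multiply-add instruction (`PMADDUBSW`, intrinsic `_mm_maddubs_epi16`) computes, followed by the rounded average using the SSE2 intrinsic `_mm_avg_epu8`. The scalar function is an illustrative stand-in rather than the SIMD instruction itself, and both function names are made up for this sketch.

```c
#include <emmintrin.h>  /* SSE2: _mm_avg_epu8 */
#include <stdint.h>

/* Scalar model (an assumption-free restatement of the arithmetic, not
   the instruction): two unsigned-by-signed byte products are summed
   and the sum is saturated to the signed 16-bit range. */
int16_t maddubs_lane(uint8_t u0, int8_t s0, uint8_t u1, int8_t s1)
{
    int32_t sum = (int32_t)u0 * s0 + (int32_t)u1 * s1;
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}

/* Hypothetical helper: rounded average (a + b + 1) >> 1 of
   16 unsigned bytes, computed without intermediate overflow. */
void avg_unsigned_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PAVGB */
    _mm_storeu_si128((__m128i *)out, _mm_avg_epu8(va, vb));
}
```

Saturation matters in practice: two products of 255 and 127 sum to 64770, which does not fit in a signed 16-bit word and is clamped to 32767.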

There are a number of specialized operations available for SIMD vectors of byte variables as well. Some examples include:

- PSIGNB -- changes sign of destination byte if source byte is negative, zeros destination byte if source byte is zero. (SSSE3, AVX, AVX2)
- PABSB -- returns absolute value of each (signed) input byte in SIMD register. (SSSE3, AVX, AVX2, AVX512BW)
- PSADBW -- computes differences of unsigned bytes in two SIMD registers, then horizontally adds the absolute values of those differences, returning one 16-bit result for each group of 8 bytes. (SSE, SSE2, AVX, AVX2, AVX512BW)
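
As one example of these specialized operations, the sum of absolute differences over 16 bytes can be sketched with the SSE2 intrinsic `_mm_sad_epu8`. With a 128-bit register the instruction produces one partial sum per 8-byte half, so the two halves are added at the end. The function name `sad16` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: sum of |a[i] - b[i]| over 16 unsigned bytes. */
uint32_t sad16(const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PSADBW: one partial sum in the low 16 bits of each 64-bit half */
    __m128i sad = _mm_sad_epu8(va, vb);
    uint32_t lo = (uint32_t)_mm_cvtsi128_si32(sad);
    /* Shift the upper 8 bytes down to read the second partial sum */
    uint32_t hi = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(sad, 8));
    return lo + hi;
}
```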

My mind boggles at the number of transistors that are required to implement these infrequently-used instructions, but that is part of what makes this field continually challenging....
