I noticed a recent commit to LLVM which mentions that Tremont supports GFNI without AVX-512. Based on the patch, it looks like different functions require:
- GFNI alone (since SSE2 is part of the x86_64 baseline)
- GFNI + AVX
- GFNI + AVX-512BW
- GFNI + AVX-512BW + AVX-512VL
However, the intrinsics guide shows all the 128/256-bit functions as requiring AVX-512VL. I assume this will be updated eventually now that GFNI is being split up, but how? I was hoping to use LLVM as a reference, but they use macros for some of the functions, and those don't carry the attributes that indicate which ISA extensions are required.
It makes sense to me that 128-bit functions would require GFNI alone, 256-bit would require GFNI and AVX, and 512-bit would require GFNI and AVX-512BW, but that fourth category (GFNI + AVX-512BW + AVX-512VL) is confusing me…
I'd like to tweak some of my code so I can correctly detect which group of GFNI functions is available… does anyone have any insight into exactly which functions require which ISA extensions?
You can see the GFNI instruction encodings in the Software Developer's Manual (SDM). There are three encodings:
- Legacy SSE with 0x66 prefix. Only supports 128-bit vectors xmm0-xmm15.
- VEX-encoded, which is compatible with AVX/AVX2. Supports 128 and 256-bit vectors x/ymm0-x/ymm15.
- EVEX-encoded, which is compatible with AVX-512. Supports 128, 256 and 512-bit vectors x/y/zmm0-x/y/zmm31.
In the AVX-512 case, AVX-512VL is required in order to use the EVEX encoding with 128 and 256-bit vectors. The usual difference between SSE and AVX instructions also applies: SSE-encoded instructions leave the upper bits of the destination vector register unchanged, whereas VEX and EVEX-encoded instructions zero them.
The SDM also describes the CPUID features that are required for each of the encodings to be supported:
- GFNI alone for SSE encoding
- AVX+GFNI for VEX encoding
- AVX-512F+GFNI for EVEX encoding and additionally AVX-512VL for 128 and 256-bit vectors.
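For the detection side of the question, the three cases above can be sketched roughly as follows. This is only a sketch under assumptions: the names `gfni_classify`, `gfni_detect` and the `GFNI_*` flags are made up for illustration, the bit positions are the ones I know from the SDM (GFNI is CPUID.(EAX=7,ECX=0):ECX bit 8; AVX-512F and AVX-512VL are EBX bits 16 and 31 of the same leaf; AVX and OSXSAVE are CPUID.1:ECX bits 28 and 27), and the hardware query assumes GCC or Clang on x86 with `<cpuid.h>`. It also checks XCR0, since the VEX and EVEX encodings need the OS to have enabled the corresponding register state.

```c
#include <stdint.h>

/* Hypothetical flag names, one per usable group of GFNI encodings. */
enum {
    GFNI_SSE     = 1 << 0, /* legacy SSE encoding, xmm only          */
    GFNI_VEX     = 1 << 1, /* VEX encoding, xmm/ymm                  */
    GFNI_EVEX512 = 1 << 2, /* EVEX encoding, zmm                     */
    GFNI_EVEX_VL = 1 << 3  /* EVEX encoding, xmm/ymm (needs AVX-512VL) */
};

/* Pure classification from raw register values:
 *   ecx1        = CPUID.(EAX=1):ECX
 *   ebx7, ecx7  = CPUID.(EAX=7,ECX=0):EBX / ECX
 *   xcr0        = XGETBV(0), or 0 if OSXSAVE is not set            */
unsigned gfni_classify(uint32_t ecx1, uint32_t ebx7, uint32_t ecx7,
                       uint64_t xcr0)
{
    unsigned r = 0;
    if (!((ecx7 >> 8) & 1))
        return 0;                         /* no GFNI at all */
    r |= GFNI_SSE;                        /* GFNI alone */

    int osxsave   = (ecx1 >> 27) & 1;
    int avx_os    = osxsave && (xcr0 & 0x06) == 0x06; /* XMM + YMM state */
    int avx512_os = osxsave && (xcr0 & 0xE6) == 0xE6; /* + opmask, ZMM   */

    if (avx_os && ((ecx1 >> 28) & 1))
        r |= GFNI_VEX;                    /* AVX + GFNI */
    if (avx512_os && ((ebx7 >> 16) & 1)) {
        r |= GFNI_EVEX512;                /* AVX-512F + GFNI */
        if ((ebx7 >> 31) & 1)
            r |= GFNI_EVEX_VL;            /* additionally AVX-512VL */
    }
    return r;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Query the running CPU (GCC/Clang only) and classify it. */
unsigned gfni_detect(void)
{
    unsigned a, b, c, d, ecx1 = 0, ebx7 = 0, ecx7 = 0;
    uint64_t xcr0 = 0;

    if (__get_cpuid(1, &a, &b, &c, &d))
        ecx1 = c;
    if (__get_cpuid_count(7, 0, &a, &b, &c, &d)) {
        ebx7 = b;
        ecx7 = c;
    }
    if ((ecx1 >> 27) & 1) {               /* XGETBV is only valid with OSXSAVE */
        unsigned lo, hi;
        __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
        xcr0 = ((uint64_t)hi << 32) | lo;
    }
    return gfni_classify(ecx1, ebx7, ecx7, xcr0);
}
#endif
```

On a CPU that reports GFNI but no AVX, such as the Tremont case mentioned in the question, `gfni_detect()` would be expected to return `GFNI_SSE` only. Masked intrinsics may need further features (for example AVX-512BW for byte-granular masks), which this sketch does not cover.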
As to which encoding is used for a given intrinsic, that is the compiler's decision. I believe the compiler selects the encoding based on the target ISA, as specified on the command line or in attributes applied to the function being compiled.
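To illustrate the per-function attribute approach, here is a rough sketch assuming GCC or Clang, their `target` function attribute, and `__builtin_cpu_supports`. The function names (`gf_mul_scalar`, `gf_mul16_sse`, `gf_mul32_vex`, `gf_mul_bytes`) are hypothetical. Each helper enables only the features its encoding needs, and a dispatcher picks one at run time, with a scalar fallback over the same field polynomial (0x11B) that `GF2P8MULB` uses. Note that if the whole file is built with something like `-mavx`, the compiler is free to emit the VEX form even in the first helper.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: GF(2^8) multiply modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul_scalar(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        if (b & 1)
            r ^= a;
        uint8_t hi = a & 0x80;
        a = (uint8_t)(a << 1);
        if (hi)
            a ^= 0x1B;                    /* reduce by the field polynomial */
        b >>= 1;
    }
    return r;
}

/* Only GFNI (plus baseline SSE2) is enabled here: 128-bit vectors. */
__attribute__((target("gfni,sse2")))
static void gf_mul16_sse(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_gf2p8mul_epi8(va, vb));
}

/* GFNI + AVX enabled: the 256-bit form needs the VEX encoding. */
__attribute__((target("gfni,avx")))
static void gf_mul32_vex(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)dst, _mm256_gf2p8mul_epi8(va, vb));
}

/* Element-wise GF(2^8) product of two byte arrays, dispatched at run time. */
void gf_mul_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
    __builtin_cpu_init();
    if (__builtin_cpu_supports("gfni")) {
        if (__builtin_cpu_supports("avx"))
            for (; i + 32 <= n; i += 32)
                gf_mul32_vex(dst + i, a + i, b + i);
        for (; i + 16 <= n; i += 16)
            gf_mul16_sse(dst + i, a + i, b + i);
    }
    for (; i < n; i++)                    /* tail, or CPUs without GFNI */
        dst[i] = gf_mul_scalar(a[i], b[i]);
}
```

The 512-bit intrinsics are left out on purpose: which AVX-512 features the headers demand for them (AVX-512F versus AVX-512BW) appears to differ between compilers and versions, so it is worth checking your compiler's headers before adding a third helper.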