I've hit a strange problem. MIC has otherwise been working so far. I've been doing most operations on 16-bit integers (read 16 bits, do the vector operation in 32 bits, write back 16 bits). But I need to do some operations in 32 bits (writing back 32-bit vectors), and somehow the result does not come out correctly. Am I doing something wrong, or is it a hardware problem?
I've created a simple program to show the problem:
#include <stdio.h>
#include <immintrin.h>

void _m512i_vec_dump( __m512i vec )
{
    int vec_dump[16] __attribute__((aligned(64)));
    //*((__m512i *) vec_dump) = vec;
    _mm512_store_epi32(vec_dump, vec);
    printf( "%08hx %08x %08hx %08hx %08hx %08x %08hx %08hx\n", vec_dump[0], vec_dump[1], vec_dump[2], vec_dump[3], vec_dump[4], vec_dump[5], vec_dump[6], vec_dump[7] );
    printf( "%08hx %08x %08hx %08hx %08hx %08x %08hx %08hx\n\n", vec_dump[8], vec_dump[9], vec_dump[10], vec_dump[11], vec_dump[12], vec_dump[13], vec_dump[14], vec_dump[15] );
}

int main()
{
    __m512i vec;
    unsigned short test[128] __attribute__((aligned(64)));
    for ( int i = 0; i < 128; i++ ) { test[i] = -i-1; }
    vec = _mm512_extload_epi32( &test[ 0 ], _MM_UPCONV_EPI32_SINT16, _MM_BROADCAST32_NONE, _MM_HINT_NONE );
    _m512i_vec_dump(vec);
}
The output is:
0000ffff fffffffe 0000fffd 0000fffc 0000fffb fffffffa 0000fff9 0000fff8
0000fff7 fffffff6 0000fff5 0000fff4 0000fff3 fffffff2 0000fff1 0000fff0
Note the upper bits: they should ALL be ffffffff.
I am using icc version 13.1.3.
So what am I doing wrong, or does anyone else have this issue?
The formatting of the code got mangled, so I am reposting it reformatted for better legibility.
By the way, does anyone know the gcc inline asm constraint for the MIC vector registers? Someone mentioned that the constraint for the mask registers is 'k', but I could not find the constraint for the vector registers in the documentation. I was trying to substitute inline asm for the intrinsics to see whether there is a bug in the intrinsics implementation.
James C. wrote:
#include <stdio.h>
#include <immintrin.h>

void _m512i_vec_dump( __m512i vec )
{
    int vec_dump[16] __attribute__((aligned(64)));
    //*((__m512i *) vec_dump) = vec;
    _mm512_store_epi32(vec_dump, vec);
    printf( "%08hx %08x %08hx %08hx %08hx %08x %08hx %08hx\n", vec_dump[0], vec_dump[1], vec_dump[2],
            vec_dump[3], vec_dump[4], vec_dump[5], vec_dump[6], vec_dump[7] );
    printf( "%08hx %08x %08hx %08hx %08hx %08x %08hx %08hx\n\n", vec_dump[8], vec_dump[9],
            vec_dump[10], vec_dump[11], vec_dump[12], vec_dump[13], vec_dump[14], vec_dump[15] );
}

int main()
{
    __m512i vec;
    unsigned short test[128] __attribute__((aligned(64)));
    for ( int i = 0; i < 128; i++ ) { test[i] = -i-1; }
    vec = _mm512_extload_epi32( &test[ 0 ], _MM_UPCONV_EPI32_SINT16, _MM_BROADCAST32_NONE, _MM_HINT_NONE );
    _m512i_vec_dump(vec);
}
I asked our intrinsics expert about the output and the constraints you mentioned. He wrote:
There is nothing wrong with _mm512_store_epi32, and everything is stored correctly. This is a user misunderstanding: the 'h' length modifier is being used in the print format:
printf( "%08hx %08x %08hx %08hx %08hx %08x %08hx %08hx\n", vec_dump[0], vec_dump[1], vec_dump[2], vec_dump[3], vec_dump[4], vec_dump[5], vec_dump[6], vec_dump[7] );
which causes each value to be converted to 'unsigned short' before printing, so only the low 16 bits are shown. Replacing %08hx with %08x will print the correct results.
The constraint for 512-bit registers is 'v'.
The constraint for any mask register (including 'k0') is 'k'.
And the constraint for a writemask register (which excludes 'k0') is 'Yk'.
For example:
void foo(__m512 v1, __m512 v2, __m512 v3, __mmask k1) {
__asm("vmulps %2, %1, %0 {{%3}}" : "=v" (v1): "v"(v2), "v"(v3), "Yk"(k1));
}
Hope that helps.
You are right. That was embarrassing. Thanks.
By the way, what is the vector register constraint code for gcc-style inline asm?
The constraint for the 512-bit vector register is "v". See the Developer's reply for the details on this and the other constraints.
