Hi,
I have reduced my problem to this test:
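#include <stdio.h>
#include <immintrin.h>

int main()
{
    int i;
    float tmp[16];

    /* Fill the buffer with 5.0 and print it. */
    for (i = 0; i < 16; i++) {
        tmp[i] = 5.0f;
        printf("%f ", tmp[i]);
    }
    printf("\n");

    __m512 __vtmp = _mm512_set1_ps(10.0f);
    __mmask16 mask = 0x0040; /* only bit 6 set */

    _mm512_mask_extpackstorelo_ps(&tmp, mask, __vtmp, _MM_DOWNCONV_PS_NONE, 0);

    /* Print the buffer again after the pack-store. */
    for (i = 0; i < 16; i++) {
        printf("%f ", tmp[i]);
    }
    printf("\n");
    return 0;
}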
According to the description in the ISA manual, with the mask 0x0040 (only bit 6 set) the first element of 'tmp' shouldn't be written. However, the output of this code is:
5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
10.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
Looking at the assembly, for some reason the 0x0040 is being translated to $1:
stmxcsr 64(%rsp) #4.1 c1
movl $1, %eax #11.23 c2
vprefetche0 (%rsp) #10.9 c2
orl $32832, 64(%rsp) #4.1 c6
kmov %eax, %k1 #11.23 c6
ldmxcsr 64(%rsp) #4.1 c10
vbroadcastsd .L_2il0floatpacket.1(%rip), %zmm0{%k1} #11.23 c11
xorl %ecx, %ecx #8.5 c15
movl $1084227584, %edx #10.9 c15
xorl %r12d, %r12d #8.5 c19
vpackstorelpd %zmm0, 72(%rsp){%k1} #11.23 c19
movl %edx, %ebx #11.23 c23
movq %rcx, %r15
Am I missing something?
I'm using icc (ICC) 14.0.2 20140120
Thank you
1 Reply
Sorry, I misunderstood the meaning of the instruction as described in the Intrinsic Guide.
It's much clearer in the Reference Manual.
Could someone delete this post, please?
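For anyone who lands here with the same confusion: as far as I understand the Reference Manual, a pack-store does not keep each element at its own index. It writes the mask-enabled elements one after another starting at the destination address, so with only bit 6 set the single selected value ends up in tmp[0]. Below is a plain scalar C sketch of that difference. The function names model_pack_store and model_masked_store are made up for illustration, and the model assumes the destination is 64-byte aligned, so the 64-byte boundary that limits the "lo" half of a real pack-store never comes into play.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a pack-store (not an Intel API).
 * Mask-enabled elements of src are written contiguously from dst[0].
 * Assumption: dst is 64-byte aligned, so no boundary cut-off is modelled.
 * Returns the number of elements written. */
size_t model_pack_store(float *dst, uint16_t mask, const float src[16])
{
    assert(dst != NULL && src != NULL);
    size_t out = 0;
    for (int i = 0; i < 16; i++) {
        if (mask & (1u << i)) {
            dst[out++] = src[i];   /* packed: next free slot, not slot i */
        }
    }
    return out;
}

/* Hypothetical scalar model of an ordinary masked store (not an Intel API).
 * Each mask-enabled element is written to its own index in dst. */
void model_masked_store(float *dst, uint16_t mask, const float src[16])
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < 16; i++) {
        if (mask & (1u << i)) {
            dst[i] = src[i];       /* position-preserving: slot i */
        }
    }
}
```

With a buffer of 5.0 values, a source of 10.0 values and the mask 0x0040, model_pack_store changes only element 0, which matches the output in the question, while model_masked_store changes only element 6, which is what the question expected. If the position-preserving behaviour is what you need, a masked store intrinsic such as _mm512_mask_store_ps is probably the closer fit, keeping in mind that it needs an aligned address.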
