How to use scatter/gather operations on MIC

shiva_rama_krishna_b

Hi,

I want to use scatter/gather operations on MIC, but I could not find any example showing how to use them. I wrote a sample program that scatters an array. If I comment out the _mm512_i32scatter_ps call, the program runs without any problem. If I keep the call, I get the error "offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)". Could someone please help me out?

#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>

int main()
{
    printf("CPU\n");

#pragma offload target(mic)
    {
        int* a = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            a[i] = i;
        }
        int* index = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            index[i] = 15 - i;
        }
        printf("xeonphi\n");
        int* b = (int*)_mm_malloc(sizeof(int)*16, 64);
        __m512i index1 = _mm512_load_epi32((void*)index);
        __m512 v1 = _mm512_load_ps((void*)a);
        /* This is the call that triggers the SIGSEGV. */
        _mm512_i32scatter_ps((void*)b, index1, v1, 1);
    }
}

1 Solution
Leonardo_B_Intel
Employee

Hi,

Please define indexes that are aligned to the boundaries of the data type in use. Note that the first argument in extern void __cdecl _mm512_i32scatter_ps(void* mv, __m512i index, __m512 v1, int scale) is byte addressed. Because your code passes a scale of 1, each index is used as a byte offset from that address, so the "index" values must take the element size into account. Try changing

index[i] = 15-i;

to

index[i] = (15-i)*sizeof(int);

Best,
Leo.

shiva_rama_krishna_b

Thank you, Leo.

Your suggestion works.

I actually want to scatter 1024 elements into two groups: elements at even indices go into one group and elements at odd indices into the other. For instance, if my input array is {0,1,2,3,4,5,6,7,8,9,10}, the output should be {0,2,4,6,8,10,1,3,5,7,9}, or two separate arrays {0,2,4,6,8,10} and {1,3,5,7,9}.

One possible solution is to create a scatter index array = {0,6,1,7,2,8,3,9,4,10,5}.

But that is a strided access. Is there any intrinsic that serves this purpose?

Leonardo_B_Intel
Employee

Hello Shiva,

Have you looked at either the permute intrinsic (_mm512_mask_permutevar_epi32) or the swizzle intrinsics (_mm512_swizzle_*)? With proper masks you should be able to get what you want.

For example, you could load groups of 16 elements of your input array into a register and use permute or swizzle to populate two vector registers: one for the odd subset and another for the even subset.

I hope this helps,

Leo.
