Hi,
I want to use the scatter/gather operations on MIC, but I could not find any example that shows how to use them. I wrote a sample program that scatters an array. If I comment out the "_mm512_i32scatter_ps" call, the program runs without any problem. With the call in place, I get the error "offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)". Could someone please help me out?
#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>

int main()
{
    printf("CPU\n");
    #pragma offload target(mic)
    {
        /* source values 0..15, 64-byte aligned for the vector load */
        int* a = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            a[i] = i;
        }
        /* reversed element indices 15..0 */
        int* index = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            index[i] = 15-i;
        }
        printf("xeonphi\n");
        int* b = (int*)_mm_malloc(sizeof(int)*16, 64);
        __m512i index1 = _mm512_load_epi32((void*)index);
        __m512 v1 = _mm512_load_ps((void*)a);
        /* this call triggers the SIGSEGV */
        _mm512_i32scatter_ps((void*)b, index1, v1, 1);
    }
}
Hi,
Please try to define indexes that are aligned to the boundaries of the datatype in use. Note that in extern void __cdecl _mm512_i32scatter_ps(void* mv, __m512i index, __m512 v1, int scale) the destination of each element is computed in bytes, as mv + index * scale. You are passing scale = 1, so every index is treated as a byte offset, and the "index" values should take that into account. Try changing index[i] = 15-i; to index[i] = (15-i)*sizeof(int);
Best,
Leo.
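To make the addressing rule concrete, here is a small scalar sketch in plain C that runs on any host. It is only a model of the scatter, not the intrinsic itself: scatter16_model and the two reverse_* helpers are hypothetical names invented for this illustration, and the set of accepted scale values (1, 2, 4, 8) is an assumption taken from the usual x86 scaling rules. It shows two equivalent ways to reverse 16 floats: byte offsets with scale 1, or element indices with scale sizeof(float).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scalar model of a 16-lane 32-bit scatter (hypothetical helper, not an
   Intel API): lane i is stored at byte address base + index[i] * scale. */
void scatter16_model(void *base, const int index[16],
                     const float value[16], int scale)
{
    /* Assumption: only the usual x86 scale factors are accepted. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    for (int i = 0; i < 16; i++) {
        unsigned char *dst =
            (unsigned char *)base + (ptrdiff_t)index[i] * scale;
        memcpy(dst, &value[i], sizeof(float));
    }
}

/* Variant 1: indices are byte offsets, scale is 1. */
void reverse_with_byte_offsets(const float in[16], float out[16])
{
    int index[16];
    for (int i = 0; i < 16; i++)
        index[i] = (15 - i) * (int)sizeof(float);
    scatter16_model(out, index, in, 1);
}

/* Variant 2: indices are element numbers, scale is the element size. */
void reverse_with_element_indices(const float in[16], float out[16])
{
    int index[16];
    for (int i = 0; i < 16; i++)
        index[i] = 15 - i;
    scatter16_model(out, index, in, (int)sizeof(float));
}
```

Both variants write the same bytes. With element indices 15-i and scale 1, as in the original program, the model would instead write each 4-byte float at byte offsets 15, 14, 13, ..., which are not multiples of the element size.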
Thank you, Leo.
Your suggestion works.
What I actually want is to scatter 1024 elements into two groups: the elements at even indices in one group and the elements at odd indices in the other. For instance, if my input array is {0,1,2,3,4,5,6,7,8,9,10}, the output should be {0,2,4,6,8,10,1,3,5,7,9}, or two separate arrays {0,2,4,6,8,10} and {1,3,5,7,9}.
One possible solution is to create an index array {0,6,1,7,2,8,3,9,4,10,5}.
But this is really a strided access pattern. Is there an intrinsic that serves this purpose?
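As a rough illustration of the index-array idea, the destination of each input element can be computed instead of written out by hand. The sketch below is plain scalar C, and both function names are hypothetical ones made up for this example. Even positions go to the front of the output and odd positions follow them.

```c
#include <assert.h>
#include <stddef.h>

/* Fill index[i] with the destination of input element i, so that the
   even-indexed elements come first and the odd-indexed ones follow.
   (Hypothetical helper for illustration.) */
void build_even_odd_index(int *index, size_t n)
{
    size_t n_even = (n + 1) / 2;   /* number of even positions 0, 2, 4, ... */
    for (size_t i = 0; i < n; i++)
        index[i] = (int)((i % 2 == 0) ? i / 2 : n_even + i / 2);
}

/* Scalar scatter: out[index[i]] = in[i]. */
void scatter_by_index(const int *in, const int *index, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(index[i] >= 0 && (size_t)index[i] < n);
        out[index[i]] = in[i];
    }
}
```

For n = 11 the computed destinations are {0,6,1,7,2,8,3,9,4,10,5}, and scattering {0,...,10} through them gives {0,2,4,6,8,10,1,3,5,7,9}. These are element indices, so a vector scatter would need scale = sizeof(int), or the values multiplied by sizeof(int) with scale 1.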
Hello Shiva,
Have you looked at either the permute instruction (_mm512_mask_permutevar_epi32) or the swizzle instructions (_mm512_swizzle_*)? With the proper masks you should be able to get what you want.
For example, load groups of 16 elements of your input array into a register and use permute or swizzle to populate two vector registers: one for the odd subset and one for the even subset.
I hope this helps,
Leo.
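The block-wise split described above can be modelled in scalar C to check the data movement before writing the intrinsic version. This is only a sketch under assumptions: split_even_odd_blocks and LANES are hypothetical names, the input length is assumed to be a multiple of 32, and the inner loop stands in for whatever permute or swizzle sequence ends up being used. Two 16-element groups of input produce one full 16-lane group of even-position elements and one of odd-position elements.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 16 };   /* 32-bit lanes in one 512-bit register */

/* Scalar model of the block-wise even/odd split (hypothetical helper).
   even[] and odd[] must each hold n / 2 elements. */
void split_even_odd_blocks(const float *in, float *even, float *odd, size_t n)
{
    /* Assumption: n is a multiple of two registers' worth of elements. */
    assert(n % (2 * LANES) == 0);
    for (size_t base = 0; base < n; base += 2 * LANES) {
        /* Each pass consumes 32 inputs and fills 16 lanes of each output. */
        for (size_t lane = 0; lane < LANES; lane++) {
            even[base / 2 + lane] = in[base + 2 * lane];
            odd[base / 2 + lane]  = in[base + 2 * lane + 1];
        }
    }
}
```

With 1024 inputs this runs 32 passes and leaves 512 elements in each output array, stored contiguously, so the stores in a vector version could be plain aligned stores rather than scatters.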
