Load data in AVX512 vector with strides

Rizwan1 · ‎04-11-2022

Hi,

I need to load data with a stride into an AVX512 vector register. What is the best way to do this?

Let's suppose the stride is 1000 and I need to load the elements at indices 0, 1*1000, 2*1000, 3*1000, 4*1000, 5*1000, 6*1000 and 7*1000 into one AVX512 vector register.

What is the fastest way to do this, and which intrinsic should be used? The data is double-precision floating point.

NoorjahanSk_Intel · ‎04-13-2022

Hi,

Thanks for reaching out to us.

Could you please try the _mm512_i32gather_pd intrinsic to load the data into an AVX512 register? It takes an index vector together with a scale factor, so the strided elements can be addressed from a single base pointer.

Please refer to the link below for more details:

https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#=undefined&ig_expand=3910,4260,6145,3907,3913&text=%25252520_mm512_i32gather
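As a rough illustration of the suggestion above, here is a minimal sketch for the exact case in the question (stride 1000, eight doubles). The helper name load_strided8, the STRIDE constant and the src_len parameter are made up for this example, and the scalar branch is only there so the code still builds when AVX-512F is not enabled (for example without -mavx512f on GCC/Clang).

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

enum { STRIDE = 1000 }; /* stride in elements, taken from the question */

/* Hypothetical helper: copies src[0], src[STRIDE], ..., src[7*STRIDE]
   into out[0..7]. src_len is the number of doubles readable at src. */
void load_strided8(const double *src, size_t src_len, double *out)
{
    assert(src_len > 7 * (size_t)STRIDE);
#if defined(__AVX512F__)
    /* Eight 32-bit element indices; the scale of 8 (sizeof(double))
       turns each index into a byte offset from src. */
    const __m256i vidx = _mm256_setr_epi32(0, 1 * STRIDE, 2 * STRIDE, 3 * STRIDE,
                                           4 * STRIDE, 5 * STRIDE, 6 * STRIDE, 7 * STRIDE);
    const __m512d v = _mm512_i32gather_pd(vidx, src, 8);
    _mm512_storeu_pd(out, v);
#else
    /* Scalar fallback producing the same result without AVX-512F. */
    for (int k = 0; k < 8; ++k)
        out[k] = src[(size_t)k * STRIDE];
#endif
}
```

The 32-bit index form is used here because all eight offsets fit comfortably in an int; _mm512_i64gather_pd takes 64-bit indices in a __m512i instead.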

>>What is the fastest way to do this.

For efficiency, do not update the indexes on each iteration. Instead, keep the index vector constant and advance the base address to the first element of the next strided group to load (this conserves a vector register).
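One way to read this advice is sketched below, under some assumptions: the function name gather_strided_groups, its parameters, and the packed layout of dst are all hypothetical, chosen only to show the index vector being built once while the base pointer advances. The scalar branch is a fallback for builds without AVX-512F.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: for each j in [0, count), copies the 8 elements
   src[j], src[j + stride], ..., src[j + 7*stride] to dst[8*j .. 8*j + 7].
   The caller must make src readable up to index 7*stride + count - 1. */
void gather_strided_groups(const double *src, int stride, size_t count, double *dst)
{
    assert(stride > 0 && stride <= INT_MAX / 7);
#if defined(__AVX512F__)
    /* Built once, outside the loop, and never modified. */
    const __m256i vidx = _mm256_setr_epi32(0, stride, 2 * stride, 3 * stride,
                                           4 * stride, 5 * stride, 6 * stride, 7 * stride);
    for (size_t j = 0; j < count; ++j) {
        /* Only the base address moves: src + j is the first element of group j. */
        const __m512d v = _mm512_i32gather_pd(vidx, src + j, 8);
        _mm512_storeu_pd(dst + 8 * j, v);
    }
#else
    /* Scalar fallback producing the same result without AVX-512F. */
    for (size_t j = 0; j < count; ++j)
        for (int k = 0; k < 8; ++k)
            dst[8 * j + k] = src[j + (size_t)k * (size_t)stride];
#endif
}
```

Recomputing the indexes per iteration would need an extra vector add and a register for the increment, which this form avoids.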

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel · ‎04-20-2022

Hi,

We haven't heard back from you. Could you please provide an update on your issue?

Thanks & Regards,

Noorjahan.

Rizwan1 · ‎04-21-2022

Thanks. These intrinsics do what I need, but the performance is not what I was expecting. Copying the strided data using a Cilk array performs better than using AVX512.

According to the Intel Intrinsics Guide, gather latency is high compared to a load or store. But I am seeing the performance degradation on the store operation rather than on the gather.

__m512d a0 = _mm512_i64gather_pd(vidx, &AS[source_location], 8);
_mm512_storeu_pd(&AD[destination_location], a0);

These copy commands are within nested loops that are parallelized using OpenMP. The source and destination locations change in every iteration. I observed that the store operation takes most of the time. Is there a good way to optimize it?

Thanks

NoorjahanSk_Intel · ‎04-22-2022

Hi,

Could you please provide us with a complete reproducer, so that we can investigate more on your issue?

Thanks & Regards

Noorjahan.

NoorjahanSk_Intel · ‎04-29-2022

Hi,

We haven't heard back from you. Could you please provide an update on your issue along with the above-requested details?

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel · ‎05-08-2022

Hi,

I have not heard back from you, so I will close this inquiry now. If you need further assistance, please post a new question.

Thanks & Regards,

Noorjahan.