Intel® oneAPI Math Kernel Library

fastest [parallel] way to initialize a column vector?

Azua_Garcia__Giovann

Hello,

I could not find a BLAS Level 1 function that fits this need and was wondering whether MKL implements one. I know the STL and Boost offer e.g. fill, but I would prefer an implementation that takes locality into account and parallelizes.

I create a column vector and want to initialize it to a given double value. memset only works for zeroing out the memory, and AFAIK it boils down to a not-so-efficient loop anyway.

Below is my C code (pending a move to C++). Is there a way to do this in a more locality-aware, vectorized, parallel, high-performance way using MKL?

Many thanks in advance,
Best regards,
Giovanni

/**
 * Initialize a vector with a default value
 */
tvector* vector_init_with_default(int capacity, double value) {
    tvector* result = vector_init_static(capacity);
    /* Store the value into every element; the index must be applied to data. */
    for (int i = 0; i < capacity; ++i) {
        result->data[i] = value;
    }

    return result;
}

TimP
Any auto-vectorizing compiler should generate efficient code for this loop. Since you give no hint about the array size: with icc you may gain an advantage from #pragma vector nontemporal if you expect the array to span several 4 KB pages.
memset should switch automatically to nontemporal stores for suitably large arrays; the inefficiency lies in the time taken to determine which branch to take, which is insignificant if the array is that large. icc will substitute memset automatically if it knows value == 0.
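
For illustration, the suggestion above could look roughly like the following. This is only a sketch: fill_double is a hypothetical helper name (it is not an MKL routine), it works on a raw double array rather than the tvector type from the question, and the nontemporal hint is guarded so that it is only seen by the classic Intel compiler. Whether the hint actually helps depends on the array size and the hardware, so it is worth measuring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MKL): set data[0..n-1] to value.
 * The loop is simple enough for an auto-vectorizing compiler to handle. */
static void fill_double(double *data, size_t n, double value) {
    assert(data != NULL || n == 0);
#if defined(__INTEL_COMPILER)
    /* icc-specific hint asking for streaming (nontemporal) stores on the
     * next loop; assumed useful only for arrays much larger than cache. */
#pragma vector nontemporal
#endif
    for (size_t i = 0; i < n; ++i) {
        data[i] = value;
    }
}
```

In the code from the question this would be called as fill_double(result->data, capacity, value), assuming data is a double pointer.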

Gennady_F_Intel
>> is there a way to do this in a more locality/vectorized/parallel high performance way using MKL?
There are no specific MKL functions for this operation, for the reasons given in Timothy's answer above.