
fastest [parallel] way to initialize a column vector?


Hello,

I could not find a BLAS-1 function that fits this bill and was wondering whether MKL has an implementation for it. I know the STL and Boost have e.g. fill, but I would prefer an implementation that takes locality into account and parallelizes.

I create a column vector and want to initialize it to a given double value. memset only works for zeroing out the memory, and AFAIK it boils down to a not-so-efficient loop anyway.

Below is my C code (pending a move to C++). Is there a way to do this in a more locality-aware, vectorized, parallel, high-performance way using MKL?

Many thanks in advance,
Best regards,
Giovanni

/**
 * Initialize a vector with a default value
 */
tvector* vector_init_with_default(int capacity, double value) {
    tvector* result = vector_init_static(capacity);
    /* Store the value into every element, not into the pointer itself. */
    for (int i = 0; i < capacity; ++i) {
        result->data[i] = value;
    }

    return result;
}


Accepted Solutions
TimP
Black Belt
151 Views
Any auto-vectorizing compiler should generate effective code. As you don't give any hints about array size, you may get an advantage with icc from #pragma vector nontemporal if you expect the array to span several 4KB pages.
memset should switch automatically into nontemporal stores for suitably large arrays; the inefficiency is in the time taken to determine which branch to take, which isn't significant if the array is that large. icc will substitute memset automatically if it knows value==0.
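
To make the advice above concrete, here is a minimal sketch of a plain fill loop that an auto-vectorizing compiler can handle, with the icc-only nontemporal hint guarded so other compilers skip it. The names fill_double and zero_double are hypothetical helpers, not part of MKL or of the original code, and the memset path assumes IEEE 754 doubles, where all-zero bytes represent 0.0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store `value` into each of the n doubles at `data`.
 * The loop has no dependencies, so an auto-vectorizing compiler can emit
 * SIMD stores for it. */
static void fill_double(double *restrict data, size_t n, double value) {
#if defined(__INTEL_COMPILER)
    /* icc-specific hint: request streaming (nontemporal) stores.
     * Likely only worthwhile when the array spans several 4KB pages. */
#pragma vector nontemporal
#endif
    for (size_t i = 0; i < n; ++i) {
        data[i] = value;
    }
}

/* Hypothetical helper: zero the array with memset.
 * Assumes IEEE 754 doubles, where all-zero bytes mean 0.0. */
static void zero_double(double *data, size_t n) {
    memset(data, 0, n * sizeof *data);
}
```

Whether the nontemporal hint pays off depends on the array size and on whether the data is read again soon afterwards, so it is worth timing both variants on the target machine.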


2 Replies

Gennady_F_Intel
Moderator
151 Views
>> is there a way to do this in a more locality/vectorized/parallel high performance way using MKL?
There are no specific MKL functions for this operation because of ... see Timothy's answer above.
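
Since MKL offers no dedicated routine, the parallel part of the question would have to be handled by hand. The following is only a sketch under the assumption that OpenMP 4.0 or later is available (for example by building with the compiler's OpenMP flag); fill_double_parallel is a hypothetical name, and without OpenMP enabled the pragma is ignored and the loop runs serially.

```c
#include <stddef.h>

/* Hypothetical helper: fill n doubles with `value`, splitting the range
 * across OpenMP threads and vectorizing each thread's chunk.
 * Assumes OpenMP 4.0+; without it the pragma is ignored. */
static void fill_double_parallel(double *data, size_t n, double value) {
    /* A signed loop index is accepted by all OpenMP versions. */
    long long count = (long long)n;
    long long i;
#pragma omp parallel for simd schedule(static)
    for (i = 0; i < count; ++i) {
        data[i] = value;
    }
}
```

With a static schedule each thread writes one contiguous chunk, so on systems with a first-touch page placement policy the pages tend to end up near the thread that initialized them. That helps locality only if later loops over the vector use the same partitioning, and for small vectors the thread start-up cost will likely outweigh any gain.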