Topic in Intel® oneAPI Math Kernel Library

Vector array in MKL

saar_w_ — Tue, 28 Jun 2016 17:04:49 GMT

Hi, does anyone know the best MKL equivalent of ippmMul_vaca_64f?

Thanks,

Saar.

Ying_H_Intel — Wed, 29 Jun 2016 01:36:44 GMT

Hi Saar,

I guess you already know that ippm was deprecated in the latest IPP version; the replacement may be an MKL BLAS function or a VML (Vector Math) function.

https://software.intel.com/en-us/articles/the-alternatives-for-intel-ipp-legacy-small-matrices-domain

I am not sure what your array layout, next operation, vector size, machine type, etc. are. You can search the MKL reference manual; a few functions can do scalar * vector,

for example, cblas_dscal().

The ?scal routines perform a vector operation defined as x := a*x,
where a is a scalar and x is an n-element vector.

The ?axpy routines perform a vector-vector operation defined as
y := a*x + y

VML functions: vdLinearFrac( n, a, b, scalea, shifta, scaleb, shiftb, y ) computes y[i] = (scalea·a[i] + shifta)/(scaleb·b[i] + shiftb), i = 1, 2, …, n

Or you can build a matrix * vector operation from your vector array, which usually performs better.
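
To make the two BLAS options concrete, here is a minimal sketch of cblas_dscal() and cblas_daxpy() on unit-stride vectors. The wrapper names scale_in_place and scale_and_add and the USE_MKL build flag are assumptions made for this illustration, not part of MKL; without USE_MKL the block falls back to plain-C stand-ins that only handle positive strides.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_MKL /* hypothetical build flag: define it when building against MKL */
#include <mkl.h>
#else
/* Plain-C stand-ins with the same argument order as the CBLAS routines,
   so the sketch compiles without MKL. Positive strides only. */
static void cblas_dscal(int n, double a, double *x, int incx)
{
    for (int i = 0; i < n; ++i)
        x[(size_t)i * (size_t)incx] *= a;
}

static void cblas_daxpy(int n, double a, const double *x, int incx,
                        double *y, int incy)
{
    for (int i = 0; i < n; ++i)
        y[(size_t)i * (size_t)incy] += a * x[(size_t)i * (size_t)incx];
}
#endif

/* x := a*x, in place (hypothetical wrapper name). */
void scale_in_place(int n, double a, double *x)
{
    assert(n >= 0);
    cblas_dscal(n, a, x, 1);
}

/* y := a*x + y (hypothetical wrapper name). */
void scale_and_add(int n, double a, const double *x, double *y)
{
    assert(n >= 0);
    cblas_daxpy(n, a, x, 1, y, 1);
}
```

Note that ?scal overwrites its input, while ?axpy accumulates into y, so y has to be zeroed first if a plain out-of-place scale is wanted.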

Best Regards,
Ying

saar_w_ — Wed, 29 Jun 2016 06:40:43 GMT

Hi Ying, and thank you for your quick reply.

I am familiar with the cblas functions, but my problem is specific to batch operations.

ippmMul_vaca_64f multiplies an array of vectors by an array of scalars,

just like calling ?axpy multiple times. The difference is that it multiplies all the vectors at once, multithreaded. How can I get the same result in MKL?
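
For reference, the batch operation described above can be sketched in plain C as follows. This is based only on the description in this post, not on the actual IPP signature; the function name mul_vaca_ref and the contiguous layout (count vectors of length len stored back to back) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Reference semantics (hypothetical helper, not an IPP or MKL function):
   dst_k := val[k] * src_k for k = 0 .. count-1, where vector k occupies
   elements [k*len, (k+1)*len) of a contiguous array. */
void mul_vaca_ref(const double *src, const double *val, double *dst,
                  size_t count, size_t len)
{
    assert(src != NULL && val != NULL && dst != NULL);
    for (size_t k = 0; k < count; ++k) {
        for (size_t i = 0; i < len; ++i) {
            dst[k * len + i] = val[k] * src[k * len + i];
        }
    }
}
```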

Ying_H_Intel — Fri, 01 Jul 2016 07:47:59 GMT

Hi Saar,

Right, the current MKL provides a batch function only for dgemm, not for other routines. Do you have a follow-up operation after you get aX1, bX2, cX3, ... (where a, b, c are constants and Xi is a vector)? And what are the length and the number of vectors in your array? If there is no follow-up operation, it seems you have to call a BLAS function several times, or you can use OpenMP to parallelize the calls.
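
The OpenMP suggestion could look roughly like the sketch below: one cblas_dscal() call per vector, with the loop over vectors parallelized. The function name scale_vector_array, the contiguous in-place layout, and the USE_MKL build flag are assumptions for illustration; without USE_MKL a plain-C stand-in replaces the BLAS call, and without OpenMP enabled the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_MKL /* hypothetical build flag: define it when building against MKL */
#include <mkl.h>
#else
/* Plain-C stand-in with the same argument order as cblas_dscal,
   so the sketch compiles without MKL. Positive strides only. */
static void cblas_dscal(int n, double a, double *x, int incx)
{
    for (int i = 0; i < n; ++i)
        x[(size_t)i * (size_t)incx] *= a;
}
#endif

/* In place: x_k := val[k] * x_k for k = 0 .. count-1, where vector k
   starts at x + k*len (hypothetical helper name and layout). */
void scale_vector_array(double *x, const double *val, int count, int len)
{
    int k;
    assert(count >= 0 && len >= 0);
    /* Each iteration touches a disjoint slice of x, so the iterations
       are independent and can be split across threads. */
#ifdef _OPENMP
#pragma omp parallel for schedule(static)
#endif
    for (k = 0; k < count; ++k) {
        cblas_dscal(len, val[k], x + (size_t)k * (size_t)len, 1);
    }
}
```

Whether this beats a serial loop depends on the vector length and the number of vectors, which is why those two numbers matter here.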

Best Regards,
Ying