How to implement NumPy broadcasting with MKL?
I am confused about how to use MKL to efficiently implement NumPy's broadcasting mechanism (element-wise operators "+", "-", "*").
For example, a 2D array minus a 1D array:
[[1,2,3],
[4,5,6],
[7,8,9]]
-
[1, 2, 3]
=
[[0, 0, 0],
[3, 3, 3],
[6, 6, 6]]
The second operation is an element-wise multiplication of a 2D array by a 1D array (it can be understood as a matrix multiplied by a diagonal matrix):
[[1,2,3],
[4,5,6],
[7,8,9]]
*
[1, 2, 3]
=
[[1, 4, 9],
[4, 10, 18],
[7, 16, 27]]
I tried to implement this with a for loop plus cblas_dscal / vdSub, but I don't think this is efficient. Is there a better implementation?
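For reference, the two operations above can be written as a minimal sketch in plain C, with no MKL dependency. The function names broadcast_sub_rows and broadcast_mul_rows are hypothetical, and the matrix is assumed to be stored row-major in a contiguous array. In the loop approach described above, each inner loop is the place where a per-row vdSub or vdMul call would go; alternatively, cblas_dscal with incx equal to the number of columns can scale one column in place.

```c
#include <assert.h>
#include <stddef.h>

/* out[i][j] = a[i][j] - v[j] for a row-major rows x cols matrix a.
   Each row is one length-cols vector operation; with MKL, the inner
   loop could be replaced by vdSub(cols, a_row, v, out_row). */
void broadcast_sub_rows(size_t rows, size_t cols,
                        const double *a, const double *v, double *out)
{
    assert(a != NULL && v != NULL && out != NULL);
    for (size_t i = 0; i < rows; ++i) {
        const double *a_row = a + i * cols;
        double *out_row = out + i * cols;
        for (size_t j = 0; j < cols; ++j)
            out_row[j] = a_row[j] - v[j];
    }
}

/* out[i][j] = a[i][j] * v[j], i.e. the matrix times diag(v).
   With MKL, the inner loop could be replaced by
   vdMul(cols, a_row, v, out_row). */
void broadcast_mul_rows(size_t rows, size_t cols,
                        const double *a, const double *v, double *out)
{
    assert(a != NULL && v != NULL && out != NULL);
    for (size_t i = 0; i < rows; ++i) {
        const double *a_row = a + i * cols;
        double *out_row = out + i * cols;
        for (size_t j = 0; j < cols; ++j)
            out_row[j] = a_row[j] * v[j];
    }
}
```

Calling either function with rows = 3, cols = 3 and the arrays from the examples above reproduces the results shown. Whether a vector call per row beats a simple loop that the compiler can vectorize depends on the row length, so it is worth measuring both.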
You are right: you can use a loop with cblas_dscal / vdSub, but this approach will not be efficient. The current version of MKL does not provide this functionality.
If you run into performance problems when using NumPy, SciPy, etc., our recommendation is to try an optimized version of Python in which many math operations use the Intel Performance Libraries (IPP, MKL, and DAAL) as a backend.