Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.
7222 Discussions

Matrix - Vector Matrix operation A_ij = alpha B_ik c_k D_kj + beta A_ij?

gullcphys_ethz_ch
710 Views
Hi,

I'm looking for an efficient way of computing the matrix expression

A_ij = alpha B_ik c_k D_kj + beta A_ij

...or its simpler version:
A_ij = B_ik c_k D_kj + A_ij

were alpha, beta are constants, A,B,D are matrices, and c is a vector. Summation over repeated indices is implied.

It does look like it should be very easy to modify a dgemm kernel to do this operation, however by applying only blas calls one is stuck by either first scaling D_kj with c_k, or B_ik with c_k, or hand-writing a loop to do this computation.

As this is already the second time I see this operation appear. So I was wondering: am I missing something obvious, or does such a function really not exist?

Best,
Emanuel
0 Kudos
1 Reply
Sergey_K_Intel2
Employee
710 Views
Hi Emanuel,

I see an efficient way is that you mentioned yet - first scaling D_kj with c_k, or B_ik with c_k and than to call DGEMM with A, B and D
You could also use blas call for this scaling
B scaling will be more efficient in case of column-major order of your matrix

do ii=1, num_columns_of_B
call dscal(num_rows_of_B,B(1,ii),c(ii),1)
enddo
In row-major case it seems more effective to do D scaling.

Since scaling operation is ~n^2 ops but our full operation is ~n^3 ops the overhead isnt supposed to be significant.

As to It does look like it should be very easy to modify a dgemm kernel to do this operation ,
I dont think so because
the loop by k
temp += B_ik c_k D_kj
involves k additional multiplies against
temp += B_ik D_kj

So if we did implementation of this operation we would do roughly the same scaling and than matrix-matrix multiplication.

Best regards,
Sergey

0 Kudos
Reply