- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I'm looking for an efficient way of computing the matrix expression
A_ij = alpha B_ik c_k D_kj + beta A_ij
...or its simpler version:
A_ij = B_ik c_k D_kj + A_ij
were alpha, beta are constants, A,B,D are matrices, and c is a vector. Summation over repeated indices is implied.
It does look like it should be very easy to modify a dgemm kernel to do this operation, however by applying only blas calls one is stuck by either first scaling D_kj with c_k, or B_ik with c_k, or hand-writing a loop to do this computation.
As this is already the second time I see this operation appear. So I was wondering: am I missing something obvious, or does such a function really not exist?
Best,
Emanuel
I'm looking for an efficient way of computing the matrix expression
A_ij = alpha B_ik c_k D_kj + beta A_ij
...or its simpler version:
A_ij = B_ik c_k D_kj + A_ij
were alpha, beta are constants, A,B,D are matrices, and c is a vector. Summation over repeated indices is implied.
It does look like it should be very easy to modify a dgemm kernel to do this operation, however by applying only blas calls one is stuck by either first scaling D_kj with c_k, or B_ik with c_k, or hand-writing a loop to do this computation.
As this is already the second time I see this operation appear. So I was wondering: am I missing something obvious, or does such a function really not exist?
Best,
Emanuel
Link Copied
1 Reply
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Emanuel,
I see an efficient way is that you mentioned yet - first scaling D_kj with c_k, or B_ik with c_k and than to call DGEMM with A, B and D
You could also use blas call for this scaling
B scaling will be more efficient in case of column-major order of your matrix
do ii=1, num_columns_of_B
call dscal(num_rows_of_B,B(1,ii),c(ii),1)
enddo
In row-major case it seems more effective to do D scaling.
Since scaling operation is ~n^2 ops but our full operation is ~n^3 ops the overhead isnt supposed to be significant.
As to It does look like it should be very easy to modify a dgemm kernel to do this operation ,
I dont think so because
the loop by k
temp += B_ik c_k D_kj
involves k additional multiplies against
temp += B_ik D_kj
So if we did implementation of this operation we would do roughly the same scaling and than matrix-matrix multiplication.
Best regards,
Sergey
I see an efficient way is that you mentioned yet - first scaling D_kj with c_k, or B_ik with c_k and than to call DGEMM with A, B and D
You could also use blas call for this scaling
B scaling will be more efficient in case of column-major order of your matrix
do ii=1, num_columns_of_B
call dscal(num_rows_of_B,B(1,ii),c(ii),1)
enddo
In row-major case it seems more effective to do D scaling.
Since scaling operation is ~n^2 ops but our full operation is ~n^3 ops the overhead isnt supposed to be significant.
As to It does look like it should be very easy to modify a dgemm kernel to do this operation ,
I dont think so because
the loop by k
temp += B_ik c_k D_kj
involves k additional multiplies against
temp += B_ik D_kj
So if we did implementation of this operation we would do roughly the same scaling and than matrix-matrix multiplication.
Best regards,
Sergey

Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page