may_ka

Beginner

02-04-2019
11:15 PM

33 Views

mkl/blas routine for C=AA'B

Hi there,

Is there any MKL/BLAS function that performs the operation C=AA'B in one go? Currently I use an intermediate array T and dgemm: T=A'B; C=AT'. I am wondering whether there is a more efficient way, since A is always the same matrix.

Thanks.

2 Replies

mecej4

Black Belt

02-05-2019
04:51 AM

Did you mean to write C = AT rather than C = AT' ?

If A is "always the same", you could form T = AA' just once, using ?GEMM. Then, do the ?GEMM operation C = TB for each B as you go.
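A minimal sketch of this suggestion, written in pure Python so that it runs without MKL. The naive `matmul` helper is a hypothetical stand-in for a ?GEMM call, and the small matrices are made-up illustration data, not anything from the original problem.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Naive matrix product; stands in for a ?GEMM call in this sketch."""
    Yt = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]

# A is fixed (3 x 2 here, purely illustrative).
A = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, -1.0]]

# Form G = A A' once (3 x 3, symmetric).
G = matmul(A, transpose(A))

# Reuse G for every new B (each B has as many rows as A).
Bs = [
    [[1.0], [2.0], [3.0]],
    [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]],
]
Cs = [matmul(G, B) for B in Bs]

# Cross-check against the two-step route T = A'B, C = A T.
Cs_two_step = [matmul(A, matmul(transpose(A), B)) for B in Bs]
print(Cs == Cs_two_step)
```

Whether this pays off depends on the shape of A: G has as many rows and columns as A has rows, so it is only attractive when that dimension is small.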

A more useful answer could be given if you describe how C is going to be used. Are the matrices square?

may_ka

Beginner

02-05-2019
12:33 PM

Hi,

Thanks for the response.

I may have mixed up the dimensions/transpositions, but in essence T is such that it can serve as an intermediate matrix.
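For reference, the shapes can be written out as follows. The symbols m, k and n are assumptions introduced here for illustration (A with m rows and k columns, B with m rows and n columns); they do not appear in the original posts.

```latex
A \in \mathbb{R}^{m \times k}, \qquad B \in \mathbb{R}^{m \times n}
\\
T = A^{\top} B \in \mathbb{R}^{k \times n}
\\
C = A\,T = A A^{\top} B \in \mathbb{R}^{m \times n}
```

With these shapes, C = AT has the same dimensions as B, whereas the product AT' is only defined when n = k.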

A is always the same. Generating AA' would simply be a rank-k update (?syrk) yielding a symmetric matrix. However, forming AA' is not feasible because of its dimensions: A can be anything, but its row dimension can be several hundred thousand, whereas its column dimension may be only several tens of thousands. The same holds for B.

The whole system is actually a nice example of a situation where T=A'B and subsequently C=AT' is much faster (even when done repeatedly) than forming T=AA' once (if feasible) and doing C=TB.
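A rough operation count illustrates why. The sketch below assumes A is m x k and B is m x n, counts 2 flops per multiply-add in a dense product, and plugs in hypothetical sizes (m = 300,000; k = n = 30,000) chosen only to match the orders of magnitude described above. It ignores cache effects and the one-off cost of forming AA'.

```python
def two_step_flops(m, k, n):
    """Flops for T = A'B (k x n) followed by C = A T (m x n)."""
    return 2 * k * m * n + 2 * m * k * n

def gram_flops_per_b(m, n):
    """Flops for C = G B once G = A A' (m x m) is available."""
    return 2 * m * m * n

def gram_bytes(m, bytes_per_elem=8):
    """Storage for the full m x m matrix A A' in double precision."""
    return m * m * bytes_per_elem

# Hypothetical sizes, for illustration only.
m, k, n = 300_000, 30_000, 30_000

ratio = gram_flops_per_b(m, n) / two_step_flops(m, k, n)
gram_gb = gram_bytes(m) / 1e9

print(f"C = (AA')B costs {ratio:.1f}x the flops of the two-step route per B")
print(f"AA' alone would need about {gram_gb:.0f} GB")
```

Under these assumptions the ratio reduces to m / (2k), so the two-step route wins whenever A has more than twice as many rows as columns.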

C must have the same dimensions as matrix B. It is neither square nor symmetric.

My thought was that, since A is the same in both operations and C must have the same dimensions as B, there might be a more efficient way than putting A through the CPU twice.

Cheers
