Bowen_M_

Beginner


12-12-2014
12:03 PM


matrix multiplication speedup

I'm using cblas_dgemm to compute a matrix product. For a randomly generated matrix X of size N * N (N could be 100), I calculate Y = X^T * X (X^T is the transpose of X). I can do it in two ways: (1) using cblas_dgemm to calculate Y directly, or (2) using a for loop, for i = 1:N, Y += X_i * X_i^T, where X_i is the i_th column of X.

In theory, both approaches have the same complexity of N^3. In practice, however, approach (2) can take about 4 times longer than (1). Could you help me understand why?
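To make the comparison concrete, here is a minimal plain-C sketch of the two formulations, without any BLAS calls. The function names `gram_direct` and `gram_rank1` and the row-major layout are assumptions made for illustration, not code from the original post. With this layout, X^T * X equals the sum of outer products of the rows of X (summing outer products of the columns would give X * X^T instead), so the sketch accumulates over rows. Both functions perform N^3 multiply-adds. The difference is that each rank-1 update reads and writes all N * N entries of Y while doing only N * N multiply-adds, so the loop version tends to be limited by memory traffic and per-call overhead rather than by arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Approach (1): Y = X^T * X in a single pass. X and Y are n x n, row-major. */
void gram_direct(size_t n, const double *x, double *y)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += x[k * n + i] * x[k * n + j]; /* X^T[i][k] * X[k][j] */
            y[i * n + j] = sum;
        }
    }
}

/* Approach (2): Y = sum over k of r_k^T * r_k, where r_k is row k of X. */
void gram_rank1(size_t n, const double *x, double *y)
{
    assert(n > 0);
    for (size_t i = 0; i < n * n; ++i)
        y[i] = 0.0;
    for (size_t k = 0; k < n; ++k) {
        const double *r = x + k * n; /* row k of X */
        /* One rank-1 update: touches every entry of Y for n*n multiply-adds. */
        for (size_t i = 0; i < n; ++i)
            for (size_t j = 0; j < n; ++j)
                y[i * n + j] += r[i] * r[j];
    }
}
```

In CBLAS terms, approach (1) corresponds to one cblas_dgemm call, while approach (2) corresponds to N separate rank-1 updates (for example through cblas_dger, or through cblas_dgemm with an inner dimension of 1), each of which gives the library very little work to block or reuse in cache.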

Thanks


2 Replies

VipinKumar_E_Intel

Employee


12-14-2014
08:10 PM


In the first case, calling dgemm directly, multiple optimization techniques are used, including loop reordering, loop unrolling, subdividing the matrices into blocks, vectorization, and parallelization. These help to keep frequently used data in cache, reduce branch instructions, and exploit DLP (data-level parallelism) and TLP (thread-level parallelism). Many other optimizations are also applied in various MKL routines.
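As an illustration of just one of these techniques, subdividing into blocks, here is a minimal plain-C sketch. It is not MKL's implementation. The function name `matmul_blocked` and the tile-size parameter `bs` are hypothetical, and a real dgemm combines blocking with the other optimizations listed above.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B for n x n row-major matrices, processed in bs x bs tiles so that
   the tiles being combined are more likely to stay in cache while reused. */
void matmul_blocked(size_t n, size_t bs,
                    const double *a, const double *b, double *c)
{
    assert(n > 0 && bs > 0);
    for (size_t i = 0; i < n * n; ++i)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += bs) {
        for (size_t kk = 0; kk < n; kk += bs) {
            for (size_t jj = 0; jj < n; jj += bs) {
                /* Clamp tile edges when n is not a multiple of bs. */
                size_t i_end = (ii + bs < n) ? ii + bs : n;
                size_t k_end = (kk + bs < n) ? kk + bs : n;
                size_t j_end = (jj + bs < n) ? jj + bs : n;

                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        double aik = a[i * n + k];
                        /* Innermost loop walks contiguous memory in B and C. */
                        for (size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

A suitable value of `bs` depends on the cache sizes of the target machine. The loop of rank-1 updates in approach (2) cannot benefit from this kind of reuse, because each update is finished before the next one starts.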

--Vipin

TimP

Black Belt


12-15-2014
08:58 AM

