- Intel Community
- Software
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library
- Sparse-Sparse Matrix Multiplication


Vineet_Y_

Beginner


12-09-2014
11:28 AM

131 Views

Sparse-Sparse Matrix Multiplication

Hi

I have used mkl_dcsrmultcsr in my research. However, it performs a double pass to compute a sparse*sparse matrix product. For small problems this is not an issue, but for large problems (e.g., matrices of size ½ billion by ½ billion) it is time consuming, and it would be better if MKL could do this multiplication in a single pass in a parallel setting.

Most (if not all) sparse*sparse CSR matrix multiplication algorithms use Gustavson's algorithm (ACM TOMS, 1978), and there is no reason why this algorithm cannot be parallelized to do the calculation in a single pass. I understand that the performance of a single-pass parallelization would depend on pre-allocating space for the non-zero values, which I think can be estimated reasonably in most situations; even if the estimate turns out to be wrong, the algorithm should be able to grow the buffer as required.

Similarly, it would be useful to compute only the lower/upper triangular portion of the output matrix (of course, the output matrix has to be symmetric).

Application domains: statistics, PDEs, inverse problems, weather prediction.

Thanks

Vineet


5 Replies

Gennady_F_Intel

Moderator


12-12-2014
12:22 AM


Vineet,

The current API doesn't allow that, but we are considering such an option for the next API, which we are working on right now.

--Gennady

Vineet_Y_

Beginner


12-17-2014
04:07 PM


Hi

Gennady

After spending a considerable amount of time last week, I was able to come up with my own single-pass parallel solution, which is roughly 30 to 35% faster than mkl_dcsrmultcsr. It also works in a hybrid setup (MPI+OpenMP) and returns the output matrix in sorted form. Any help (if possible; no obligation) in optimizing the attached code would be greatly appreciated, as for large matrices it can save days of work.

Vineet

Gennady_F_Intel

Moderator


12-17-2014
09:08 PM


Vineet_Y_

Beginner


12-18-2014
09:55 AM


Hi

Gennady

I have tested the code with a 1070*1000000 matrix (nzmax = 85534075) multiplied by a 1000000*1070 matrix (nzmax = 85534075) on the following machines (I do not know their generation info):

- 2.8 GHz Intel Core i7, 16 GB RAM
- 3.4 GHz Intel Core i7, 16 GB RAM
- 2.80 GHz Intel(R) Xeon, 96 GB RAM
- 2.4 GHz Intel Core i7, 8 GB RAM

You can test the code on your own matrices. I still have to refactor the code.

Vineet

Gennady_F_Intel

Moderator


12-19-2014
03:24 AM


Thanks, Vineet. Your suggestion looks reasonable to implement. I will bring this request to the MKL release board for consideration, and maybe for implementation in one of the future versions.

regards, Gennady

