Hello~
I used OpenMP to parallelize my sparse matrix solver. The idea is to parallelize each column operation. However, not every matrix shows a performance improvement. In fact, only one matrix, larger than 25000x25000, improves. For most of the sparse matrices I got much worse performance (two to three times longer). When I tried dense matrices, the performance got better.
I guess this is because each column involves only a few operations, since the matrix is sparse. Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell when the existing code for a specific application is not worth parallelizing? Thanks...
Best Regards
Yi-Ju
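For context, here is a sketch of the kind of column kernel I mean, together with a wall-clock helper for timing it. This is a simplified CSC-style version (the `col_ptr`/`values` names are illustrative, not my actual solver):

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock helper: omp_get_wtime() under OpenMP, clock() otherwise. */
static double wall_time(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Scale every stored entry, column by column (CSC layout).
   The pragma is ignored when compiled without -fopenmp, so the same
   function can be timed both serially and in parallel. */
void scale_columns(int ncols, const int *col_ptr, double *values, double f)
{
    #pragma omp parallel for schedule(static)
    for (int j = 0; j < ncols; ++j)
        for (int k = col_ptr[j]; k < col_ptr[j + 1]; ++k)
            values[k] *= f;
}
```

Calling `wall_time()` around `scale_columns()` in builds with and without -fopenmp gives a rough estimate of how much of the runtime is thread start-up and scheduling overhead rather than actual column work.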
4 Replies
Quoting rudaho
Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell when the existing code for a specific application is not worth parallelizing? Thanks...
The simplest and most reliable (and probably the fastest) method is to parallelize the code and see the results.
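For example, when small inputs get slower, one common experiment is to make the parallel region conditional on the problem size. A minimal sketch (the 10000-column threshold is a placeholder to tune on your machine, not a recommended value):

```c
/* Column scale over a CSC matrix that stays serial for small problems.
   OpenMP's if clause skips thread creation when the condition is false,
   so small matrices avoid the parallel-region overhead entirely. */
void scale_columns_guarded(int ncols, const int *col_ptr,
                           double *values, double f)
{
    #pragma omp parallel for schedule(static) if (ncols > 10000)
    for (int j = 0; j < ncols; ++j)
        for (int k = col_ptr[j]; k < col_ptr[j + 1]; ++k)
            values[k] *= f;
}
```

Timing this with the threshold raised and lowered shows directly at what size the parallel version starts to win.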
I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.
Jim
Quoting jimdempseyatthecove
I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.
Jim
Truer words were never spoken. If I may add, many subsequent attempts may yield improvement, no improvement, or even a decrease in performance. However, by going through the process, you will certainly learn a lot about performance and parallelization.
There are geometries (topologies) regarding combinations of parallel and serial execution graphs.
Ken
>>There are geometries (topologies) regarding combinations of parallel and serial execution graphs.
The underlying algorithms have a lot to do with performance too. I will be addressing this in my upcoming ISN Blogs posting. Good algorithms for tough problems take into consideration how the cache(s) are distributed about the system.
Jim Dempsey
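To illustrate the cache point with a sketch (not taken from the blog post, and the CSC names are illustrative): a column-parallel sweep of y = A*x scatters writes into a shared output vector, so threads both race and bounce the same cache lines between cores. Giving each thread a private accumulator keeps its writes in its own cache:

```c
#include <stdlib.h>
#include <string.h>

/* Sparse matrix-vector product in CSC layout.
   Each thread accumulates into its own local vector, then the partial
   results are merged; without the private copies, the row_idx scatter
   would race on y and cause heavy cache-line contention. */
void spmv_csc(int nrows, int ncols, const int *col_ptr, const int *row_idx,
              const double *values, const double *x, double *y)
{
    memset(y, 0, sizeof(double) * (size_t)nrows);
    #pragma omp parallel
    {
        /* per-thread accumulator (zero-initialized) */
        double *local = calloc((size_t)nrows, sizeof(double));
        #pragma omp for schedule(static)
        for (int j = 0; j < ncols; ++j)
            for (int k = col_ptr[j]; k < col_ptr[j + 1]; ++k)
                local[row_idx[k]] += values[k] * x[j];
        /* merge partial results one thread at a time */
        #pragma omp critical
        for (int i = 0; i < nrows; ++i)
            y[i] += local[i];
        free(local);
    }
}
```

Compiled without -fopenmp the pragmas are ignored and the function runs serially, which also makes a handy correctness baseline.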
