rudaho
Beginner
36 Views

How to measure the overhead to decide whether parallelizing existing code is worthwhile?

Hello~

I used OpenMP to parallelize my sparse matrix solver. The idea is to parallelize each column operation. However, not all matrices show a performance improvement. In fact, only one matrix, larger than 25000x25000, obtains an improvement. I got much worse performance (two to three times longer) for most of the sparse matrices. I've tried dense matrices, and the performance gets better there.

I guess this is because there are few operations per column, since the matrix is sparse. Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell when the existing code for a specific application is not worth parallelizing? Thanks...

Best Regards

Yi-Ju
4 Replies
Dmitry_Vyukov
Valued Contributor I

Quoting rudaho
Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell when the existing code for a specific application is not worth parallelizing? Thanks...


The simplest and most reliable (and probably the fastest) method is to parallelize the code and see the results.

jimdempseyatthecove
Black Belt

I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.

Jim
kalloyd
Beginner

Quoting jimdempseyatthecove
I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.

Jim,

Truer words were never spoken. If I may add, many subsequent attempts may yield an improvement, no improvement, or even a decrease in performance. However, by going through the process, you will certainly learn a lot about performance and parallelization.

There are geometries (topologies) regarding combinations of parallel and serial execution graphs.

Ken
jimdempseyatthecove
Black Belt

>>There are geometries (topologies) regarding combinations of parallel and serial execution graphs.

The underlying algorithms have a lot to do with performance too. I will be addressing this in my upcoming ISN Blogs posting. Good algorithms for tough problems take into consideration how the cache(s) are distributed about the system.

Jim Dempsey