Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

How to measure the overhead to ensure the parallelization is worth to perform on existing code?

rudaho
Beginner
633 Views
Hello~

I used OpenMP to parallelize my sparse matrix solver. The idea is to parallelize the operations on each column. However, not every matrix shows a performance improvement. In fact, only one matrix, larger than 25000x25000, improved; for most of the sparse matrices the parallel version was much worse (two or three times slower). I have also tried dense matrices, and there the performance does get better.

I guess this is because each column involves only a few operations, since the matrix is sparse. Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell in advance when the existing code for a specific application is not worth parallelizing? Thanks...

Best Regards

Yi-Ju
4 Replies
Dmitry_Vyukov
Valued Contributor I
Quoting rudaho
Are there any suggestions or guides on how to measure the overhead of parallelizing existing code, so that I can tell in advance when the existing code for a specific application is not worth parallelizing? Thanks...


The simplest and most reliable (and probably the fastest) method is to parallelize the code and see the results.

jimdempseyatthecove
Honored Contributor III
I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.

Jim
kalloyd
Beginner
I might add to Dmitriy's comment that the performance change observed in your first attempt at parallelization should not be taken as a measure of what your second attempt at parallelization will produce.

Jim

Jim,

Truer words were never spoken. If I may add, many subsequent attempts may yield an improvement, no improvement, or even a decrease in performance. However, by going through these iterations, you will certainly learn a lot about performance and parallelization.

There are many geometries (topologies) of combined parallel and serial execution graphs.

Ken
jimdempseyatthecove
Honored Contributor III
>>There are geometries (topologies) regarding combinations of parallel and serial execution graphs.

The underlying algorithms have a lot to do with performance too. I will be addressing this in my upcoming ISN Blogs posting. Good algorithms for tough problems take into consideration how the cache(s) are distributed about the system.

Jim Dempsey