
- Intel Community
- Software
- Software Development Topics
- Intel® Moderncode for Parallel Architectures
- How to parallelize the following code by OpenMP?


Zhanghong_T_

Beginner


05-22-2010 05:19 PM

How to parallelize the following code by OpenMP?

The following code computes y = A*x, where A is an n*n sparse matrix stored as a list of nonzero entries:

do i=1,n                ! initialize the result vector
  y(i)=0.0d0
enddo
do i=1,nnz              ! nnz is the number of nonzero entries of A
  ii = irow(i)          ! row index of entry i
  jj = jcol(i)          ! column index of entry i
  y(ii) = y(ii) + A(i)*x(jj)
enddo

Could anyone give me some suggestions on how to parallelize this code with OpenMP?

Thanks,

Zhanghong Tang


3 Replies

Zhanghong_T_

Beginner


05-23-2010 03:20 PM

do i=1,n                ! initialize the result vector
  y(i)=0.0d0
enddo
do i=1,nnz              ! nnz is the number of nonzero entries of A
  ii = irow(i)          ! row index of entry i
  jj = jcol(i)          ! column index of entry i
  y(ii) = y(ii) + A(i)*x(jj)
  if (issymmetric .and. ii/=jj) y(jj) = y(jj) + A(i)*x(ii)
enddo

How can this code be parallelized with OpenMP?

Thanks,

Zhanghong Tang

TimP

Black Belt


05-26-2010 11:32 AM

I suppose you might use such a scheme for packed storage of a symmetric matrix, where the submatrices associated with the various values of i don't overlap. In that case, you might start out with

!$omp parallel do private(ii,jj) schedule(guided)

do i=1,nnz

...

You would likely find that it doesn't scale to all cores on a NUMA machine such as Nehalem or Opteron. Assuming that you keep the storage in a sane order (no skipping around with irow and jcol), you could add a statically scheduled outer loop over the number of available threads, where you determine which range of i each thread should process in order to balance the work.

Zhanghong_T_

Beginner


05-27-2010 01:31 AM

Thank you for your kind reply. I know that sparse matrices can be stored in CSR format and the calculation then parallelized. However, since the MKL function

Thanks,

Zhanghong Tang


For more complete information about compiler optimizations, see our Optimization Notice.