Hello,
I tried to parallelise a function with Cilk Plus (the function is basically a periodic convolution with a transposition).
The function has 3 nested "for" loops. In a first implementation I only changed the "for" to "cilk_for". I tried changing only the outermost loop, or the two outermost loops, but performance did not change. The function is "convSerial_cilk", printed at the end of this post. The iteration space can be large (the outermost loop iterates from 0 to 20000).
Because performance was poor, I tried the "cilkview" tool (from the SDK).
I call my function like this (with the cilkview API to profile my code):
[cpp]
cilkview_data_t d;
__cilkview_query(d);
convSerial_cilk(num_elements_dim1*num_elements_dim3,num_elements_dim2,out_data_cilk,h_data,f_data,fSIZE);
__cilkview_report(&d, NULL, "main_tag", CV_REPORT_WRITE_TO_RESULTS);
[/cpp]
I get these results:
Whole Program Statistics
1) Parallelism Profile
Work : 3,280,552,525 instructions
Span : 1,512,348,513 instructions
Burdened span : 1,513,138,473 instructions
Parallelism : 2.17
Burdened parallelism : 2.17
Number of spawns/syncs: 84,500
Average instructions / strand : 12,940
Strands along span : 65
Average instructions / strand on span : 23,266,900
Total number of atomic instructions : 84,506
Frame count : 169,000
2) Speedup Estimate
2 processors: 1.12 - 2.00
4 processors: 1.19 - 2.17
8 processors: 1.23 - 2.17
16 processors: 1.25 - 2.17
32 processors: 1.26 - 2.17
64 processors: 1.27 - 2.17
128 processors: 1.27 - 2.17
256 processors: 1.27 - 2.17
Cilk Parallel Region(s) Statistics - Elapsed time: 7.392 seconds
1) Parallelism Profile
Work : 1,768,253,582 instructions
Span : 49,570 instructions
Burdened span : 839,530 instructions
Parallelism : 35671.85
Burdened parallelism : 2106.24
Number of spawns/syncs: 84,500
Average instructions / strand : 6,975
Strands along span : 32
Average instructions / strand on span : 1,549
Total number of atomic instructions : 84,506
Frame count : 169,000
Entries to parallel region : 2
2) Speedup Estimate
2 processors: 1.90 - 2.00
4 processors: 3.80 - 4.00
8 processors: 7.60 - 8.00
16 processors: 15.20 - 16.00
32 processors: 30.40 - 32.00
64 processors: 60.80 - 64.00
128 processors: 116.10 - 128.00
256 processors: 212.30 - 256.00
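For reading the report above: the "Parallelism" lines are work divided by span, and the upper end of each speedup range matches the smaller of the worker count and that parallelism. A minimal sketch of this arithmetic follows; the function names `parallelism` and `speedup_upper_bound` are hypothetical helpers for illustration, not part of the cilkview API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Parallelism as listed in the report: work / span, both in instructions.
// (Hypothetical helper, not a cilkview function.)
double parallelism(std::uint64_t work, std::uint64_t span) {
    assert(span > 0);
    return static_cast<double>(work) / static_cast<double>(span);
}

// Upper end of the speedup range: speedup cannot exceed the number of
// workers, nor the available parallelism. (Hypothetical helper.)
double speedup_upper_bound(std::uint64_t work, std::uint64_t span, unsigned workers) {
    return std::min(static_cast<double>(workers), parallelism(work, span));
}
```

With the parallel-region figures, `parallelism(1768253582, 49570)` gives about 35671.85 and, using the burdened span, `parallelism(1768253582, 839530)` gives about 2106.24. With the whole-program figures, `parallelism(3280552525, 1512348513)` gives about 2.17, which is why the whole-program estimate is capped at 2.17 however many workers are used.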
In the Cilk-specific part, cilkview indicates that I can expect good performance. Nevertheless, the Cilk version of my function is slower than the sequential one! Furthermore, increasing the number of workers has no effect on the performance of my function.
With cilkview, I generated a plot (attached to this post) from a run on a dual Xeon E5-2670, where I can use up to 16 CPU cores.
The plot shows that the theoretical speed-up (burdened speed-up) should be good, but the measured speed-up (trials) is very bad.
So why is there such a large difference between the cilkview estimate and my actual measurements? What should I check to improve my Cilk Plus performance?
Thanks,
The function:
[cpp]
void convSerial_cilk(unsigned int n1,unsigned int n2,double *restrict tab_out, double *restrict tab_in,const double *restrict in_f,int nf)
{
    unsigned int mod;
    cilk_for(unsigned int i=0;i < n1;++i)
    {
        for(unsigned int j=0;j < n2;++j)
        {
            double tmp = 0;
            mod = j;
            for(unsigned int k=0 ;k < nf;++k)
            {
                if(mod >= n2)
                    mod = 0;
                tmp += tab_in[i*n2 + mod]*in_f[k];
                ++mod;
            }
            tab_out[j*n1 + i] = tmp;
        }
    }
}
[/cpp]
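One detail worth noting in the function above: `mod` is declared outside the `cilk_for`, so every parallel iteration reads and writes the same variable. For comparing results, here is a minimal serial sketch of the same loop nest with the wrap-around index kept local to each (i, j) iteration. It is written in plain C++ with an ordinary `for`, so it builds without Cilk Plus; the name `conv_reference` and the use of `std::vector` are assumptions made for illustration and are not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial reference (hypothetical helper): periodic convolution along each
// row of tab_in (n1 rows, n2 columns), stored transposed in the result
// (n2 rows, n1 columns). The filter length is in_f.size().
std::vector<double> conv_reference(std::size_t n1, std::size_t n2,
                                   const std::vector<double>& tab_in,
                                   const std::vector<double>& in_f) {
    assert(tab_in.size() == n1 * n2);
    std::vector<double> tab_out(n1 * n2, 0.0);
    for (std::size_t i = 0; i < n1; ++i) {      // the loop that is cilk_for in the post
        for (std::size_t j = 0; j < n2; ++j) {
            double tmp = 0.0;
            std::size_t mod = j;                // local: not shared between iterations
            for (std::size_t k = 0; k < in_f.size(); ++k) {
                if (mod >= n2)
                    mod = 0;                    // periodic wrap-around
                tmp += tab_in[i * n2 + mod] * in_f[k];
                ++mod;
            }
            tab_out[j * n1 + i] = tmp;          // transposed write
        }
    }
    return tab_out;
}
```

For example, with n1 = 2, n2 = 3, input rows {1, 2, 3} and {4, 5, 6}, and filter {1, 10}, the result is {21, 54, 32, 65, 13, 46}. Comparing the Cilk version against such a reference on a small input is one way to check that the parallel loop computes the same values.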