Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Threading with continuation passing

adwaitrj
Beginner
768 Views
I programmed the fibonacci series using TBB with and without continuation passing. But I found that the version with continuation passing was not performing faster than the one without. I have attached the code. Has any one else got this behaviour?
0 Kudos
1 Solution
ARCH_R_Intel
Employee
768 Views
Quoting - Raf Schietekat
Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).

Yes, the continuation passing style alone doesn't save anything. It's really adding the bypassing and recycling that helps, and those generally require continuation passing. I extended the example to use variants that do bypassing and bypass+recycling. Here are the results got on an 8 core Linux box using icc 11.0, with CutOff set to 4 to exaggerate overheads:
[cpp]Fibonacci PARALLEL         40 numbers. Sum is : 102334155 in  1.832190 secs
Fibonacci PARALLEL CONT    40 numbers. Sum is : 102334155 in  2.358423 secs
Fibonacci PARALLEL BYPASS  40 numbers. Sum is : 102334155 in  1.753468 secs
Fibonacci PARALLEL RECYCLE 40 numbers. Sum is : 102334155 in  1.71276 secs
[/cpp]

I've attached my variant to this posting as test_fib.cpp.

.

View solution in original post

0 Kudos
8 Replies
ARCH_R_Intel
Employee
768 Views
The attachment seems to be missing. Please try reposting it. [The attachment mechanics for this forum have given me grief too - if someone can post instructions that work, please do so!]
0 Kudos
Rob_Ottenhoff
New Contributor I
768 Views
The attachment seems to be missing. Please try reposting it. [The attachment mechanics for this forum have given me grief too - if someone can post instructions that work, please do so!]
Hi,

There is an instruction post on the IPP forum: http://software.intel.com/en-us/forums/showthread.php?t=62285. I suppose, sorry if it doesn't, it will work here too.

Regards,

Rob
0 Kudos
adwaitrj
Beginner
768 Views
Hi,

There is an instruction post on the IPP forum: http://software.intel.com/en-us/forums/showthread.php?t=62285. I suppose, sorry if it doesn't, it will work here too.

Regards,

Rob



Hello All,

I have attached the file to the editor. Its name is cont_fib.cpp . Its appears as a blue hyperlink. Can anyone see this other than me?

cont_fib.cpp

-Adwait
0 Kudos
adwaitrj
Beginner
768 Views
Quoting - adwaitrj



Hello All,

I have attached the file to the editor. Its name is cont_fib.cpp . Its appears as a blue hyperlink. Can anyone see this other than me?

cont_fib.cpp

-Adwait



I am trying again to attach the file. Please tell me if you can see it.

cont_fib.cpp


Thanks.
Adwait
0 Kudos
RafSchietekat
Valued Contributor III
768 Views
Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).
0 Kudos
ARCH_R_Intel
Employee
769 Views
Quoting - Raf Schietekat
Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).

Yes, the continuation passing style alone doesn't save anything. It's really adding the bypassing and recycling that helps, and those generally require continuation passing. I extended the example to use variants that do bypassing and bypass+recycling. Here are the results got on an 8 core Linux box using icc 11.0, with CutOff set to 4 to exaggerate overheads:
[cpp]Fibonacci PARALLEL         40 numbers. Sum is : 102334155 in  1.832190 secs
Fibonacci PARALLEL CONT    40 numbers. Sum is : 102334155 in  2.358423 secs
Fibonacci PARALLEL BYPASS  40 numbers. Sum is : 102334155 in  1.753468 secs
Fibonacci PARALLEL RECYCLE 40 numbers. Sum is : 102334155 in  1.71276 secs
[/cpp]

I've attached my variant to this posting as test_fib.cpp.

.
0 Kudos
RafSchietekat
Valued Contributor III
768 Views
And recycle_as_continuation()?
0 Kudos
adwaitrj
Beginner
768 Views
Quoting - Raf Schietekat
And recycle_as_continuation()?

Thanks to Arch Robison and Raf Schietekat.

I have realised what I was missing in my code. As I understand, I had simply had a continuation-passing style code without any attempt to benefit from the continuation-passing. I believe, this techniques yields much more control over execution order of functions in a recursuve descent of calls. I stand to benefit a lot if I can control the sequence of execution of the recursive function such that it aids Spatial/temporal locality. Added advantage of reduced overhead by recycling/bypass will be soothing. Once again thanks. Your answers have really help me understand the topic.

Regards,
Adwait
0 Kudos
Reply