Solved: Threading with continuation passing

adwaitrj · ‎01-10-2009

I programmed the fibonacci series using TBB with and without continuation passing. But I found that the version with continuation passing was not performing faster than the one without. I have attached the code. Has any one else got this behaviour?

ARCH_R_Intel · ‎01-29-2009

Quoting - Raf Schietekat

Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).

Yes, the continuation passing style alone doesn't save anything. It's really adding the bypassing and recycling that helps, and those generally require continuation passing. I extended the example to use variants that do bypassing and bypass+recycling. Here are the results got on an 8 core Linux box using icc 11.0, with CutOff set to 4 to exaggerate overheads:

[cpp]Fibonacci PARALLEL         40 numbers. Sum is : 102334155 in  1.832190 secs
Fibonacci PARALLEL CONT    40 numbers. Sum is : 102334155 in  2.358423 secs
Fibonacci PARALLEL BYPASS  40 numbers. Sum is : 102334155 in  1.753468 secs
Fibonacci PARALLEL RECYCLE 40 numbers. Sum is : 102334155 in  1.71276 secs
[/cpp]

I've attached my variant to this posting as test_fib.cpp.

.

View solution in original post

ARCH_R_Intel · ‎01-12-2009

The attachment seems to be missing. Please try reposting it. [The attachment mechanics for this forum have given me grief too - if someone can post instructions that work, please do so!]

Rob_Ottenhoff · ‎01-13-2009

Quoting - Arch Robison (Intel)

The attachment seems to be missing. Please try reposting it. [The attachment mechanics for this forum have given me grief too - if someone can post instructions that work, please do so!]

Hi,

There is an instruction post on the IPP forum: http://software.intel.com/en-us/forums/showthread.php?t=62285. I suppose, sorry if it doesn't, it will work here too.

Regards,

Rob

adwaitrj · ‎01-18-2009

cont_fib.cppQuoting - Rob Ottenhoff

Hi,

There is an instruction post on the IPP forum: http://software.intel.com/en-us/forums/showthread.php?t=62285. I suppose, sorry if it doesn't, it will work here too.

Regards,

Rob

Hello All,

I have attached the file to the editor. Its name is cont_fib.cpp . Its appears as a blue hyperlink. Can anyone see this other than me?

cont_fib.cpp

-Adwait

adwaitrj · ‎01-27-2009

Quoting - adwaitrj

Hello All,

I have attached the file to the editor. Its name is cont_fib.cpp . Its appears as a blue hyperlink. Can anyone see this other than me?

cont_fib.cpp

-Adwait

I am trying again to attach the file. Please tell me if you can see it.

cont_fib.cpp

Thanks.
Adwait

RafSchietekat · ‎01-27-2009

Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).

ARCH_R_Intel · ‎01-29-2009

Quoting - Raf Schietekat

Yes, it's the same file. Use of continuations is not about performance, actually: you're throwing away a perfectly good task and allocating a new one instead; I have not looked at the different things the scheduler is doing, but even there it does not appear, at first sight, to be cheaper to have to choose the next task than to simply return from wait_for_all(). Try again with recycle_as_continuation() (and report the timing results).

Yes, the continuation passing style alone doesn't save anything. It's really adding the bypassing and recycling that helps, and those generally require continuation passing. I extended the example to use variants that do bypassing and bypass+recycling. Here are the results got on an 8 core Linux box using icc 11.0, with CutOff set to 4 to exaggerate overheads:

[cpp]Fibonacci PARALLEL         40 numbers. Sum is : 102334155 in  1.832190 secs
Fibonacci PARALLEL CONT    40 numbers. Sum is : 102334155 in  2.358423 secs
Fibonacci PARALLEL BYPASS  40 numbers. Sum is : 102334155 in  1.753468 secs
Fibonacci PARALLEL RECYCLE 40 numbers. Sum is : 102334155 in  1.71276 secs
[/cpp]

I've attached my variant to this posting as test_fib.cpp.

.

RafSchietekat · ‎01-29-2009

And recycle_as_continuation()?

adwaitrj · ‎02-01-2009

Quoting - Raf Schietekat

And recycle_as_continuation()?

Thanks to Arch Robison and Raf Schietekat.

I have realised what I was missing in my code. As I understand, I had simply had a continuation-passing style code without any attempt to benefit from the continuation-passing. I believe, this techniques yields much more control over execution order of functions in a recursuve descent of calls. I stand to benefit a lot if I can control the sequence of execution of the recursive function such that it aids Spatial/temporal locality. Added advantage of reduced overhead by recycling/bypass will be soothing. Once again thanks. Your answers have really help me understand the topic.

Regards,
Adwait