
Why the spawn helper cannot be inlined

Yonghong_Y_
Beginner

I am reading the ABI doc and have also gotten fib working with cilk_fake.h, using only the runtime. One question I have concerns the requirement for a helper function for each spawn. Since a Cilk Plus spawn is basically a function call, why can't the helper function be inlined? Is this to deal with stack/frame pointer management or stack memory management (e.g., we cannot surround a spawn with a for/while loop if each spawn needs its own dedicated stack space)? Any insightful comments to satisfy my curiosity would be appreciated ;-)

Thanks

yanyh

3 Replies
Pablo_H_Intel
Employee

Hi yanyh,

You have it exactly right. The child function must run on a different stack from the continuation of the parent function if they are to run concurrently. Therefore, it is critical that the spawn helper start a new stack frame so that, on a steal, there is a clean place to separate the two stacks. In Cilk's continuation-stealing scheduler, the child function always runs on the original stack frame, and a steal grafts a new stack onto the caller's stack frame, but the principle would be the same for a child-stealing scheduler.

Note that, although the spawn helper must not be inlined, there is no reason why the spawned function cannot be inlined into the spawn helper.
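To make the shape of a spawn helper concrete, here is a minimal serial sketch in plain C. It is an illustration under assumptions, not the code the compiler actually emits: the name fib_spawn_helper is hypothetical, the noinline attribute is the GCC/Clang spelling, and all runtime bookkeeping (frame setup, detach, sync) is left out and only marked in comments.

```c
/* Serial sketch only: no Cilk runtime calls are made here. */
int fib(int n);

/* Hypothetical spawn helper. It is kept out of line so that every spawn
   enters a fresh stack frame, separate from the parent's frame.
   __attribute__((noinline)) is GCC/Clang syntax (an assumption about the
   toolchain). */
__attribute__((noinline))
void fib_spawn_helper(int *result, int n)
{
    /* A real helper would set up runtime frame state and detach here. */
    *result = fib(n);   /* the spawned function may be inlined into the helper */
}

int fib(int n)
{
    int x, y;

    if (n < 2)
        return n;

    fib_spawn_helper(&x, n - 1);  /* stands in for: x = cilk_spawn fib(n - 1); */
    y = fib(n - 2);               /* the continuation, which a thief could steal */
    /* cilk_sync would go here, before x is read */
    return x + y;
}
```

Calling fib(10) from a main function runs the same computation serially. The point of the sketch is only the call structure: the parent's frame holds x and y, while each spawn's temporaries live in the helper's own frame.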

Pablo

Yonghong_Y_
Beginner

Hi Pablo,

Thanks for the quick reply. For continuation stealing, since the steal grafts a new stack for the continuation, the stacks are already separated. The helper simply goes one step deeper in the call chain, so why does that make such a difference?

If I wanted to eliminate this constraint in the runtime, how complicated would it be, and where should I make changes (for the purpose of insane optimization ;-))? Removing it might enable a new feature: spawning from any expression or statement (like an OpenMP task) without needing the compiler to outline the task body into a new function.

Thanks

Yonghong

Pablo_H_Intel
Employee

The compiler does not allocate a new stack frame for inline functions. Nor does it, in general, modify the stack pointer for nested blocks within a function. Rather, the stack frame for all blocks and inline calls is allocated once in the function prologue. Any two variables that the compiler back-end determines cannot be alive simultaneously can be allocated in the same storage within the stack frame. Within a loop, the compiler relies on the assumption that one iteration completes before the next iteration starts. It re-uses the same stack space for the body of all iterations.

If a spawn were to occur within a loop and the spawn helper were inlined, then all of the invocations of the spawn helper would use the same stack space (allocated when the parent function started up) and would, therefore, race with one another. It is not even necessary for the spawn to be in a loop or for there to be more than one spawn. For example, in the following code, the compiler could easily assume that it is OK to use the same storage for variable x as for the stack frame of the (inlined) spawn helper, again resulting in a race between two concurrent uses of the same memory:

cilk_spawn xyz();
if (cond) {
    int x;
}

By outlining the spawn helper, the compiler forces each call to allocate its stack frame outside of the parent function's frame.
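The stack-slot reuse described above can be observed with a small plain C experiment. This is a hedged sketch rather than Cilk code: the function name distinct_loop_slots is hypothetical, and the C standard makes no promise about where a loop-body local is placed, so the function only reports what the compiler at hand chose to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical demo: counts how many distinct addresses a loop-body local
   occupies across n iterations (n is capped at 16). */
size_t distinct_loop_slots(size_t n)
{
    uintptr_t seen[16];
    size_t count = 0;

    if (n > 16)
        n = 16;

    for (size_t i = 0; i < n; ++i) {
        /* Local to this iteration; volatile keeps it in memory. */
        volatile int scratch = (int)i;
        uintptr_t addr = (uintptr_t)(volatile void *)&scratch;
        size_t j;

        /* Record the address if it has not been seen before. */
        for (j = 0; j < count; ++j)
            if (seen[j] == addr)
                break;
        if (j == count)
            seen[count++] = addr;
    }
    return count;
}
```

With typical GCC or Clang builds one would expect distinct_loop_slots(8) to return 1, meaning every iteration used the same slot in the enclosing frame. That is harmless when iterations run one after another, but it is exactly the storage that concurrently running inlined spawn helpers would share.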

It is theoretically possible to inline some spawn helpers by modifying the liveness computation for variables. Such an optimization might not be worth it, considering that it requires changing the structure of the compiler to include additional communication between the front-end parser and the back-end code generator.

Pablo
