Intel® C++ Compiler

Tail call optimization

subu
Beginner
I'm running the Intel C++ compiler on Debian amd64 with a 2.6 kernel. The compiler fails to apply tail call optimization to the following code:

/*---------------------------*/
#include <stdio.h>

void foo() __attribute__((noinline));
void bar() __attribute__((noinline));

void bar() { printf("f() "); }
void foo() { bar(); }

int main(int argc, char *argv[])
{
foo();
return 0;
}
/*---------------------------*/

gcc 4.2 with -O3 generates the following assembly instructions for foo():

xor %eax,%eax
jmpq 4004a0

and the Intel compiler with -fast generates this:

push %rsi
callq 4002a0
pop %rcx
retq

Am I missing some compiler option here? Can someone please explain this to me?

Thank you.


3 Replies
TimP
Honored Contributor III
I might be missing something here too. In your example, the motivation for disabling the usual optimizations in both compilers by setting __attribute__((noinline)) isn't obvious. If the functions were too big to be inlined, it seems that tail call optimization wouldn't gain much, and it could still hinder profiling. No doubt more compelling cases, at least with tail recursion, could be set up where the special optimizations in gcc would look attractive.
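
As a rough illustration of the tail recursion case mentioned above, here is a minimal sketch. The function name sum_to and its accumulator parameter are hypothetical and not taken from the original post. The recursive call is the last action the function performs, so a compiler that does tail call optimization is free to reuse the current stack frame (in effect turning the recursion into a loop). Whether a given compiler actually does so depends on the compiler and the optimization level.

```cpp
// Hypothetical example: sums 1..n using an accumulator.
// The recursive call is in tail position: nothing remains to be done
// in this frame after it returns, so it is a candidate for tail call
// optimization. No particular compiler is guaranteed to apply it.
unsigned long sum_to(unsigned long n, unsigned long acc = 0)
{
    if (n == 0)
        return acc;                 // base case: accumulated result
    return sum_to(n - 1, acc + n);  // tail call: result returned unchanged
}
```

Compiling this file to assembly (for example with the -S option) and checking whether the body of sum_to contains a call to itself, or a jump or loop instead, shows whether the optimization was applied.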
subu
Beginner
The reason I used __attribute__((noinline)) was to emulate a C++ virtual function call which cannot be inlined. I did not use a C++ example in my first post in the interest of clarity and simplicity.
In my C++ tests, each compiler produces the same assembly for the virtual call as it did in my first post: the gcc compiler tail optimizes and the Intel compiler does not. Are there any cases where the Intel compiler *does* tail optimize?
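
For readers who want to try the virtual function variant, here is a minimal sketch of what such a test might look like. The names Base, Doubler, and forward are hypothetical; the actual C++ test code was not posted. The virtual call is the last action in forward(), so the compiler cannot inline it without knowing the dynamic type, but it may still replace the call/ret pair with a single indirect jump.

```cpp
// Hypothetical reconstruction, not the code from the original tests.
struct Base {
    virtual ~Base() {}
    virtual int value(int x) const = 0;
};

struct Doubler : Base {
    int value(int x) const override { return 2 * x; }
};

// The virtual call is in tail position: its result is returned unchanged
// and no cleanup is needed afterwards. A compiler that performs tail call
// optimization may emit an indirect jmp here instead of call followed by ret.
int forward(const Base& b, int x)
{
    return b.value(x);
}
```

Compiling to assembly (for example with the -S option) and inspecting forward() shows which form a given compiler chose.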
Dale_S_Intel
Employee
Thanks for bringing this to our attention. I've submitted an issue on this and will let you know when it's addressed.

Dale