kfsone
New Contributor I

Branch reduction?

The current C++ compiler under Linux, invoked with -S -O2 simpletest.cpp and pretty much any -march setting, compiles the following code:
[cpp]extern int bar(int) ;

void fooSimple(int i) {
    int j = bar(0) ;
    if ( i == 3 ) ++j ;
    bar(j) ;
}

void fooOr(int i) {
    int j = bar(0) ;
    if ( i < 3 || i > 10 ) ++j ;
    bar(j) ;
}
[/cpp]
The "if" statements produce the following assembler.
fooSimple:
[plain]        cmpl      $3, 8(%esp)                                   #5.18
        lea       1(%eax), %edx                                 #5.18
        cmove     %edx, %eax                                    #5.18
[/plain]
A conditional move, so the branch is avoided.
fooOr:
[bash]        movl      4(%esp), %edx                                 #11.2
        cmpl      $3, %edx                                      #11.11
        jl        ..B2.4        # Prob 50%                      #11.11
                                # LOE eax edx ebx ebp esi edi
..B2.3:                         # Preds ..B2.2
        cmpl      $10, %edx                                     #11.20
        jle       ..B2.5        # Prob 50%                      #11.20
                                # LOE eax ebx ebp esi edi
..B2.4:                         # Preds ..B2.3 ..B2.2
        incl      %eax                                          #11.27
                                # LOE eax ebx ebp esi edi
..B2.5:                         # Preds ..B2.4 ..B2.3
[/bash]
Two branches.
Can't one of these be eliminated? If the overhead of always executing lea 1(%eax), %edx is worth it for the single-condition case, why not for the double?
GCC uses a little trick for this code: it subtracts 3 from a copy of i and then tests, as an unsigned comparison, whether the result is 8 or higher. The carry flag from that compare feeds the sbbl, which adds 1 to j only when no borrow occurred:
[cpp]    movl    $0, (%esp)
    call    _Z3bari

    subl    $3, %ebx
    cmpl    $8, %ebx
    sbbl    $-1, %eax
    movl    %eax, 8(%ebp)

    addl    $20, %esp
    popl    %ebx
    popl    %ebp
    jmp _Z3bari
[/cpp]
Any chance you could suggest a "compare range" instruction to the CPU guys? :)
6 Replies
Dale_S_Intel
Employee

Well, the most obvious problem that I see is that you don't know if it's safe to subtract 3 from i. There's a potential for underflow there, which I think would mess things up (I'll try to construct a test case to be sure). They seem to be doing some clever stuff with this case, but I don't know how useful it is in practice or in general. I'll try to fiddle with it a bit and see what happens.

Thanks!

Dale
jimdempseyatthecove
Black Belt

Dale,

i is an integer value, not an integer reference, so it is safe to perform the subtraction. Range tests like this that rely on overflow have been done since the very early days of compiler writing, at least as far back as I can remember (40+ years). The GCC variant of the code could be changed to use the conditional move instruction too. This should be placed on your to-do list.

Jim
Dale_S_Intel
Employee

Alright, Jim has managed to convince me that this works in all cases. I'll submit a performance issue on this and update you when I have more info.

Thanks for the test case!

Dale
Dale_S_Intel
Employee

I've submitted CQ154461 on this issue.

Thanks!
Dale

kfsone
New Contributor I

Can't beat getting your computer up and running again to be greeted by "performance ticket submitted" :)
Thanks Jim and Dale :)
JenniferJ
Moderator

FYI, this issue was fixed a while back, in 13.1 and later. Hopefully you've tried the newer compiler.

Thanks again for letting us know.

Jennifer