Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Branch reduction?

kfsone
New Contributor I
668 Views
I'm using the current C++ compiler under Linux, compiling the following code with -S -O2 simpletest.cpp and pretty much any -march setting:
[cpp]extern int bar(int) ;

void fooSimple(int i) {
    int j = bar(0) ;
    if ( i == 3 ) ++j ;
    bar(j) ;
}

void fooOr(int i) {
    int j = bar(0) ;
    if ( i < 3 || i > 10 ) ++j ;
    bar(j) ;
}
[/cpp]
The "if" statements produce the following assembler.
fooSimple:
[plain]        cmpl      $3, 8(%esp)                                   #5.18
        lea       1(%eax), %edx                                 #5.18
        cmove     %edx, %eax                                    #5.18
[/plain]
That is a conditional move, so the branch is avoided.
fooOr:
[bash]        movl      4(%esp), %edx                                 #11.2
        cmpl      $3, %edx                                      #11.11
        jl        ..B2.4        # Prob 50%                      #11.11
                                # LOE eax edx ebx ebp esi edi
..B2.3:                         # Preds ..B2.2
        cmpl      $10, %edx                                     #11.20
        jle       ..B2.5        # Prob 50%                      #11.20
                                # LOE eax ebx ebp esi edi
..B2.4:                         # Preds ..B2.3 ..B2.2
        incl      %eax                                          #11.27
                                # LOE eax ebx ebp esi edi
..B2.5:                         # Preds ..B2.4 ..B2.3
[/bash]
Two branches.
Can't one of these be eliminated? If the overhead of always executing lea 1(%eax), %edx is worth it for the single-condition case, why not for the two-condition case?
GCC uses a little trick for this code: it subtracts 3 from a copy of i, and then tests whether the result, treated as an unsigned value, is 8 or higher:
[cpp]    movl    $0, (%esp)
    call    _Z3bari

    subl    $3, %ebx
    cmpl    $8, %ebx
    sbbl    $-1, %eax
    movl    %eax, 8(%ebp)

    addl    $20, %esp
    popl    %ebx
    popl    %ebp
    jmp _Z3bari
[/cpp]
Any chance you could suggest a "compare range" instruction to the CPU guys? :)
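
For reference, the same range trick can be written by hand in C++. This is only a sketch, not what either compiler emits: `outsideRange` and `fooOrBranchless` are hypothetical names, `bar` is replaced by a local stand-in that returns its argument, and the subtraction is done on an `unsigned` copy of `i` so that the wraparound is well defined in the language.

```cpp
#include <cassert>

// Hypothetical helper (not from the thread): true when i < 3 || i > 10.
// Converting to unsigned first makes the subtraction wrap modulo 2^N,
// so values below 3 become very large and fail the "< 8" test.
inline bool outsideRange(int i) {
    return static_cast<unsigned>(i) - 3u >= 8u;
}

// Stand-in for the thread's extern bar(); assumed here to return its argument.
static int bar(int x) { return x; }

// Same shape as fooOr, but returns the final value so it can be inspected.
int fooOrBranchless(int i) {
    int j = bar(0);
    j += outsideRange(i);   // adds 0 or 1, no branch needed in the source
    return bar(j);
}

// Small self-check covering both edges of the 3..10 range.
void checkFooOrBranchless() {
    assert(fooOrBranchless(2) == 1);
    assert(fooOrBranchless(3) == 0);
    assert(fooOrBranchless(10) == 0);
    assert(fooOrBranchless(11) == 1);
}
```

Whether a given compiler turns this into a single compare plus sbb or cmov is not guaranteed; it only expresses the one-comparison form in the source.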
6 Replies
Dale_S_Intel
Employee
Well, the most obvious problem that I see is that you don't know if it's safe to subtract 3 from i. There's a potential for underflow there, which I think would mess things up (I'll try to construct a test case to be sure). They seem to be doing some clever stuff with this case, but I don't know how useful it is in practice or in general. I'll try to fiddle with it a bit and see what happens.

Thanks!

Dale
jimdempseyatthecove
Honored Contributor III
Dale,

i is an integer value, not an integer reference, so it is safe to perform the subtraction on a copy. Range tests like this, relying on overflow (wraparound), have been done since the very early days of compiler writing, at least as far back as I can remember (40+ years). The GCC variant of the code could be changed to use the conditional move instruction too. This should be placed on your to-do list.

Jim
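
The claim that the subtraction is safe for every input can be spot-checked with a short C++ sketch. It assumes the usual 32-bit int, and the function names are made up for illustration. The single-compare form subtracts in `unsigned`, which mirrors what the machine-level subl does and avoids signed overflow in the source.

```cpp
#include <cassert>
#include <climits>

// Reference form from the original post.
inline bool outsideTwoCompares(int i) { return i < 3 || i > 10; }

// Single-compare form; the subtraction wraps modulo 2^N in unsigned arithmetic.
inline bool outsideOneCompare(int i) {
    return static_cast<unsigned>(i) - 3u >= 8u;
}

// Counts disagreements between the two forms over [lo, hi].
// The loop counter is long long so that hi == INT_MAX does not overflow it.
inline int countMismatches(long long lo, long long hi) {
    int mismatches = 0;
    for (long long v = lo; v <= hi; ++v) {
        int i = static_cast<int>(v);
        if (outsideTwoCompares(i) != outsideOneCompare(i)) ++mismatches;
    }
    return mismatches;
}

// Checks the neighbourhood of the range and both extremes of int.
void checkRangeTrick() {
    assert(countMismatches(-100, 100) == 0);
    assert(countMismatches(INT_MIN, INT_MIN + 100LL) == 0);
    assert(countMismatches(INT_MAX - 100LL, INT_MAX) == 0);
}
```

The values near INT_MIN are the interesting ones: subtracting 3 wraps around to a large positive number, which is still 8 or higher as unsigned, so the test still reports "outside the range".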
Dale_S_Intel
Employee
Alright, Jim has managed to convince me that this works in all cases. I'll submit a performance issue on this and update you when I have more info.

Thanks for the test case!

Dale
Dale_S_Intel
Employee
I've submitted CQ154461 on this issue.

Thanks!
Dale

kfsone
New Contributor I
Can't beat getting your computer up and running again to be greeted by "performance ticket submitted" :)
Thanks Jim and Dale :)
JenniferJ
Moderator

FYI, this issue was fixed a while back, in 13.1 and later. Hopefully you've tried the newer compiler.

Thanks again for letting us know.

Jennifer
