Intel® C++ Compiler

Branch table optimisation bug (mixed x86/x64, OS X 10.6)

jcornwall
Beginner
396 Views
Consider the following code:
[cpp]struct Tester {
  void test(int i) {
    // Five cases to trigger branch table optimisation.
    switch(i) {
      case 1: break;
      case 2: break;
      case 3: break;
      case 4: break;
      case 5: break;
    } 
  } 
};

int main() {
  Tester tester;
  tester.test(1);
}[/cpp]
Compiling this on Mac OS X 10.6.3 with Intel C++ 11.1.088 (Intel 64) using the command 'icpc -O0 test.cpp' produces a 64-bit binary. The binary runs correctly on a 64-bit kernel.

Attempting to run the 64-bit binary on a Mac OS X 10.6.3 32-bit kernel, or on Mac OS X 10.5.8, leads to a segmentation fault. The disassembly appears to implicate a branch table optimisation on the switch statement:
[plain]0x0000000100000ee8 <_ZN6Tester4testEi+0>:	push   %rbp
0x0000000100000ee9 <_ZN6Tester4testEi+1>:	mov    %rsp,%rbp
0x0000000100000eec <_ZN6Tester4testEi+4>:	sub    $0x20,%rsp
0x0000000100000ef0 <_ZN6Tester4testEi+8>:	mov    %rdi,-0x20(%rbp)
0x0000000100000ef4 <_ZN6Tester4testEi+12>:	mov    %esi,-0x10(%rbp)
0x0000000100000ef7 <_ZN6Tester4testEi+15>:	mov    -0x10(%rbp),%eax
0x0000000100000efa <_ZN6Tester4testEi+18>:	mov    %eax,-0x18(%rbp)
0x0000000100000efd <_ZN6Tester4testEi+21>:	dec    %eax
0x0000000100000eff <_ZN6Tester4testEi+23>:	cmp    $0x4,%eax
0x0000000100000f02 <_ZN6Tester4testEi+26>:	ja     0x100000f1d <_ZN6Tester4testEi+53>
0x0000000100000f04 <_ZN6Tester4testEi+28>:	mov    -0x18(%rbp),%eax
0x0000000100000f07 <_ZN6Tester4testEi+31>:	dec    %eax
0x0000000100000f09 <_ZN6Tester4testEi+33>:	lea    -0x28(%rip),%rdx        # 0x100000ee8 <_ZN6Tester4testEi>
0x0000000100000f10 <_ZN6Tester4testEi+40>:	lea    -0x20(%rip),%rcx        # 0x100000ef7 <_ZN6Tester4testEi+15>
0x0000000100000f17 <_ZN6Tester4testEi+47>:	add    (%rdx,%rax,8),%rcx
0x0000000100000f1b <_ZN6Tester4testEi+51>:	jmpq   *%rcx[/plain]
The final jump instruction targets an invalid location. I presume this is the indirect branch that follows the table lookup. However, the target (table entry + rcx) appears to be computed by reading 8 bytes from the code segment at 0x100000ee8 + 8 * (i - 1), that is, from inside the function's own code rather than from the branch table?
If I remove a single case from the switch statement, then the assembly structure changes and no crash is observed. No crash is observed on ICC 10.1.024 with the original code, which does appear to find the branch table correctly:
[plain]0x0000000100000dc3 <_ZN6Tester4testEi+35>:	lea    0x10(%rip),%rdx        # 0x100000dda <_ZN6Tester4testEi+58>
0x0000000100000dca <_ZN6Tester4testEi+42>:	lea    -0x22(%rip),%rcx        # 0x100000daf <_ZN6Tester4testEi+15>
0x0000000100000dd1 <_ZN6Tester4testEi+49>:	add    (%rdx,%rax,8),%rcx
0x0000000100000dd5 <_ZN6Tester4testEi+53>:	jmpq   *%rcx
0x0000000100000dd7 <_ZN6Tester4testEi+55>:	leaveq 
0x0000000100000dd8 <_ZN6Tester4testEi+56>:	retq   
0x0000000100000dd9 <_ZN6Tester4testEi+57>:	nop    
0x0000000100000dda <_ZN6Tester4testEi+58>:	sub    %al,(%rax)[/plain]
Note the forward reference to 0x100000dda, which is the beginning of data and presumably encodes the branch table.
8 Replies
Quoc-An_L_Intel
Moderator
Does it work with the GNU compiler?
jcornwall
Beginner
Yes. And, as already noted, the correct code is produced by the 10.1 Intel compiler.
Quoc-An_L_Intel
Moderator
Thanks, I'll reply to this thread after our investigation.
jimdempseyatthecove
Honored Contributor III
Qale,

The index range check is using eax (32-bit)
eax is diddled with following range check
The final jump has SIB with rax (64-bit)
I think the eax diddling bunged up the high order 32-bits of rax.

Jim Dempsey

Quoc-An_L_Intel
Moderator
The 10.1 compiler is not supported on OS X 10.6.x. Were you using an older linker on OS X 10.5.x with Xcode 3.1.x?

The problem you are seeing looks like an issue with Mac OS X 10.6.3 and the linker from Xcode 3.2.x.

The same object files, when linked with the earlier linker (from 10.5.8), work fine on any version of Mac OS X, including 10.6.3. We are discussing this issue with Apple engineers to understand the problem.

I'll post an update once we have a resolution.

Quoc-An_L_Intel
Moderator
As a workaround, try using the option -use-asm.

The problem is in the linker from Xcode 3.2.2, which has a bug in resolving relocations for symbols whose names start with 'L'.

By convention, the assembler suppresses all symbols whose names start with 'L', so when using gcc (or icc with the -use-asm option) the linker never sees these symbols.

The fix for this problem will be available in a future update to the compiler product.

Quoc-An_L_Intel
Moderator
Current status: this problem with the 11.1 compiler is the result of a bug in the linker that ships with Xcode 3.2.2, 3.2.3, and 3.2.4.

The workarounds are:

1) Use Xcode 3.2.1 with the 11.1 compiler.
2) Use the 11.1 compiler with the option -use-asm when building with Xcode 3.2.2, 3.2.3, or 3.2.4.
  • This should fix most cases, but even when the object file is generated through the external assembler, L* symbols may still appear in it. Those cases are usually constant string literals placed in the cstring section.
3) Use Intel Composer XE.


We are not planning to work around this linker bug in the 11.1 compiler, because the workaround is complex and might reduce the stability of the 11.1 product.



Quoc-An_L_Intel
Moderator

Xcode 3.2.5 was released last week. This version should include fixes for several linker bugs seen in previous versions, as noted in this thread.

The recommendation is to use Xcode 3.2.5 with the Intel compiler.
