The Intel Software Developer's Manual and the corresponding AMD document both indicate that the new RIP, once calculated, is truncated to the instruction's operand size.
To check whether this is actually true, I assembled a JMP instruction with a 66 prefix, which sets the operand size to 16 bits. I would expect it to jump to an address truncated to 16 bits.
Running this instruction on my AMD Steamroller CPU, I got a segmentation fault.
Under SDE, however, the trace shows the jump being taken without the destination address being truncated.
It would appear that SDE is incorrect.
Here is the assembler code:
data16 jmp y # This instruction seg faults on AMD processor
Here is the trace from SDE:
TID0: INS 0x0000000100401000 BASE jmp 0x100401002
TID0: INS 0x0000000100401002 BASE jmp 0x100401005
TID0: Read 0x76e15a4d = *(UINT64*)000000000022FF58
TID0: INS 0x0000000100401005 BASE ret | rsp = 0x22ff60
Here is the objdump of the code:
0: eb 00 jmp 2 <x>
2: 66 eb 00 data16 jmp 5 <y>
5: c3 ret
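To make the two behaviors concrete, here is a rough Python sketch of the address arithmetic only. It is not taken from either manual: the function name `near_jump_target` and its parameters are made up for illustration, and it simply assumes "truncation" means masking the computed RIP to the operand size. The inputs come from the trace and objdump above (the `66 eb 00` instruction sits at 0x100401002, is 3 bytes long, and has a displacement of 0).

```python
def near_jump_target(insn_addr, insn_len, disp, operand_bits):
    """Destination of a relative near jump, masked to operand_bits.

    Hypothetical helper: models only the arithmetic discussed in this
    thread, not any particular processor.
    """
    next_rip = insn_addr + insn_len          # RIP of the following instruction
    mask = (1 << operand_bits) - 1           # e.g. 0xFFFF for a 16-bit operand size
    return (next_rip + disp) & mask


# "66 eb 00" from the objdump, at the address shown in the SDE trace.
addr, length, disp = 0x100401002, 3, 0

truncated = near_jump_target(addr, length, disp, 16)  # 66 prefix honored
full = near_jump_target(addr, length, disp, 64)       # 66 prefix has no effect

print(hex(truncated))  # 0x1005
print(hex(full))       # 0x100401005
```

The 64-bit result matches the `ret` address in the SDE trace. The 16-bit result, 0x1005, is presumably not a mapped address in this process, which would be consistent with the segmentation fault.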
I've also verified the same behavior for Jcc.
For JRCXZ, however, both docs say the jump address is truncated, yet on my AMD processor I do not get a segmentation fault. I have verified that the branch is actually taken.
I haven't tried CALL, nor a JMP with a 32-bit displacement.
This is not an SDE bug. There are some historic differences in ISA implementation between Intel and AMD processors, and RIP handling for 66-prefixed near branches in 64-bit mode is one of them. FWIW, your code runs fine on Intel processors.