RSP register is rarely shifted by 8 bytes

DAI

We are a vendor providing network products to Japanese telecommunications carriers.

We provide equipment that aggregates end-user connections on corporate circuits covering all of Japan.

Our products use Intel SoCs such as Denverton.
These products are manufactured by our ODM partners.

I am posting this question on behalf of our company.

 

Description:

In very rare cases, when certain code is executed, the address held in the RSP
register is off by 8 bytes. Our program is then aborted by the stack protector.

As far as we can tell, there is no problem with the source code. I also could
not find anything in the disassembly that would cause RSP to shift, and the
same code normally runs without any problems.

The contents of the stack frame are consistent and no overflow is occurring.

The most plausible explanation we have is that the RSP register is not updated
when a retq or pop instruction executes. Is such a thing possible?
If so, could you please tell us how to investigate?


Prerequisites:

- CPU: Intel Atom CPU C3508
- Microcode: sig=0x506f1, pf=0x1, revision=0x32
- UEFI or Legacy BIOS: UEFI
- Linux Kernel: 4.19.171
- OS: Debian GNU/Linux 10
- GCC: 8.3.0
- Kernel/Userland apps: no 32-bit/i386 software in use
- Reproducible?: yes, but occurs only rarely
- Machine dependency: unknown; reproduced on multiple machines, but the PC value differs between them
- Link pthread in crashed program?: yes
- Use pthread in crashed program?: no
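
Since the problem was reproduced on multiple machines, it may help to record the microcode revision of each one alongside the values listed above. The following is only a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo exposes a "microcode" field; the function name microcode_revisions is hypothetical.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo-style text."""
    # Each logical CPU has its own "microcode : 0x..." line on x86 Linux.
    matches = re.findall(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return {int(m, 16) for m in matches}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        # More than one value here would mean the cores run different revisions.
        print(sorted(hex(r) for r in microcode_revisions(path.read_text())))
```

On the machine described above, this would be expected to report the revision 0x32 given in the list.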


Backtraces:

This app was built with '-fstack-protector'.

The same rare shift also occurs in other code. This report covers the
backtrace from the location where it occurs most often.

>>> bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f83163b2535 in __GI_abort () at abort.c:79
#2 0x00007f8316409508 in __libc_message (action=<optimized out>, fmt=fmt@entry=0x7f831651407b "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:181
#3 0x00007f831649a80d in __GI___fortify_fail_abort (need_backtrace=need_backtrace@entry=false, msg=msg@entry=0x7f8316514059 "stack smashing detected") at fortify_fail.c:28
#4 0x00007f831649a7c2 in __stack_chk_fail () at stack_chk_fail.c:29
#5 0x00007f83167f834b in aplog_conf_find (filter=0x7f83167f79a0 <filter>, arg=0x7ffff8633890) at aplog_conf.c:153

>>> f 5
>>> disas
Dump of assembler code for function aplog_conf_find:
0x00007f83167f8250 <+0>: push %r14
(snip)
0x00007f83167f82d3 <+131>: mov %r13,%rdi
0x00007f83167f82d6 <+134>: callq 0x7f83167f7130 <confent_next@plt>
0x00007f83167f82db <+139>: test %eax,%eax <= the memory at the current RSP holds this address (return address of the confent_next@plt call)??
0x00007f83167f82dd <+141>: jne 0x7f83167f82b0 <aplog_conf_find+96>
0x00007f83167f82df <+143>: xor %ebx,%ebx
0x00007f83167f82e1 <+145>: mov 0x28(%rsp),%rcx <= load stack guard
0x00007f83167f82e6 <+150>: xor %fs:0x28,%rcx <= check stack guard
0x00007f83167f82ef <+159>: mov %rbx,%rax
0x00007f83167f82f2 <+162>: jne 0x7f83167f8346 <aplog_conf_find+246> <= failed check stack guard
0x00007f83167f82f4 <+164>: add $0x30,%rsp
0x00007f83167f82f8 <+168>: pop %rbx
0x00007f83167f82f9 <+169>: pop %rbp
0x00007f83167f82fa <+170>: pop %r12
0x00007f83167f82fc <+172>: pop %r13
0x00007f83167f82fe <+174>: pop %r14
0x00007f83167f8300 <+176>: retq
0x00007f83167f8301 <+177>: nopl 0x0(%rax)
0x00007f83167f8308 <+184>: callq 0x7f83167f72f0 <aplog_conf_update@plt>
0x00007f83167f830d <+189>: jmpq 0x7f83167f827f <aplog_conf_find+47>
0x00007f83167f8312 <+194>: callq 0x7f83167f7080 <__errno_location@plt>
0x00007f83167f8317 <+199>: xor %ebx,%ebx
0x00007f83167f8319 <+201>: mov (%rax),%edi
0x00007f83167f831b <+203>: callq 0x7f83167f7380 <strerror@plt>
0x00007f83167f8320 <+208>: lea 0x1df1(%rip),%rcx # 0x7f83167fa118
0x00007f83167f8327 <+215>: mov $0x2,%edi
0x00007f83167f832c <+220>: lea 0x1e7d(%rip),%rdx # 0x7f83167fa1b0 <__func__.4720>
0x00007f83167f8333 <+227>: mov %rax,%r8
0x00007f83167f8336 <+230>: lea 0x1da1(%rip),%rsi # 0x7f83167fa0de
0x00007f83167f833d <+237>: xor %eax,%eax
0x00007f83167f833f <+239>: callq 0x7f83167f72b0 <aplog_printf@plt>
0x00007f83167f8344 <+244>: jmp 0x7f83167f82e1 <aplog_conf_find+145>
0x00007f83167f8346 <+246>: callq 0x7f83167f7120 <__stack_chk_fail@plt> <= will call abort()
End of assembler dump.

>>> p/x $rsp
$1 = 0x7ffff8633808

>>> x/32xg $rsp-0x20
0x7ffff86337e8: 0x00007f83167fd260 0x00007f831649a7c2
0x7ffff86337f8: 0x000000000000000f 0x00007f83167f834b
0x7ffff8633808: 0x00007f83167f82db 0x0000001400000013
^- current RSP ^- expected RSP???
0x7ffff8633818: 0x00007f83ffffffff 0x00007f83167fd290
0x7ffff8633828: 0x00007f83167fd2b0 0x00007f83167f79a0
^- 0x28(rsp) for current RSP
0x7ffff8633838: 0x1ddb8024f76e9600 0x0000000000000000
^- 0x28(rsp) of expected RSP = stack guard value => RSP is shifted???
0x7ffff8633848: 0x00007f831622c6b8 0x0000000000000000
0x7ffff8633858: 0x00007f8316a79427 0x00007f8316a7cf38
0x7ffff8633868: 0x00007f83167f7bb6 0x00007f83167f79a0
0x7ffff8633878: 0x00007f8316a7c600 0x00007ffff8633890
0x7ffff8633888: 0x00007f830000000d 0x0004030200000000
0x7ffff8633898: 0x00007f8316a79427 0x00007f8316a7cf38
0x7ffff86338a8: 0x1ddb8024f76e9600 0x0000000000000000
0x7ffff86338b8: 0x00007f831622c6b8 0x0000000000000000
0x7ffff86338c8: 0x1ddb8024f76e9600 0x00007f8316a7cf70
0x7ffff86338d8: 0x00007f83167f7cf6 0x00007ffff86338f0


Details:

Full disassembly of the function in which the error occurred.

>>> disas
Dump of assembler code for function aplog_conf_find:
0x00007f83167f8250 <+0>: push %r14
0x00007f83167f8252 <+2>: mov %rsi,%r14
0x00007f83167f8255 <+5>: push %r13
0x00007f83167f8257 <+7>: push %r12
0x00007f83167f8259 <+9>: push %rbp
0x00007f83167f825a <+10>: mov %rdi,%rbp
0x00007f83167f825d <+13>: push %rbx
0x00007f83167f825e <+14>: sub $0x30,%rsp
0x00007f83167f8262 <+18>: mov %fs:0x28,%rax
0x00007f83167f826b <+27>: mov %rax,0x28(%rsp)
0x00007f83167f8270 <+32>: xor %eax,%eax
0x00007f83167f8272 <+34>: callq 0x7f83167f7210 <aplog_conf_isupdate@plt>
0x00007f83167f8277 <+39>: test %eax,%eax
0x00007f83167f8279 <+41>: jne 0x7f83167f8308 <aplog_conf_find+184>
0x00007f83167f827f <+47>: xor %esi,%esi
0x00007f83167f8281 <+49>: mov $0x2,%edi
0x00007f83167f8286 <+54>: callq 0x7f83167f7320 <aplog_strage_get@plt>
0x00007f83167f828b <+59>: mov %rax,%r12
0x00007f83167f828e <+62>: test %rax,%rax
0x00007f83167f8291 <+65>: je 0x7f83167f8312 <aplog_conf_find+194>
0x00007f83167f8293 <+67>: mov %rsp,%r13 <= save RSP to R13 register
0x00007f83167f8296 <+70>: mov 0x18(%rax),%rsi
0x00007f83167f829a <+74>: xor %edx,%edx
0x00007f83167f829c <+76>: mov %r13,%rdi
0x00007f83167f829f <+79>: callq 0x7f83167f7150 <confent_iter_init@plt>
0x00007f83167f82a4 <+84>: jmp 0x7f83167f82d3 <aplog_conf_find+131>
0x00007f83167f82a6 <+86>: nopw %cs:0x0(%rax,%rax,1)
0x00007f83167f82b0 <+96>: mov 0x18(%r12),%rdi
0x00007f83167f82b5 <+101>: mov 0x4(%rsp),%esi
0x00007f83167f82b9 <+105>: callq 0x7f83167f7090 <confent_idx2addr@plt>
0x00007f83167f82be <+110>: lea 0x8(%rax),%rbx
0x00007f83167f82c2 <+114>: test %rbp,%rbp
0x00007f83167f82c5 <+117>: je 0x7f83167f82e1 <aplog_conf_find+145>
0x00007f83167f82c7 <+119>: mov %r14,%rsi
0x00007f83167f82ca <+122>: mov %rbx,%rdi
0x00007f83167f82cd <+125>: callq *%rbp
0x00007f83167f82cf <+127>: test %eax,%eax
0x00007f83167f82d1 <+129>: jne 0x7f83167f82e1 <aplog_conf_find+145>
0x00007f83167f82d3 <+131>: mov %r13,%rdi
0x00007f83167f82d6 <+134>: callq 0x7f83167f7130 <confent_next@plt>
0x00007f83167f82db <+139>: test %eax,%eax <= return address pushed at RSP-8 by the call to confent_next@plt??
0x00007f83167f82dd <+141>: jne 0x7f83167f82b0 <aplog_conf_find+96>
0x00007f83167f82df <+143>: xor %ebx,%ebx
0x00007f83167f82e1 <+145>: mov 0x28(%rsp),%rcx <= load stack guard
0x00007f83167f82e6 <+150>: xor %fs:0x28,%rcx <= check stack guard
0x00007f83167f82ef <+159>: mov %rbx,%rax
0x00007f83167f82f2 <+162>: jne 0x7f83167f8346 <aplog_conf_find+246> <= failed check stack guard
0x00007f83167f82f4 <+164>: add $0x30,%rsp
0x00007f83167f82f8 <+168>: pop %rbx
0x00007f83167f82f9 <+169>: pop %rbp
0x00007f83167f82fa <+170>: pop %r12
0x00007f83167f82fc <+172>: pop %r13
0x00007f83167f82fe <+174>: pop %r14
0x00007f83167f8300 <+176>: retq
0x00007f83167f8301 <+177>: nopl 0x0(%rax)
0x00007f83167f8308 <+184>: callq 0x7f83167f72f0 <aplog_conf_update@plt>
0x00007f83167f830d <+189>: jmpq 0x7f83167f827f <aplog_conf_find+47>
0x00007f83167f8312 <+194>: callq 0x7f83167f7080 <__errno_location@plt>
0x00007f83167f8317 <+199>: xor %ebx,%ebx
0x00007f83167f8319 <+201>: mov (%rax),%edi
0x00007f83167f831b <+203>: callq 0x7f83167f7380 <strerror@plt>
0x00007f83167f8320 <+208>: lea 0x1df1(%rip),%rcx # 0x7f83167fa118
0x00007f83167f8327 <+215>: mov $0x2,%edi
0x00007f83167f832c <+220>: lea 0x1e7d(%rip),%rdx # 0x7f83167fa1b0 <__func__.4720>
0x00007f83167f8333 <+227>: mov %rax,%r8
0x00007f83167f8336 <+230>: lea 0x1da1(%rip),%rsi # 0x7f83167fa0de
0x00007f83167f833d <+237>: xor %eax,%eax
0x00007f83167f833f <+239>: callq 0x7f83167f72b0 <aplog_printf@plt>
0x00007f83167f8344 <+244>: jmp 0x7f83167f82e1 <aplog_conf_find+145>
0x00007f83167f8346 <+246>: callq 0x7f83167f7120 <__stack_chk_fail@plt> <= will call abort()
End of assembler dump.

>>> disas 0x7f83167f7130
Dump of assembler code for function confent_next@plt:
0x00007f83167f7130 <+0>: jmpq *0x5f62(%rip) # 0x7f83167fd098 <confent_next@got.plt>
0x00007f83167f7136 <+6>: pushq $0x10
0x00007f83167f713b <+11>: jmpq 0x7f83167f7020
End of assembler dump.
>>> x/1xg 0x7f83167fd098
0x7f83167fd098 <confent_next@got.plt>: 0x00007f83167f9140
>>> disas 0x00007f83167f9140
Dump of assembler code for function confent_next:
0x00007f83167f9140 <+0>: push %rbp
0x00007f83167f9141 <+1>: push %rbx
0x00007f83167f9142 <+2>: sub $0x8,%rsp
0x00007f83167f9146 <+6>: cmpl $0xffffffff,0x4(%rdi)
0x00007f83167f914a <+10>: je 0x7f83167f9160 <confent_next+32>
0x00007f83167f914c <+12>: mov 0x8(%rdi),%ebp
0x00007f83167f914f <+15>: xor %eax,%eax
0x00007f83167f9151 <+17>: cmp $0xffffffff,%ebp
0x00007f83167f9154 <+20>: jne 0x7f83167f916d <confent_next+45>
0x00007f83167f9156 <+22>: add $0x8,%rsp
0x00007f83167f915a <+26>: pop %rbx
0x00007f83167f915b <+27>: pop %rbp
0x00007f83167f915c <+28>: retq
0x00007f83167f915d <+29>: nopl (%rax)
0x00007f83167f9160 <+32>: mov 0x18(%rdi),%rax
0x00007f83167f9164 <+36>: mov (%rax),%ebp
0x00007f83167f9166 <+38>: xor %eax,%eax
0x00007f83167f9168 <+40>: cmp $0xffffffff,%ebp
0x00007f83167f916b <+43>: je 0x7f83167f9156 <confent_next+22>
0x00007f83167f916d <+45>: mov %rdi,%rbx
0x00007f83167f9170 <+48>: mov 0x10(%rdi),%rdi
0x00007f83167f9174 <+52>: mov %ebp,%esi
0x00007f83167f9176 <+54>: callq 0x7f83167f7090 <confent_idx2addr@plt>
0x00007f83167f917b <+59>: mov %ebp,0x4(%rbx)
0x00007f83167f917e <+62>: mov (%rax),%edx
0x00007f83167f9180 <+64>: mov 0x4(%rax),%eax
0x00007f83167f9183 <+67>: mov %edx,(%rbx)
0x00007f83167f9185 <+69>: mov %eax,0x8(%rbx)
0x00007f83167f9188 <+72>: add $0x8,%rsp
0x00007f83167f918c <+76>: mov $0x1,%eax
0x00007f83167f9191 <+81>: pop %rbx
0x00007f83167f9192 <+82>: pop %rbp
0x00007f83167f9193 <+83>: retq
End of assembler dump.
>>> disas 0x7f83167f7090
Dump of assembler code for function confent_idx2addr@plt:
0x00007f83167f7090 <+0>: jmpq *0x5fb2(%rip) # 0x7f83167fd048 <confent_idx2addr@got.plt>
0x00007f83167f7096 <+6>: pushq $0x6
0x00007f83167f709b <+11>: jmpq 0x7f83167f7020
End of assembler dump.
>>> disas 0x00007f83167f90e0
Dump of assembler code for function confent_idx2addr:
0x00007f83167f90e0 <+0>: mov 0x14(%rdi),%eax
0x00007f83167f90e3 <+3>: movslq %esi,%rsi
0x00007f83167f90e6 <+6>: add $0x8,%rax
0x00007f83167f90ea <+10>: imul %rsi,%rax
0x00007f83167f90ee <+14>: lea 0x30(%rdi,%rax,1),%rax
0x00007f83167f90f3 <+19>: retq
End of assembler dump.


The issue appears to unfold as follows.

* 0x00007f83167f8293: save RSP to R13 register
* 0x00007f83167f82d6: call confent_next@plt, which pushes the return address at (RSP-8)
* 0x00007f83167f82db: return from confent_next@plt, RSP is shifted???
* 0x00007f83167f82e1: load stack guard, RSP is already shifted
* 0x00007f83167f82f2: stack guard check fails, call abort()

The R13 register still held the correct RSP value.

>>> p/x $r13
$4 = 0x7ffff8633810 => valid RSP
>>> p/x $rsp
$5 = 0x7ffff8633808 => shifted RSP

If the push or pop instructions were at fault, the program would crash with
SIGSEGV at the retq instruction. We therefore assume that push and pop work
correctly, which leaves retq as the only instruction that seems to be involved.
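
To illustrate this reasoning, here is a small model of the stack operations in confent_next on the path that returns without calling confent_idx2addr. It is only a sketch of the hypothesis, not a statement about how the CPU behaves: the Stack class, the call_confent_next function and the ret_updates_rsp flag are all hypothetical, and the flag simply models "retq loads the return address but does not add 8 to RSP".

```python
class Stack:
    """Minimal model of a downward-growing stack of 8-byte slots."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.mem = {}

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value


def call_confent_next(stack, return_addr, ret_updates_rsp=True):
    """Replay the stack effects of callq + confent_next + retq."""
    stack.push(return_addr)       # callq pushes the return address
    stack.push("saved rbp")       # push %rbp
    stack.push("saved rbx")       # push %rbx
    stack.rsp -= 8                # sub $0x8,%rsp
    stack.rsp += 8                # add $0x8,%rsp
    stack.pop()                   # pop %rbx
    stack.pop()                   # pop %rbp
    rip = stack.mem[stack.rsp]    # retq loads the return address
    if ret_updates_rsp:           # hypothetical fault: this step is skipped
        stack.rsp += 8
    return rip


R13 = 0x7ffff8633810              # RSP saved at +67, from "p/x $r13"
RETURN_ADDR = 0x7f83167f82db      # instruction after the callq at +134

normal = Stack(R13)
faulty = Stack(R13)
normal_rip = call_confent_next(normal, RETURN_ADDR)
faulty_rip = call_confent_next(faulty, RETURN_ADDR, ret_updates_rsp=False)

print(hex(normal.rsp))            # 0x7ffff8633810
print(hex(faulty.rsp))            # 0x7ffff8633808
print(hex(faulty.mem[faulty.rsp]))  # 0x7f83167f82db
```

In the faulty case, execution still continues at the correct address, RSP ends up 8 bytes below R13, and the word at RSP is the return address of the call. That matches the register values and the stack dump shown in this report.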

ouyxy

Any progress? Is this issue fixed?

Thanks

DAI

This issue has been fixed by a BIOS patch provided by Intel.
