I'm working on a robot powered by an Edison, and every so often I get a kernel panic, which makes it practically unusable. I've never dealt with a kernel panic before and I'm not really sure where to start.
It's very hard to reproduce, but my code is available here: https://github.com/TeamCharizard/RBE2002. Essentially, sometimes when I run my code I get a kernel panic and lose all connection to and control over the robot. A crash log is generated and the following is printed out:
[ 317.760261] BUG: unable to handle kernel paging request at 89892636
[ 317.760373] IP:  kmem_cache_alloc_trace+0x8d/0x1a0
[ 317.760463] *pdpt = 00000000350da001 *pde = 0000000000000000
[ 317.760542] Oops: 0000 [#1] PREEMPT SMP
[ 317.760610] Modules linked in: usb_f_acm u_serial g_multi libcomposite bcm_bt_lpm bcm4334x(O)
[ 317.760745] CPU: 1 PID: 528 Comm: teleop Tainted: G O 3.10.17-poky-edison+ #1
[ 317.760827] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
[ 317.760915] task: f5e6b7a0 ti: f6c96000 task.ti: f6c96000
[ 317.760978] EIP: 0060: EFLAGS: 00010286 CPU: 1
[ 317.761043] EIP is at kmem_cache_alloc_trace+0x8d/0x1a0
[ 317.761103] EAX: 00000000 EBX: f73fe500 ECX: 000001dd EDX: 000001db
[ 317.761170] ESI: 89892636 EDI: f6c01a00 EBP: f6c97d20 ESP: f6c97cf4
[ 317.761237] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 317.761298] CR0: 8005003b CR2: 89892636 CR3: 35016000 CR4: 001007f0
[ 317.761364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 317.761429] DR6: ffff0ff0 DR7: 00000400
[ 317.761472] Stack:
[ 317.761503] b1998fe6 f6c96000 f6c97d08 c15596c7 00000800 f6c96000 000080d0 89892636
[ 317.761643] c1ddba80 f6c15120 c1ddba80 f6c97d40 c15596c7 c14d8bce f6c97d50 c15523b0
[ 317.761781] c1ddba80 00000001 c1ddba80 f6c97d50 c155a019 c1ddbfc1 00000001 f6c97d84
[ 317.761918] Call Trace:
[ 317.761976]  ? dma_init_common+0x27/0x130
[ 317.762050]  dma_init_common+0x27/0x130
[ 317.762120]  ? __const_udelay+0x1e/0x20
[ 317.762187]  ? set_mctrl+0xa0/0x1d0
[ 317.762253]  intel_dma_init+0x19/0x70
[ 317.762320]  serial_hsu_startup+0x31c/0xb30
[ 317.762392]  ? sub_preempt_count+0x95/0xf0
[ 317.762467]  uart_startup.part.8+0x4a/0x1c0
[ 317.762537]  ? hsu_show_regs_open+0x20/0x20
[ 317.762608]  uart_open+0xfa/0x1c0
[ 317.762673]  ? check_tty_count+0x43/0xb0
[ 317.762745]  tty_open+0x16c/0x5e0
[ 317.762813]  chrdev_open+0x77/0x180
[ 317.762878]  ? cdev_put+0x20/0x20
[ 317.762942]  do_dentry_open+0x1d4/0x270
[ 317.763009]  ? cdev_put+0x20/0x20
[ 317.763074]  finish_open+0x2b/0x50
[ 317.763141]  do_last+0x49a/0xd50
[ 317.763212]  path_openat+0xa1/0x3f0
[ 317.763283]  ? get_parent_ip+0xb/0x40
[ 317.763355]  do_filp_open+0x2b/0x90
[ 317.763431]  do_sys_open+0xe5/0x1c0
[ 317.763501]  SyS_open+0x22/0x30
[ 317.763566]  syscall_call+0x7/0xb
[ 317.763622] Code: da 27 5c 00 8b 45 e8 8b 55 f0 8b 40 08 a8 08 0f 85 d9 00 00 00 8b 03 85 c0 89 45 f0 0f 84 dc 00 00 00 8b 47 14 8b 75 f0 8d 4a 02 <8b> 1c 06 8b 37 8b 45 f0 64 0f c7 0e 0f 94 c0 84 c0 74 a3 03 5f
[ 317.764302] EIP:  kmem_cache_alloc_trace+0x8d/0x1a0 SS:ESP 0068:f6c97cf4
[ 317.764402] CR2: 0000000089892636
[ 317.764452] ---[ end trace 991789f17125e322 ]---
I'll attach a crash log tarball when I can get my hands on one. Any ideas what's going on? The call trace mentions UART (uart_open, serial_hsu_startup).
We'd like to try to reproduce the issue. I haven't seen this error before. Can you provide as many details as possible, such as the image, external circuitry, additional hardware, commands needed to reproduce, connections, and which guides you are following?
Well, I have several other crash dumps, which you can view here. It seems to be unrelated to UART. It happens when I start a program, in the middle of running one, or even when I'm just trying to SSH in! It's incredibly hard to reproduce on demand, but it does occur frequently.
Info on the crashes is here:
https://gist.github.com/PeterMitrano/a61c9602890ca589693f
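To compare several dumps side by side, it can help to pull the same few fields out of each one. The following is only a minimal sketch, assuming the dumps use the same 32-bit x86 oops layout as the trace posted above; the function name `summarize_oops` and the regular expressions are my own, not part of any Edison tooling.

```python
import re

# Patterns for three lines of an oops formatted like the one posted above.
BUG_RE = re.compile(r"BUG: (.+?) at ([0-9a-fA-F]+)")
EIP_RE = re.compile(r"EIP is at (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")
COMM_RE = re.compile(r"PID: (\d+) Comm: (\S+)")

def summarize_oops(text):
    """Return fault type, faulting address, function and process for one oops."""
    bug = BUG_RE.search(text)
    eip = EIP_RE.search(text)
    comm = COMM_RE.search(text)
    return {
        "fault": bug.group(1) if bug else None,
        "address": bug.group(2) if bug else None,
        "function": eip.group(1) if eip else None,
        "pid": int(comm.group(1)) if comm else None,
        "comm": comm.group(2) if comm else None,
    }

# Three lines copied from the trace in the first post.
sample = """\
[ 317.760261] BUG: unable to handle kernel paging request at 89892636
[ 317.760745] CPU: 1 PID: 528 Comm: teleop Tainted: G O 3.10.17-poky-edison+ # 1
[ 317.761043] EIP is at kmem_cache_alloc_trace+0x8d/0x1a0
"""

if __name__ == "__main__":
    print(summarize_oops(sample))
```

Running this over each dump shows at a glance whether the crashes share a faulting function or process, or whether they are scattered across unrelated code paths.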
Other info: The Edison does not reboot; it just prints out this info. If it occurs during an SSH session, that SSH session dies and I have to kill the program over serial.
There are no crash tarballs created either.
I also checked free memory: there is almost 900 MB free, and we're only using about 1 GB of the flash storage. The problem occurs even when no external circuitry is attached.
I reflashed the Edison, and it still occurs. So either there's a bug in the firmware or my physical Edison is somehow broken.
Does anyone have debugging advice? I want to know whether my Edison is broken or whether it's just the firmware.
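For anyone who wants to log free memory over time rather than check it by hand, here is a rough sketch that parses the standard Linux `/proc/meminfo` format. The helper names `parse_meminfo` and `free_mb` are hypothetical, and the sample values are made up for illustration; they are not readings from my board.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def free_mb(fields):
    """Return MemFree converted from kB to MB."""
    return fields["MemFree"] / 1024.0

# Illustrative values only, not taken from the Edison in this thread.
sample = """\
MemTotal:         983412 kB
MemFree:          921600 kB
Buffers:            8120 kB
"""

if __name__ == "__main__":
    print(free_mb(parse_meminfo(sample)))
    # On the device itself, read the real file instead of the sample:
    # with open("/proc/meminfo") as f:
    #     print(free_mb(parse_meminfo(f.read())))
```

Logging this value every few seconds up to a crash would show whether free memory drops beforehand or stays flat.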
What programs are you running? Are you using any IDEs?
Do you have access to the Linux console, or do you receive the error right after you SSH in?
Is this error displayed after running a specific command or does it appear automatically?
How are you powering your Edison and connecting it to your computer? Are you using a mini breakout board or an Arduino expansion board?
Have you always received this error, or did it start to happen after you made some changes or modifications? If so, which ones? Any guidance on how to reproduce this issue?
I haven't been using the Edison since. I've left it running headless, without our Arduino shield, and it hasn't happened for a few days.
My finals are next week, but over winter break I'll try to figure out how to reproduce it. Sorry I can't do much right now!