
Kernel Panic on MIC boot

oplehto

After upgrading a node from MPSS Gold Update 1 to Update 2, I have had issues with the frontend node in our cluster crashing on boot. I tried downgrading back to Update 1, but the crash still happens.

We have upgraded the compute nodes successfully. They have identical hardware and a bridged network configuration. The frontend has the default configuration in /etc/sysconfig/mic.

The host OS is CentOS 6.3 and the card model is 5110P (B1).

On the host side we get the following error during boot:

[plain]micscif_handle_lostnode 1250 node 1[/plain]

On the MIC I can see the following kernel panic during the early initialization:

[plain]

[    0.010000] SFI: Entering sfi_map_memory, phys = eefa0, size = 24
[    0.010000] SFI: sfi_map_table, th = ffff8800000eefa0
[    0.010000] SFI: Entering sfi_map_memory, phys = eefa0, size = 312
[    3.530141] i8042: Can't read CTR while initializing i8042
[    7.058807] Kernel panic - not syncing: Attempted to kill init!
[    7.058848] Pid: 1, comm: switch_root Tainted: G        W   2.6.38.8-g32944d0 #2
[    7.058875] Call Trace:
[    7.058912]  [<ffffffff8134e076>] ? panic+0x91/0x18c
[    7.058944]  [<ffffffff81036666>] ? do_exit+0x7b/0x768
[    7.058971]  [<ffffffff81036fcf>] ? do_group_exit+0x6c/0x9f
[    7.058997]  [<ffffffff81037019>] ? __wake_up_parent+0x0/0x28
[    7.059028]  [<ffffffff81002aab>] ? system_call_fastpath+0x16/0x1b
[    7.070124] mic_shutdown: system state 57005 dbreg 0x8000dead
[/plain]
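
For anyone comparing similar logs, here is a minimal Python sketch that pulls the panic line and the `mic_shutdown` values out of a console dump like the one above. The function names and the regular expressions are my own assumptions based only on the lines shown in this post, not on any documented MPSS log format. It also shows that the decimal system state 57005 is 0xdead, which matches the low 16 bits of the reported dbreg value; what that value means to the driver is not something this post establishes.

```python
import re

# Assumed line shape, taken from the log above: "[  seconds.micros] message"
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def find_panic(log_text):
    """Return (timestamp, message) of the first kernel panic line, or None."""
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("msg").startswith("Kernel panic"):
            return float(m.group("ts")), m.group("msg")
    return None


def shutdown_state(log_text):
    """Return (system_state, dbreg) as integers from a mic_shutdown line, or None."""
    m = re.search(
        r"mic_shutdown: system state (\d+) dbreg (0x[0-9a-fA-F]+)", log_text
    )
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2), 16)


# Two lines copied from the log in this post.
SAMPLE = """\
[    7.058807] Kernel panic - not syncing: Attempted to kill init!
[    7.070124] mic_shutdown: system state 57005 dbreg 0x8000dead
"""

if __name__ == "__main__":
    print(find_panic(SAMPLE))
    state, dbreg = shutdown_state(SAMPLE)
    # 57005 decimal is 0xdead, the same as the low 16 bits of dbreg.
    print(hex(state), hex(dbreg & 0xFFFF))
```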

oplehto

I removed all the persistent MIC-related directories that were not cleaned up by the RPM removal (/etc/sysconfig/mic, /opt/intel/mic). I also noticed that the OFED drivers were missing and installed them.

It seems that one of these two actions helped, and the card now boots normally again.
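
In case it helps someone else check for the same leftovers before reinstalling, a small Python sketch follows. It only reports which of the two directories named above still exist and does not delete anything. The helper name `leftover_paths` is hypothetical, and the path list is just the two locations from this post; other MPSS versions may leave files elsewhere.

```python
import os

# The two locations mentioned in this thread; adjust for your own install.
MIC_PATHS = ["/etc/sysconfig/mic", "/opt/intel/mic"]


def leftover_paths(paths=MIC_PATHS):
    """Return the subset of paths that still exist on this host."""
    return [p for p in paths if os.path.exists(p)]


if __name__ == "__main__":
    # Report only; removing the directories is left to the administrator.
    for path in leftover_paths():
        print("still present:", path)
```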
