Hello!
I have a problem running an X710-DA2 on my servers. When I try to load the i40e driver, the kernel crashes with a divide error. It happens with the stock firmware and driver as well as with the upgraded versions.
Platform: Supermicro X9DRW with dual Intel(R) Xeon(R) CPU E5-2620, latest BIOS
OS: Ubuntu 16.04.1 LTS, linux 4.4.0-57
NIC firmware: fw 5.0.40043 api 1.5 nvm 5.04 0x800024c6 0.0.0 (latest)
i40e driver: 1.5.25 (latest, downloaded and compiled)
Modules installed: GBC Photonics SP-MM85030D-GP -SFP+
dmesg:
Jan 3 18:01:58 ceph6 kernel: [ 739.510036] i40e: Intel(R) 40-10 Gigabit Ethernet Connection Network Driver - version 1.5.25
Jan 3 18:01:58 ceph6 kernel: [ 739.510041] i40e: Copyright(c) 2013 - 2016 Intel Corporation.
Jan 3 18:01:58 ceph6 kernel: [ 739.527324] i40e 0000:04:00.0: fw 5.0.40043 api 1.5 nvm 5.04 0x800024c6 0.0.0
Jan 3 18:01:58 ceph6 kernel: [ 739.765165] i40e 0000:04:00.0: MAC address: 3c:fd:fe:a2:19:54
Jan 3 18:01:58 ceph6 kernel: [ 739.789909] i40e 0000:04:00.0: AQ command Config VSI BW allocation per TC failed = 14
Jan 3 18:01:58 ceph6 kernel: [ 739.789912] i40e 0000:04:00.0: Failed configuring TC map 255 for VSI 390
Jan 3 18:01:58 ceph6 kernel: [ 739.789915] i40e 0000:04:00.0: failed to configure TCs for main VSI tc_map 0x000000ff, err I40E_ERR_INVALID_QP_ID aq_err I40E_AQ_RC_EINVAL
Jan 3 18:01:59 ceph6 kernel: [ 739.833189] divide error: 0000 [#1] SMP
Jan 3 18:01:59 ceph6 kernel: [ 739.833324] Modules linked in: i40e(OE+) vxlan ip6_udp_tunnel udp_tunnel intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper input_leds joydev sb_edac cryptd serio_raw edac_core ipmi_si mei_me 8250_fintek mei ipmi_msghandler shpchp ioatdma lpc_ich mac_hid autofs4 hid_generic usbhid hid psmouse isci igb ahci libsas libahci dca ptp scsi_transport_sas megaraid_sas pps_core i2c_algo_bit wmi fjes
Jan 3 18:01:59 ceph6 kernel: [ 739.835034] CPU: 0 PID: 2386 Comm: insmod Tainted: G OE 4.4.0-57-generic #78-Ubuntu
Jan 3 18:01:59 ceph6 kernel: [ 739.835306] Hardware name: Supermicro X9DRW/X9DRW, BIOS 3.0c 03/24/2014
Jan 3 18:01:59 ceph6 kernel: [ 739.835518] task: ffff880868b9f000 ti: ffff88046c1c0000 task.ti: ffff88046c1c0000
Jan 3 18:01:59 ceph6 kernel: [ 739.835754] RIP: 0010:[] [] i40e_pf_config_rss+0x1ef/0x230 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 739.836059] RSP: 0018:ffff88046c1c37a0 EFLAGS: 00010246
Jan 3 18:01:59 ceph6 kernel: [ 739.836227] RAX: 0000000000000000 RBX: ffff88086bd33c00 RCX: 0000000000000000
Jan 3 18:01:59 ceph6 kernel: [ 739.836452] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000200
Jan 3 18:01:59 ceph6 kernel: [ 739.836679] RBP: ffff88046c1c3808 R08: ffff88046fc1a120 R09: ffff88046f8032c0
Jan 3 18:01:59 ceph6 kernel: [ 739.836904] R10: ffff88086bd33c00 R11: 0000000000000000 R12: 0000000000000000
Jan 3 18:01:59 ceph6 kernel: [ 739.837130] R13: ffff88046da74008 R14: ffff88046c099000 R15: ffff88046da74000
Jan 3 18:01:59 ceph6 kernel: [ 739.837359] FS: 00007f5815768700(0000) GS:ffff88046fc00000(0000) knlGS:0000000000000000
Jan 3 18:01:59 ceph6 kernel: [ 739.837615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 3 18:01:59 ceph6 kernel: [ 739.837796] CR2: 00007fe8a4fcc13c CR3: 000000046a7f2000 CR4: 00000000000406f0
Jan 3 18:01:59 ceph6 kernel: [ 739.838022] Stack:
Jan 3 18:01:59 ceph6 kernel: [ 739.838085] 0000000000000005 00000000001c0ac0 00000000000e0000 ffff88046c1c37e8
Jan 3 18:01:59 ceph6 kernel: [ 739.838335] ffffffffc03b9e39 ffff88046da74f28 ffff88046da74008 00000000ffd84a52
Jan 3 18:01:59 ceph6 kernel: [ 739.847061] ffff88046da74000 0000000000000000 ffff88046da74008 0000000000000000
Jan 3 18:01:59 ceph6 kernel: [ 739.855800] Call Trace:
Jan 3 18:01:59 ceph6 kernel: [ 739.864529] [] ? i40e_write_rx_ctl+0x39/0x90 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 739.873487] [] i40e_setup_pf_switch+0x308/0x590 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 739.882566] [] i40e_probe.part.58+0xd50/0x1be0 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 739.891572] [] ? radix_tree_lookup+0xd/0x10
Jan 3 18:01:59 ceph6 kernel: [ 739.900540] [] ? irq_to_desc+0x17/0x20
Jan 3 18:01:59 ceph6 kernel: [ 739.909424] [] ? irq_get_irq_data+0xe/0x20
Jan 3 18:01:59 ceph6 kernel: [ 739.918278] [] ? mp_map_pin_to_irq+0xb5/0x300
Jan 3 18:01:59 ceph6 kernel: [ 739.927153] [] ? acpi_ut_remove_reference+0x2e/0x31
Jan 3 18:01:59 ceph6 kernel: [ 739.936072] [] ? __slab_free+0xcb/0x2c0
Jan 3 18:01:59 ceph6 kernel: [ 739.944972] [] ? mp_map_gsi_to_irq+0x98/0xc0
Jan 3 18:01:59 ceph6 kernel: [ 739.953757] [] ? acpi_register_gsi_ioapic+0xbe/0x180
Jan 3 18:01:59 ceph6 kernel: [ 739.962466] [] ? acpi_pci_irq_enable+0x1bf/0x1e4
Jan 3 18:01:59 ceph6 kernel: [ 739.971114] [] ? pci_conf1_read+0xb8/0xf0
Jan 3 18:01:59 ceph6 kernel: [ 739.979739] [] ? raw_pci_read+0x23/0x40
Jan 3 18:01:59 ceph6 kernel: [ 739.988340] [] ? pci_bus_read_config_word+0x9c/0xb0
Jan 3 18:01:59 ceph6 kernel: [ 739.996976] [] ? do_pci_enable_device+0xdd/0x110
Jan 3 18:01:59 ceph6 kernel: [ 740.005459] [] ? pci_enable_device_flags+0xe4/0x130
Jan 3 18:01:59 ceph6 kernel: [ 740.013867] [] i40e_probe+0x1e/0x30 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 740.022228] [] local_pci_probe+0x45/0xa0
Jan 3 18:01:59 ceph6 kernel: [ 740.030571] [] pci_device_probe+0x103/0x150
Jan 3 18:01:59 ceph6 kernel: [ 740.038806] [] driver_probe_device+0x222/0x4a0
Jan 3 18:01:59 ceph6 kernel: [ 740.046946] [] __driver_attach+0x84/0x90
Jan 3 18:01:59 ceph6 kernel: [ 740.055009] [] ? driver_probe_device+0x4a0/0x4a0
Jan 3 18:01:59 ceph6 kernel: [ 740.063118] [] bus_for_each_dev+0x6c/0xc0
Jan 3 18:01:59 ceph6 kernel: [ 740.071205] [] driver_attach+0x1e/0x20
Jan 3 18:01:59 ceph6 kernel: [ 740.079029] [] bus_add_driver+0x1eb/0x280
Jan 3 18:01:59 ceph6 kernel: [ 740.086629] [] ? 0xffffffffc01ea000
Jan 3 18:01:59 ceph6 kernel: [ 740.093977] [] driver_register+0x60/0xe0
Jan 3 18:01:59 ceph6 kernel: [ 740.101076] [] __pci_register_driver+0x4c/0x50
Jan 3 18:01:59 ceph6 kernel: [ 740.107997] [] i40e_init_module+0xa6/0x1000 [i40e]
Jan 3 18:01:59 ceph6 kernel: [ 740.114831] [] do_one_initcall+0xb3/0x200
Jan 3 18:01:59 ceph6 kernel: [ 740.121523] [] ? kmem_cache_alloc_trace+0x183/0x1f0
Jan 3 18:01:59 ceph6 kernel: [ 740.128183] [] do_init_module+0x5f/0x1cf
Jan 3 18:01:59 ceph6 kernel: [ 740.134706] [] load_module+0x166f/0x1c10
Jan 3 18:01:59 ceph6 kernel: [ 740.141064] [] ? __symbol_put+0x60/0x60
Jan 3 18:01:59 ceph6 kernel: [ 740.147352] [] ? kernel_read+0x50/0x80
Jan 3 18:01:59 ceph6 kernel: [ 740.153662] [<ffffffff8110b...
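For anyone comparing similar traces, here is a minimal Python sketch that pulls the two key facts out of a log like the one above: which PCI function reported i40e failures, and which function the fault landed in. The names `summarize`, `I40E_LINE`, `RIP_LINE` and `SAMPLE` are made up for illustration, and the regular expressions assume the syslog-style layout shown in this post; other kernels may format these lines differently.

```python
import re

# i40e device messages, with or without an interface name before the colon, e.g.
# "[ 739.789909] i40e 0000:04:00.0: AQ command ... failed = 14"
# "[ 289.006692] i40e 0000:04:00.0 eth2: NIC Link is Up ..."
I40E_LINE = re.compile(
    r"\[\s*\d+\.\d+\]\s+i40e\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
    r"(?:\s+\S+)?:\s+(?P<msg>.*)"
)

# Faulting instruction line, e.g. "RIP: 0010:[] [] i40e_pf_config_rss+0x1ef/0x230 [i40e]"
RIP_LINE = re.compile(r"RIP:.*?\s(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize(lines):
    """Return (failures per PCI function, name of the faulting function or None)."""
    failures = {}
    fault_func = None
    for line in lines:
        dev = I40E_LINE.search(line)
        if dev and "fail" in dev.group("msg").lower():
            failures.setdefault(dev.group("bdf"), []).append(dev.group("msg"))
        rip = RIP_LINE.search(line)
        if rip and fault_func is None:
            fault_func = rip.group("func")
    return failures, fault_func


# Lines copied from the log in this post.
SAMPLE = [
    "Jan 3 18:01:58 ceph6 kernel: [ 739.765165] i40e 0000:04:00.0: MAC address: 3c:fd:fe:a2:19:54",
    "Jan 3 18:01:58 ceph6 kernel: [ 739.789909] i40e 0000:04:00.0: AQ command Config VSI BW allocation per TC failed = 14",
    "Jan 3 18:01:58 ceph6 kernel: [ 739.789912] i40e 0000:04:00.0: Failed configuring TC map 255 for VSI 390",
    "Jan 3 18:01:59 ceph6 kernel: [ 739.835754] RIP: 0010:[] [] i40e_pf_config_rss+0x1ef/0x230 [i40e]",
]
```

Calling `summarize(SAMPLE)` groups the two TC configuration failures under `0000:04:00.0` and reports `i40e_pf_config_rss` as the faulting function, which matches the order of events in the trace: the TC setup fails first, and the divide error follows during RSS configuration.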
I've discovered that the crash above occurs only when the physical link is active while the driver is loading. When the interfaces on the switch are disabled, the driver loads successfully and both eth* interfaces are present. Some errors occur when I then enable the switch interfaces with the driver already loaded:
Jan 4 14:50:34 ceph6 kernel: [ 169.326507] i40e 0000:04:00.1: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EINVAL
Jan 4 14:50:34 ceph6 kernel: [ 169.326515] i40e 0000:04:00.1: Failed configuring TC for VEB seid=161
Jan 4 14:50:34 ceph6 kernel: [ 169.327690] i40e 0000:04:00.1: AQ command Config VSI BW allocation per TC failed = 14
Jan 4 14:50:34 ceph6 kernel: [ 169.327697] i40e 0000:04:00.1: Failed configuring TC map 255 for VSI 391
Jan 4 14:50:34 ceph6 kernel: [ 169.327701] i40e 0000:04:00.1: Failed configuring TC for VSI seid=391
Jan 4 14:50:48 ceph6 kernel: [ 184.011042] i40e 0000:04:00.0: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EINVAL
Jan 4 14:50:48 ceph6 kernel: [ 184.011050] i40e 0000:04:00.0: Failed configuring TC for VEB seid=160
Jan 4 14:50:48 ceph6 kernel: [ 184.013100] i40e 0000:04:00.0: AQ command Config VSI BW allocation per TC failed = 14
Jan 4 14:50:48 ceph6 kernel: [ 184.013107] i40e 0000:04:00.0: Failed configuring TC map 255 for VSI 390
Jan 4 14:50:48 ceph6 kernel: [ 184.013110] i40e 0000:04:00.0: Failed configuring TC for VSI seid=390
Jan 4 14:52:33 ceph6 kernel: [ 289.006692] i40e 0000:04:00.0 eth2: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
Jan 4 14:52:55 ceph6 kernel: [ 310.487711] i40e 0000:04:00.1 eth3: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
Beyond that, I set up a 20 Gbps (2 x 10 Gbps) LACP bond, but the TX rate does not exceed 10 Mbps. RX is fine, at over 5 Gbps. IRQs are balanced across the cores.
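A rough way to reproduce numbers like these is to sample the kernel's per-interface byte counters twice and convert the difference to a rate. The sketch below assumes a Linux host that exposes `/sys/class/net/<iface>/statistics/tx_bytes` and `rx_bytes`; the function names are made up for illustration, and `eth2`/`eth3` are simply the interface names that appear in the log above.

```python
import time
from pathlib import Path


def read_counter(iface, counter, root="/sys/class/net"):
    """Read one statistics counter (e.g. tx_bytes) for a network interface."""
    return int((Path(root) / iface / "statistics" / counter).read_text())


def rate_mbps(bytes_before, bytes_after, seconds):
    """Convert a byte-counter delta over an interval to megabits per second."""
    return (bytes_after - bytes_before) * 8 / seconds / 1e6


def sample_rates(iface, interval=1.0):
    """Sample TX and RX byte counters twice and return both rates in Mbps."""
    counters = ("tx_bytes", "rx_bytes")
    before = {c: read_counter(iface, c) for c in counters}
    time.sleep(interval)
    after = {c: read_counter(iface, c) for c in counters}
    return {c: rate_mbps(before[c], after[c], interval) for c in counters}
```

Running `sample_rates("eth2")` and `sample_rates("eth3")` while traffic is flowing shows the rate on each bond member separately, which helps tell whether the low TX rate affects both ports or only one.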
Hi Domel,
Thank you for the post. I can see that the module you used is the GBC Photonics SP-MM85030D-GP -SFP+, which is not a supported model. Please refer to the URL below for the list of validated modules:
http://www.intel.com/content/www/us/en/support/network-and-i-o/ethernet-products/000007045.html
We recommend using a validated and supported fiber module with X710 series network adapters. Could you test with one and let us know the result? Thanks.
Rgds,
wb
Hi Domel,
Please feel free to update if you have tested with a supported fiber module.
rgds,
wb
Hi wb,
Thanks for your advice. We need some more time to check that; I'll keep you informed.
Thanks,
Dominik
Hi Dominik,
Thank you for the reply. I will wait for your update and hope to hear good news from you.
rgds,
wb
Hi Dominik,
Any update? Please feel free to provide the result.
Thanks,
wb
Hi,
In the end we replaced the X710 with an X520-DA2, and it works fine. I don't know what was wrong with the X710 adapters. On two or three occasions they ran at full speed, but after a reboot the problems appeared again.
Thanks,
Dominik
Hi Dominik,
Thank you for the update. Are you saying X520-DA2 has the same issue after reboot?
rgds,
wb
No, with the X520-DA2 everything works correctly. Sorry for being imprecise.
Hi Dominik,
No worries, and thank you for the clarification. Just to double-check: does the same issue occur on the X710-DA2 when you use a validated fiber module? And if you are going to use the X520 NIC instead, do you need any further assistance?
Please feel free to update me.
Thanks,
wb
Hi,
I don't have any module from the supported module list, so I couldn't check that. I will not use the X710 for now, and I don't know whether I will in the future. If I do, I'll let you know. For now you can close this topic.
Thanks!
Dominik
Hi Dominik,
Thank you for the update :) Please feel free to contact us if you have other inquiries.
Rgds,
wb