
AMD-Vi: Completion-Wait loop timed out & NETDEV WATCHDOG: eth1 (i40e): transmit queue 3 timed out

Yanzi
Beginner

The kernel log from my client's system is shown below. The system is running again after a reboot.

I have been told that the IOMMU reports "AMD-Vi: Completion-Wait loop timed out" when an I/O device does not respond. In my case the device is an Intel X710 2x10GbE SFP+ adapter, running on Debian 11 with the inbox driver for the Intel X710.

The failing system had LACP bonding. 

Based on https://cdrdv2-public.intel.com/337361/lacp-config-guide.pdf, it seems I should try disabling LLDP, because "LACP may not function correctly in certain environments that require LLDP frames containing LCAP information to be forwarded to the network stack.".

The failing system first logged "FW LLDP is enabled". Later, lldpd reported "i40e driver detected for eth1, disabling LLDP in firmware", but after that the kernel logged "FW LLDP is enabled" again, so I do not know whether LLDP was actually disabled. The client system is running an important task now, so I cannot test this, because a reboot is needed after disabling LLDP.
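One way to check whether firmware LLDP is actually disabled might be to read the driver's private flags. The following is a minimal sketch, assuming the i40e driver on this system exposes a `disable-fw-lldp` private flag through `ethtool --show-priv-flags` (this depends on the driver and firmware version). The sample text is illustrative only and was not captured from the failing system; the function names are hypothetical.

```python
import re


def parse_priv_flags(text):
    """Parse `ethtool --show-priv-flags`-style output into {flag_name: bool}."""
    flags = {}
    for line in text.splitlines():
        # Flag lines look like "disable-fw-lldp       : off".
        match = re.match(r"^\s*([\w-]+)\s*:\s*(on|off)\s*$", line)
        if match:
            flags[match.group(1)] = match.group(2) == "on"
    return flags


def fw_lldp_disabled(text):
    """Return True/False for the disable-fw-lldp flag, or None if it is absent."""
    return parse_priv_flags(text).get("disable-fw-lldp")


# Illustrative sample only (assumed format, not output from the failing system).
SAMPLE = """Private flags for eth1:
MFP                   : off
link-down-on-close    : off
disable-fw-lldp       : off
"""

if __name__ == "__main__":
    print(fw_lldp_disabled(SAMPLE))
```

On a live system the text would come from running `ethtool --show-priv-flags eth1`. A result of `None` would mean the flag is not reported at all, in which case this check does not apply.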

I saw a similar case (linked below), but for some reason I cannot post a reply there to share my information. I hope AndriiV sees this, tries disabling LLDP to check whether it solves the problem, and lets me know the test result.

Intel X710-4 NETDEV WATCHDOG: eth5 (i40e): transmit queue 4 timed out - Intel Communities

 

Please share your experience with this kind of issue. Thanks!

 

  • Feb 12 11:25:46 music-proxy42.hz.163.org kernel: i40e 0000:41:00.0: FW LLDP is enabled
  • ......
  • Feb 12 11:27:59 music-proxy42.hz.163.org lldpd[2255]: i40e driver detected for eth1, disabling LLDP in firmware
  • ......
  • Feb 12 17:29:35 music-proxy42 kernel: [    3.233365] i40e 0000:41:00.0: FW LLDP is enabled
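To make the order of the LLDP state messages easier to follow, the three lines above can be pulled out of a larger log as a timeline. This is a minimal sketch; the event labels and the function name are hypothetical, and it only matches the two message texts that appear in this post.

```python
import re

# Message fragments seen in this post, mapped to hypothetical labels.
EVENTS = {
    "FW LLDP is enabled": "fw_lldp_enabled",
    "disabling LLDP in firmware": "lldpd_disables_fw_lldp",
}

# Syslog-style timestamp such as "Feb 12 11:25:46".
STAMP = re.compile(r"([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})")


def lldp_events(log_text):
    """Return (timestamp, label) pairs for LLDP-related lines, in log order."""
    events = []
    for line in log_text.splitlines():
        stamp = STAMP.search(line)
        if not stamp:
            continue
        for needle, label in EVENTS.items():
            if needle in line:
                events.append((stamp.group(1), label))
    return events


LOG = """\
Feb 12 11:25:46 music-proxy42.hz.163.org kernel: i40e 0000:41:00.0: FW LLDP is enabled
Feb 12 11:27:59 music-proxy42.hz.163.org lldpd[2255]: i40e driver detected for eth1, disabling LLDP in firmware
Feb 12 17:29:35 music-proxy42 kernel: [    3.233365] i40e 0000:41:00.0: FW LLDP is enabled
"""

if __name__ == "__main__":
    for stamp, label in lldp_events(LOG):
        print(stamp, label)
```

The last entry carries a kernel uptime of about 3 seconds, so it appears to have been logged during a boot rather than while the system was already running.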

 

 

Feb 12 10:29:51 music-proxy42 kernel: [37135193.772663] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135193.911042] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.048362] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.054867] ------------[ cut here ]------------
Feb 12 10:29:51 music-proxy42 kernel: [37135194.054870] NETDEV WATCHDOG: eth1 (i40e): transmit queue 3 timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.054912] WARNING: CPU: 50 PID: 265 at net/sched/sch_generic.c:467 dev_watchdog+0x24d/0x260
Feb 12 10:29:51 music-proxy42 kernel: [37135194.054913] Modules linked in: dm_mod sctp_diag raw_diag unix_diag af_packet_diag netlink_diag binfmt_misc udp_diag joydev hid_generic sctp tcp_diag inet_diag msr ip6t_REJECT nf_reject_ipv6 ip6_tables nf_log_ipv6 nft_chain_nat xt_recent xt_comment ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_hashlimit xt_tcpudp xt_CT xt_multiport nft_counter xt_conntrack nft_compat nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_nat nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink bonding amd64_edac_mod edac_mce_amd amd_energy kvm_amd kvm irqbypass ghash_clmulni_intel aesni_intel libaes crypto_simd cryptd glue_helper rapl
Feb 12 10:29:51 music-proxy42 kernel: [37135194.054985] pcspkr mgag200 drm_kms_helper cec i2c_algo_bit ccp sg sp5100_tco rng_core k10temp watchdog evdev ipmi_ssif tcp_bbr sch_fq dummy acpi_cpufreq button ipmi_si ipmi_devintf ipmi_msghandler psmouse usbhid hid drm fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic sd_mod ses t10_pi enclosure crc_t10dif scsi_transport_sas crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ahci xhci_pci libahci xhci_hcd libata i40e megaraid_sas usbcore ptp scsi_mod usb_common i2c_piix4 pps_core
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055035] CPU: 50 PID: 265 Comm: ksoftirqd/50 Not tainted 5.10.0-9-amd64 #1 Debian 5.10.70-1
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055036] Hardware name: Lenovo ThinkSystem SR665/7D2VCTO1WW, BIOS D8E122K-2.20 08/06/2021
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055038] RIP: 0010:dev_watchdog+0x24d/0x260
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055042] Code: a9 ca fd ff eb a9 4c 89 f7 c6 05 34 82 10 01 01 e8 58 9d fa ff 44 89 e9 4c 89 f6 48 c7 c7 98 9d f6 95 48 89 c2 e8 46 2f 14 00 <0f> 0b eb 8a 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055043] RSP: 0018:ffffb43d4d677da8 EFLAGS: 00010286
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055044] RAX: 0000000000000000 RBX: ffff89f068ff9ec0 RCX: ffff8a2f0f098a08
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055045] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff8a2f0f098a00
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055046] RBP: ffff89f0524173dc R08: 0000000000000000 R09: ffffb43d4d677bc8
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055047] R10: ffffb43d4d677bc0 R11: ffff8a2f0eb4c480 R12: ffff89f052417480
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055048] R13: 0000000000000003 R14: ffff89f052417000 R15: ffff89f068ff9f40
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055049] FS: 0000000000000000(0000) GS:ffff8a2f0f080000(0000) knlGS:0000000000000000
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055050] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055050] CR2: 000055e095a5d1c8 CR3: 0000002571502000 CR4: 0000000000350ee0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055052] Call Trace:
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055059] ? pfifo_fast_enqueue+0x150/0x150
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055063] call_timer_fn+0x29/0xf0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055065] __run_timers.part.0+0x1d3/0x240
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055069] ? __switch_to_asm+0x42/0x70
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055072] ? __switch_to+0x114/0x460
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055073] run_timer_softirq+0x26/0x50
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055078] __do_softirq+0xc5/0x275
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055082] run_ksoftirqd+0x26/0x40
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055085] smpboot_thread_fn+0xc5/0x160
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055087] ? smpboot_register_percpu_thread+0xf0/0xf0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055090] kthread+0x11b/0x140
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055092] ? __kthread_bind_mask+0x60/0x60
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055093] ret_from_fork+0x22/0x30
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055095] ---[ end trace 31abd286bf6dc59c ]---
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055105] i40e 0000:41:00.1 eth1: tx_timeout: VSI_seid: 391, Q 3, NTC: 0x16b, HWB: 0x1fc, NTU: 0x1fc, TAIL: 0x1fc, INT: 0x0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055106] i40e 0000:41:00.1 eth1: tx_timeout recovery level 1, txqueue 3
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055153] i40e 0000:41:00.0 eth0: tx_timeout: VSI_seid: 390, Q 9, NTC: 0x1df, HWB: 0xd7, NTU: 0xd7, TAIL: 0xd7, INT: 0x0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055154] i40e 0000:41:00.0 eth0: tx_timeout recovery level 1, txqueue 9
Feb 12 10:29:51 music-proxy42 kernel: [37135194.185688] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.324049] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.461143] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.598594] AMD-Vi: Completion-Wait loop timed out
Feb 12 10:29:51 music-proxy42 kernel: [37135194.736618] AMD-Vi: Completion-Wait loop timed out
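
The `tx_timeout` lines in the log above carry the ring indices for each stalled queue, and extracting them makes the two queues easier to compare. The sketch below assumes, from the field names only, that NTC and NTU are the driver's next-to-clean and next-to-use indices and that HWB is the head value written back by the hardware; that reading is not confirmed anywhere in this thread. The function name and the two derived keys are hypothetical.

```python
import re

TX_TIMEOUT = re.compile(
    r"i40e (?P<pci>[0-9a-f:.]+) (?P<dev>\S+): tx_timeout: "
    r"VSI_seid: (?P<vsi>\d+), Q (?P<queue>\d+), "
    r"NTC: (?P<ntc>0x[0-9a-f]+), HWB: (?P<hwb>0x[0-9a-f]+), "
    r"NTU: (?P<ntu>0x[0-9a-f]+), TAIL: (?P<tail>0x[0-9a-f]+), "
    r"INT: (?P<intr>0x[0-9a-f]+)"
)


def parse_tx_timeouts(log_text):
    """Extract one dict per i40e tx_timeout line, with ring indices as ints."""
    rows = []
    for m in TX_TIMEOUT.finditer(log_text):
        row = {
            "pci": m["pci"],
            "dev": m["dev"],
            "vsi": int(m["vsi"]),
            "queue": int(m["queue"]),
        }
        for key in ("ntc", "hwb", "ntu", "tail", "intr"):
            row[key] = int(m[key], 16)
        # Derived flags (assumed interpretation of the field names).
        row["hw_caught_up"] = row["hwb"] == row["ntu"]
        row["cleanup_lagging"] = row["ntc"] != row["hwb"]
        rows.append(row)
    return rows


LOG = """\
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055105] i40e 0000:41:00.1 eth1: tx_timeout: VSI_seid: 391, Q 3, NTC: 0x16b, HWB: 0x1fc, NTU: 0x1fc, TAIL: 0x1fc, INT: 0x0
Feb 12 10:29:51 music-proxy42 kernel: [37135194.055153] i40e 0000:41:00.0 eth0: tx_timeout: VSI_seid: 390, Q 9, NTC: 0x1df, HWB: 0xd7, NTU: 0xd7, TAIL: 0xd7, INT: 0x0
"""

if __name__ == "__main__":
    for row in parse_tx_timeouts(LOG):
        print(row["dev"], row["queue"], row["hw_caught_up"], row["cleanup_lagging"])
```

In both lines HWB equals NTU and TAIL while NTC differs. If the field names mean what they appear to, the hardware had reported every submitted descriptor as done while the driver's cleanup had not yet run for those queues.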

Irwan_Intel
Moderator

Hi Yanzi,


Thank you for posting in Intel Ethernet Communities.


Could you please share your Debian 11 version and kernel version? AndriiV was using Debian 10.13 with kernel 4.19.0-0.bpo.19-amd64, which might differ somewhat from your system. I will also keep you posted once we have AndriiV's test result.


You could also refer to the following document link to disable LLDP when using LACP.

https://www.intel.com/content/www/us/en/content-details/337361/lacp-configuration-guide-using-intel-ethernet-500-and-700-series-network-adapters-and-various-server-operating-systems-technical-brief.html?wapkw=lacp%20guide


Looking forward to your reply.


Best Regards,


Irwan_Intel

Intel Customer Support


Irwan_Intel
Moderator

Hi Yanzi,


We have not heard from you since our last communication, and we would like to know if you have any questions or need further assistance regarding your case. Unfortunately, we have not yet heard back from AndriiV with his test result.


You may also refer to the link shared previously. Thanks!


https://www.intel.com/content/www/us/en/content-details/337361/lacp-configuration-guide-using-intel-ethernet-500-and-700-series-network-adapters-and-various-server-operating-systems-technical-brief.html?wapkw=lacp%20guide


Feel free to let us know if you have questions or need clarification.


Regards,


Irwan_Intel

Intel Customer Support



Irwan_Intel
Moderator

Hi Yanzi,

 

Good day and I hope this message finds you well!

 

Please be informed that we will now close this request, since we have not received any response to our previous follow-ups. Feel free to post a new question if you have any other inquiry in the future, as this thread will no longer be monitored.

 

Best Regards,

  

Irwan_Intel

Intel Customer Support

