Please, can you help us?
Details of kernel panic:
[ 796.119373] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1H:29293]
[ 796.127078] Modules linked in: echainiv esp4 xt_addrtype ip_set_hash_net xt_NFLOG xt_devgroup xt_hashlimit xt_CT xt_REDIRECT xt_multiport xfrm_interface twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common ip6table_nat ip6table_mangle ip6table_raw ip6table_filter ip6_tables xt_MASQUERADE xt_conntrack xt_set ip_set_hash_ip ip_set serpent_sse2_x86_64 serpent_generic cast5_generic cast_common xt_connmark xt_mark xt_connlabel iptable_nat iptable_mangle iptable_raw des_generic libdes crypto_user camellia_generic camellia_x86_64 xcbc md4 iptable_filter nf_nat_ftp nf_conntrack_ftp nf_nat_sip nf_conntrack_sip nf_nat_tftp nf_conntrack_tftp nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_nat ip_gre ip_tunnel gre tun macvlan qat_api(O) usdm_drv(O) nfnetlink_log nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nfnetlink sunrpc sha512_ssse3 sha512_generic qat_c3xxx(O) pnd2_edac intel_qat(O) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
[ 796.127123] irqbypass iTCO_wdt rapl iTCO_vendor_support intel_cstate uio pcspkr i2c_i801 pinctrl_denverton authenc pinctrl_intel tpm_infineon i2c_ismt acpi_cpufreq tcp_htcp ip_tables ext4 mbcache jbd2 dm_crypt blowfish_generic blowfish_x86_64 blowfish_common mmc_block crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd sdhci_pci cqhci sdhci ixgbe(O) mmc_core igb(O) dca ptp pps_core ahci libahci libata dm_mirror dm_region_hash dm_log dm_mod fuse
[ 796.127154] CPU: 2 PID: 29293 Comm: kworker/2:1H Tainted: G O L 5.4.113-1.el7.elrepo.x86_64 #1
[ 796.127155] Hardware name: Silicom 90500-0151-G01/90500-0151-G01, BIOS MADRID-01.00.18.06 06/01/2020
[ 796.127174] Workqueue: adf_pf_resp_wq_0 adf_response_handler_wq [intel_qat]
[ 796.127180] RIP: 0010:native_queued_spin_lock_slowpath+0x60/0x1d0
[ 796.127183] Code: 6e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 48 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 81 fe 00 01
[ 796.127185] RSP: 0018:ffffbaa3c00eca90 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[ 796.127188] RAX: 0000000000000101 RBX: 0000000000000000 RCX: 0000000000000007
[ 796.127189] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9157d9ae02fc
[ 796.127191] RBP: ffffbaa3c00eca90 R08: 0000000000000032 R09: ffff9157d9ae02c0
[ 796.127193] R10: 0000000000000002 R11: 0000000000000032 R12: 0000000000000002
[ 796.127194] R13: ffff9157e3443400 R14: 0000000000000000 R15: ffff9157d9ae02fc
[ 796.127197] FS: 0000000000000000(0000) GS:ffff915837b00000(0000) knlGS:0000000000000000
[ 796.127199] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 796.127200] CR2: 00007f1958e5d0a0 CR3: 000000019c20a000 CR4: 00000000003406e0
[ 796.127202] Call Trace:
[ 796.127204] <IRQ>
[ 796.127208] _raw_spin_lock+0x1e/0x30
[ 796.127211] xfrm_input+0x1b0/0xa00
[ 796.127215] xfrm4_rcv+0x3b/0x40
[ 796.127218] xfrm4_esp_rcv+0x39/0x50
[ 796.127222] ip_protocol_deliver_rcu+0x1a6/0x1b0
[ 796.127226] ip_local_deliver_finish+0x48/0x50
[ 796.127229] ip_local_deliver+0xe5/0xf0
[ 796.127233] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
[ 796.127236] ip_sublist_rcv_finish+0x5e/0x70
[ 796.127240] ip_sublist_rcv+0x219/0x2b0
[ 796.127244] ? ip_rcv_finish_core.isra.0+0x3c0/0x3c0
[ 796.127248] ip_list_rcv+0x134/0x160
[ 796.127252] __netif_receive_skb_list_core+0x28d/0x2b0
[ 796.127256] netif_receive_skb_list_internal+0x1d5/0x300
[ 796.127271] ? ixgbe_clean_rx_irq+0x2cd/0xbb0 [ixgbe]
[ 796.127275] gro_normal_list.part.0+0x1e/0x40
[ 796.127278] napi_complete_done+0x91/0x140
[ 796.127293] ixgbe_poll+0x413/0x650 [ixgbe]
[ 796.127297] net_rx_action+0x147/0x3b0
[ 796.127300] __do_softirq+0xe1/0x2d6
[ 796.127304] irq_exit+0xe5/0xf0
[ 796.127308] do_IRQ+0x5a/0xf0
[ 796.127311] common_interrupt+0xf/0xf
[ 796.127313] </IRQ>
[ 796.127317] RIP: 0010:netlink_has_listeners+0xc/0x60
[ 796.127319] Code: 41 bc ea ff ff ff e9 b0 fe ff ff 31 f6 eb 9f 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 8b 87 fc 02 00 00 <f7> d0 48 89 e5 83 e0 01 75 3d 0f b6 97 11 02 00 00 48 8d 0c 52 48
[ 796.127321] RSP: 0018:ffffbaa3c238fcf8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffffd3
[ 796.127324] RAX: 0000000000000001 RBX: 00000000ffffffff RCX: 0000000000000001
[ 796.127325] RDX: 000000000001ffff RSI: 0000000000000005 RDI: ffff9158313c3800
[ 796.127327] RBP: ffffbaa3c238fd10 R08: 0000000000000000 R09: ffff9157d9ae02c0
[ 796.127329] R10: 0000000000000000 R11: 0000000000007e04 R12: ffff9157d9ae02c0
[ 796.127331] R13: ffff9157e3443b00 R14: 0000000000000000 R15: ffff9157d9ae02fc
[ 796.127336] ? xfrm_replay_advance+0x52/0xc0
[ 796.127339] xfrm_input+0x559/0xa00
[ 796.127342] xfrm_input_resume+0x15/0x20
[ 796.127346] esp_input_done+0x21/0x30 [esp4]
[ 796.127366] qat_aead_alg_callback+0x9b/0xb0 [intel_qat]
[ 796.127386] qat_alg_callback+0x22/0x30 [intel_qat]
[ 796.127403] adf_handle_response+0x4b/0xd0 [intel_qat]
[ 796.127421] adf_response_handler_wq+0x84/0xe0 [intel_qat]
[ 796.127424] process_one_work+0x1b5/0x370
[ 796.127428] worker_thread+0x50/0x3d0
[ 796.127432] kthread+0x106/0x140
[ 796.127434] ? process_one_work+0x370/0x370
[ 796.127437] ? kthread_park+0x90/0x90
[ 796.127441] ret_from_fork+0x35/0x40
[ 802.714196] rcu: INFO: rcu_sched self-detected stall on CPU
[ 802.714203] rcu: 2-....: (1 GPs behind) idle=be2/1/0x4000000000000004 softirq=144695/144696 fqs=14965
[ 802.714206] (t=60000 jiffies g=359009 q=147256)
[ 802.714209] NMI backtrace for cpu 2
[ 802.714213] CPU: 2 PID: 29293 Comm: kworker/2:1H Tainted: G O L 5.4.113-1.el7.elrepo.x86_64 #1
[ 802.714215] Hardware name: Silicom 90500-0151-G01/90500-0151-G01, BIOS MADRID-01.00.18.06 06/01/2020
[ 802.714235] Workqueue: adf_pf_resp_wq_0 adf_response_handler_wq [intel_qat]
[ 802.714238] Call Trace:
[ 802.714240] <IRQ>
[ 802.714245] dump_stack+0x6d/0x8b
[ 802.714249] ? lapic_can_unplug_cpu+0x80/0x80
[ 802.714253] nmi_cpu_backtrace.cold+0x14/0x53
[ 802.714258] nmi_trigger_cpumask_backtrace+0xd9/0xe0
[ 802.714262] arch_trigger_cpumask_backtrace+0x19/0x20
[ 802.714266] rcu_dump_cpu_stacks+0x9c/0xce
[ 802.714270] rcu_sched_clock_irq.cold+0x1dc/0x3c4
[ 802.714276] update_process_times+0x2c/0x60
[ 802.714281] tick_sched_handle+0x29/0x60
[ 802.714284] tick_sched_timer+0x3d/0x80
[ 802.714287] __hrtimer_run_queues+0xf7/0x270
[ 802.714291] ? tick_sched_do_timer+0x70/0x70
[ 802.714294] hrtimer_interrupt+0x109/0x220
[ 802.714298] smp_apic_timer_interrupt+0x71/0x140
[ 802.714303] apic_timer_interrupt+0xf/0x20
[ 802.714308] RIP: 0010:native_queued_spin_lock_slowpath+0x60/0x1d0
[ 802.714311] Code: 6e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 48 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 81 fe 00 01
[ 802.714313] RSP: 0018:ffffbaa3c00eca90 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[ 802.714316] RAX: 0000000000000101 RBX: 0000000000000000 RCX: 0000000000000007
[ 802.714317] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9157d9ae02fc
[ 802.714319] RBP: ffffbaa3c00eca90 R08: 0000000000000032 R09: ffff9157d9ae02c0
[ 802.714321] R10: 0000000000000002 R11: 0000000000000032 R12: 0000000000000002
[ 802.714323] R13: ffff9157e3443400 R14: 0000000000000000 R15: ffff9157d9ae02fc
[ 802.714327] ? apic_timer_interrupt+0xa/0x20
[ 802.714332] _raw_spin_lock+0x1e/0x30
[ 802.714335] xfrm_input+0x1b0/0xa00
[ 802.714340] xfrm4_rcv+0x3b/0x40
[ 802.714344] xfrm4_esp_rcv+0x39/0x50
[ 802.714348] ip_protocol_deliver_rcu+0x1a6/0x1b0
[ 802.714352] ip_local_deliver_finish+0x48/0x50
[ 802.714355] ip_local_deliver+0xe5/0xf0
[ 802.714359] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
[ 802.714363] ip_sublist_rcv_finish+0x5e/0x70
[ 802.714367] ip_sublist_rcv+0x219/0x2b0
[ 802.714372] ? ip_rcv_finish_core.isra.0+0x3c0/0x3c0
[ 802.714376] ip_list_rcv+0x134/0x160
[ 802.714380] __netif_receive_skb_list_core+0x28d/0x2b0
[ 802.714384] netif_receive_skb_list_internal+0x1d5/0x300
[ 802.714400] ? ixgbe_clean_rx_irq+0x2cd/0xbb0 [ixgbe]
[ 802.714405] gro_normal_list.part.0+0x1e/0x40
[ 802.714408] napi_complete_done+0x91/0x140
[ 802.714424] ixgbe_poll+0x413/0x650 [ixgbe]
[ 802.714428] net_rx_action+0x147/0x3b0
[ 802.714432] __do_softirq+0xe1/0x2d6
[ 802.714436] irq_exit+0xe5/0xf0
[ 802.714441] do_IRQ+0x5a/0xf0
[ 802.714444] common_interrupt+0xf/0xf
[ 802.714446] </IRQ>
[ 802.714450] RIP: 0010:netlink_has_listeners+0xc/0x60
[ 802.714453] Code: 41 bc ea ff ff ff e9 b0 fe ff ff 31 f6 eb 9f 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 8b 87 fc 02 00 00 <f7> d0 48 89 e5 83 e0 01 75 3d 0f b6 97 11 02 00 00 48 8d 0c 52 48
[ 802.714455] RSP: 0018:ffffbaa3c238fcf8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffffd3
[ 802.714457] RAX: 0000000000000001 RBX: 00000000ffffffff RCX: 0000000000000001
[ 802.714459] RDX: 000000000001ffff RSI: 0000000000000005 RDI: ffff9158313c3800
[ 802.714461] RBP: ffffbaa3c238fd10 R08: 0000000000000000 R09: ffff9157d9ae02c0
[ 802.714463] R10: 0000000000000000 R11: 0000000000007e04 R12: ffff9157d9ae02c0
[ 802.714465] R13: ffff9157e3443b00 R14: 0000000000000000 R15: ffff9157d9ae02fc
[ 802.714470] ? xfrm_replay_advance+0x52/0xc0
[ 802.714473] xfrm_input+0x559/0xa00
[ 802.714477] xfrm_input_resume+0x15/0x20
[ 802.714482] esp_input_done+0x21/0x30 [esp4]
[ 802.714503] qat_aead_alg_callback+0x9b/0xb0 [intel_qat]
[ 802.714524] qat_alg_callback+0x22/0x30 [intel_qat]
[ 802.714543] adf_handle_response+0x4b/0xd0 [intel_qat]
[ 802.714563] adf_response_handler_wq+0x84/0xe0 [intel_qat]
[ 802.714567] process_one_work+0x1b5/0x370
[ 802.714570] worker_thread+0x50/0x3d0
[ 802.714574] kthread+0x106/0x140
[ 802.714577] ? process_one_work+0x370/0x370
[ 802.714580] ? kthread_park+0x90/0x90
[ 802.714584] ret_from_fork+0x35/0x40
[ 803.916248] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 2-... } 60364 jiffies s: 1097 root: 0x4/.
[ 803.916261] rcu: blocking rcu_node structures:
[ 803.916265] Task dump for CPU 2:
[ 803.916269] kworker/2:1H R running task 0 29293 2 0x80004088
[ 803.916315] Workqueue: adf_pf_resp_wq_0 adf_response_handler_wq [intel_qat]
[ 803.916319] Call Trace:
[ 803.916354] ? adf_handle_response+0x4b/0xd0 [intel_qat]
[ 803.916383] ? adf_response_handler_wq+0x84/0xe0 [intel_qat]
[ 803.916392] ? process_one_work+0x1b5/0x370
[ 803.916397] ? worker_thread+0x50/0x3d0
[ 803.916404] ? kthread+0x106/0x140
[ 803.916408] ? process_one_work+0x370/0x370
[ 803.916412] ? kthread_park+0x90/0x90
[ 803.916420] ? ret_from_fork+0x35/0x40
Hello Lucas,
Thanks for reporting the latest update regarding this issue.
Based on the scenarios you reported, this is definitely an issue with the environment in which you are running QAT. With that in mind, is it possible for you to test a regular Linux distribution, such as CentOS 7* or Red Hat* downloaded from the official website? I am trying to rule out any customization done by the ELRepo Project.
Also, are you aware of any particular difference between having pcrypt not enabled vs. blacklisted?
Regards,
Ronny G
Hello Lucas,
Do you have any update?
Did you have a chance to test an official CentOS 7* or Red Hat* release downloaded from the official website?
Regards,
Ronny G
Dear Ronny,
Not yet. We will recompile the 5.4.113 kernel with pcrypt disabled in its ".config".
However, due to some internal issues, this will take some time; I believe we can only do it in July.
Best regards,
Lucas Pereira
Hi Lucas,
Thanks for the update. If more testing will only be possible in the July time frame, can we close this case and then reopen it, or create a new one, whenever you have the opportunity for more debugging?
Regards,
Ronny G
Hi Lucas,
Thanks for replying. I am going to close this case for now, and we will resume debugging in the July time frame if needed.
Regards,
Ronny G