Intel® QuickAssist Technology (Intel® QAT)

IPSEC Tunnel XFRM VTI + QAT LKCF rcu_sched stall

SmithConnect
Beginner

Greetings, 

 

I am experiencing an issue when using QAT in conjunction with IKEv2 IPsec site-to-site tunnels built with strongSwan. The issue is reproducible after a large amount of traffic is sent over the tunnel, typically after running a test such as "iperf -c x.x.x.x -P 25 -t 60". The system ultimately begins reporting a soft lockup on a CPU and must be restarted to restore functionality.

I can confirm this issue only occurs when intel_qat is loaded on the system, as disabling the module removes the issue entirely (along with the speed benefits of using QAT).
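For anyone trying to reproduce the comparison, one way to take QAT out of the picture for an A/B run (module names per the lsmod output further down; not necessarily the exact steps I used, and the blacklist file name is just an example) is:

# Bring the tunnels down first, then unload the QAT VF and core modules
modprobe -r qat_c62xvf intel_qat
# Or keep the modules from loading at boot:
printf 'blacklist qat_c62xvf\nblacklist intel_qat\n' > /etc/modprobe.d/disable-qat.conf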

 

(attached screenshot: lockup.png)

 

 

 

Jan 25 23:41:26 debian kernel: [13816.626750] vmxnet3 0000:13:00.0 eth1: NETDEV WATCHDOG: CPU: 1: transmit queue 2 timed out 12347449 ms
Jan 25 23:41:26 debian kernel: [13816.629901] vmxnet3 0000:13:00.0 eth1: tx hang
Jan 25 23:41:26 debian kernel: [13816.632882] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 12343656 jiffies s: 545 root: 0x1/.
Jan 25 23:41:26 debian kernel: [13816.635987] rcu: blocking rcu_node structures (internal RCU debug):
Jan 25 23:41:26 debian kernel: [13816.639006] Sending NMI from CPU 1 to CPUs 0:
Jan 25 23:41:26 debian kernel: [13816.639036] NMI backtrace for cpu 0
Jan 25 23:41:26 debian kernel: [13816.639040] CPU: 0 PID: 56 Comm: kworker/0:1H Tainted: G           O L     6.6.69-vyos #1
Jan 25 23:41:26 debian kernel: [13816.639046] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.18227214.B64.2106252220 06/25/2021
Jan 25 23:41:26 debian kernel: [13816.639050] Workqueue: adf_vf_resp_wq_ adf_response_handler_wq [intel_qat]
Jan 25 23:41:26 debian kernel: [13816.639162] RIP: 0010:native_queued_spin_lock_slowpath+0x2b/0x260
Jan 25 23:41:26 debian kernel: [13816.639173] Code: 55 41 54 55 53 48 89 fb 66 90 ba 01 00 00 00 8b 03 85 c0 75 13 f0 0f b1 13 85 c0 75 f2 5b 5d 41 5c 41 5d c3 cc cc cc cc f3 90 <eb> e3 81 fe 00 01 00 00 74 4a 81 fe ff 00 00 00 77 77 f0 0f ba 2b
Jan 25 23:41:26 debian kernel: [13816.639177] RSP: 0018:ffffb44d80003ae8 EFLAGS: 00000202
Jan 25 23:41:26 debian kernel: [13816.639182] RAX: 0000000000000001 RBX: ffff8ce4b44af34c RCX: ffff8ce4b44af348
Jan 25 23:41:26 debian kernel: [13816.639186] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8ce4b44af34c
Jan 25 23:41:26 debian kernel: [13816.639189] RBP: ffffb44d80003b78 R08: 0000000069d34ec2 R09: 000000000000000a
Jan 25 23:41:26 debian kernel: [13816.639193] R10: 0000000000000004 R11: ffff8ce4848f33c8 R12: 0000000000000002
Jan 25 23:41:26 debian kernel: [13816.639197] R13: 000000000000000a R14: ffff8ce4b44af34c R15: ffff8ce4b44af300
Jan 25 23:41:26 debian kernel: [13816.639201] FS:  0000000000000000(0000) GS:ffff8ce4bbc00000(0000) knlGS:0000000000000000
Jan 25 23:41:26 debian kernel: [13816.639205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 25 23:41:26 debian kernel: [13816.639209] CR2: 0000559076aec3c0 CR3: 0000000105c50003 CR4: 00000000003706f0
Jan 25 23:41:26 debian kernel: [13816.639218] Call Trace:
Jan 25 23:41:26 debian kernel: [13816.639221]  <NMI>
Jan 25 23:41:26 debian kernel: [13816.639238]  ? nmi_cpu_backtrace+0x95/0x110
Jan 25 23:41:26 debian kernel: [13816.639245]  ? nmi_cpu_backtrace_handler+0x8/0x10
Jan 25 23:41:26 debian kernel: [13816.639253]  ? nmi_handle+0x4e/0x120
Jan 25 23:41:26 debian kernel: [13816.639261]  ? default_do_nmi+0x44/0x250
Jan 25 23:41:26 debian kernel: [13816.639267]  ? exc_nmi+0xfe/0x130
Jan 25 23:41:26 debian kernel: [13816.639273]  ? end_repeat_nmi+0x16/0x67
Jan 25 23:41:26 debian kernel: [13816.639286]  ? native_queued_spin_lock_slowpath+0x2b/0x260
Jan 25 23:41:26 debian kernel: [13816.639294]  ? native_queued_spin_lock_slowpath+0x2b/0x260
Jan 25 23:41:26 debian kernel: [13816.639302]  ? native_queued_spin_lock_slowpath+0x2b/0x260
Jan 25 23:41:26 debian kernel: [13816.639310]  </NMI>
Jan 25 23:41:26 debian kernel: [13816.639311]  <IRQ>
Jan 25 23:41:26 debian kernel: [13816.639314]  _raw_spin_lock+0x19/0x20
Jan 25 23:41:26 debian kernel: [13816.639320]  xfrm_input+0x1dc/0x11f0
Jan 25 23:41:26 debian kernel: [13816.639333]  xfrm6_rcv_encap+0xec/0x1e0
Jan 25 23:41:26 debian kernel: [13816.639344]  ? __pfx_xfrm6_udp_encap_rcv+0x10/0x10
Jan 25 23:41:26 debian kernel: [13816.639352]  udpv6_queue_rcv_one_skb+0x259/0x520
Jan 25 23:41:26 debian kernel: [13816.639361]  udp6_unicast_rcv_skb+0x40/0xa0
Jan 25 23:41:26 debian kernel: [13816.639369]  ip6_protocol_deliver_rcu+0x181/0x480
Jan 25 23:41:26 debian kernel: [13816.639377]  ip6_input_finish+0x35/0x60
Jan 25 23:41:26 debian kernel: [13816.639384]  ip6_sublist_rcv_finish+0x54/0x90
Jan 25 23:41:26 debian kernel: [13816.639392]  ip6_sublist_rcv+0x236/0x2d0
Jan 25 23:41:26 debian kernel: [13816.639399]  ? __pfx_ip6_rcv_finish+0x10/0x10
Jan 25 23:41:26 debian kernel: [13816.639407]  ipv6_list_rcv+0x136/0x160
Jan 25 23:41:26 debian kernel: [13816.639416]  __netif_receive_skb_list_core+0x1f1/0x2c0
Jan 25 23:41:26 debian kernel: [13816.639429]  netif_receive_skb_list_internal+0x1a7/0x2d0
Jan 25 23:41:26 debian kernel: [13816.639437]  napi_complete_done+0x69/0x1a0
Jan 25 23:41:26 debian kernel: [13816.639443]  vmxnet3_poll_rx_only+0x7b/0xa0 [vmxnet3]
Jan 25 23:41:26 debian kernel: [13816.639480]  __napi_poll+0x23/0x1a0
Jan 25 23:41:26 debian kernel: [13816.639485]  net_rx_action+0x141/0x2c0
Jan 25 23:41:26 debian kernel: [13816.639490]  ? __napi_schedule+0xa7/0xb0
Jan 25 23:41:26 debian kernel: [13816.639498]  handle_softirqs+0xd2/0x280
Jan 25 23:41:26 debian kernel: [13816.639505]  __irq_exit_rcu+0x68/0x90
Jan 25 23:41:26 debian kernel: [13816.639509]  common_interrupt+0x7a/0xa0
Jan 25 23:41:26 debian kernel: [13816.639515]  </IRQ>
Jan 25 23:41:26 debian kernel: [13816.639516]  <TASK>
Jan 25 23:41:26 debian kernel: [13816.639518]  asm_common_interrupt+0x22/0x40
Jan 25 23:41:26 debian kernel: [13816.639526] RIP: 0010:netlink_has_listeners+0x2e/0x60
Jan 25 23:41:26 debian kernel: [13816.639535] Code: 03 00 00 a8 01 74 44 0f b7 87 04 02 00 00 48 8d 14 40 48 8d 04 90 31 d2 48 c1 e0 04 48 03 05 39 d8 d3 00 48 8b 88 90 00 00 00 <48> 85 c9 74 15 83 ee 01 3b b0 9c 00 00 00 73 0a 31 d2 48 0f a3 71
Jan 25 23:41:26 debian kernel: [13816.639540] RSP: 0018:ffffb44d8069fd48 EFLAGS: 00000282
Jan 25 23:41:26 debian kernel: [13816.639543] RAX: ffff8ce4803ca4e0 RBX: ffff8ce4b44af300 RCX: ffff8ce4b442b540
Jan 25 23:41:26 debian kernel: [13816.639547] RDX: 0000000000000000 RSI: 0000000000000005 RDI: ffff8ce4b234a000
Jan 25 23:41:26 debian kernel: [13816.639551] RBP: ffffb44d8069fdf0 R08: 0000000000000000 R09: 0000000000004bb7
Jan 25 23:41:26 debian kernel: [13816.639554] R10: 0000000000000010 R11: ffffffff86427f80 R12: 00000000b74b0000
Jan 25 23:41:26 debian kernel: [13816.639558] R13: 000000000000000a R14: ffff8ce4b44af34c R15: ffff8ce4b44af300
Jan 25 23:41:26 debian kernel: [13816.639565]  ? skb_copy_bits+0x1da/0x210
Jan 25 23:41:26 debian kernel: [13816.639571]  xfrm_replay_advance+0xf8/0x360
Jan 25 23:41:26 debian kernel: [13816.639582]  xfrm_input+0x4ce/0x11f0
Jan 25 23:41:26 debian kernel: [13816.639592]  qat_alg_callback+0x15/0x30 [intel_qat]
Jan 25 23:41:26 debian kernel: [13816.639711]  adf_handle_response+0x3d/0xc0 [intel_qat]
Jan 25 23:41:26 debian kernel: [13816.639815]  adf_response_handler_wq+0x6c/0xc0 [intel_qat]
Jan 25 23:41:26 debian kernel: [13816.639928]  process_one_work+0x175/0x310
Jan 25 23:41:26 debian kernel: [13816.639936]  worker_thread+0x279/0x3a0
Jan 25 23:41:26 debian kernel: [13816.639944]  ? __pfx_worker_thread+0x10/0x10
Jan 25 23:41:26 debian kernel: [13816.639949]  kthread+0xc4/0xf0
Jan 25 23:41:26 debian kernel: [13816.639959]  ? __pfx_kthread+0x10/0x10
Jan 25 23:41:26 debian kernel: [13816.639968]  ret_from_fork+0x28/0x40
Jan 25 23:41:26 debian kernel: [13816.639974]  ? __pfx_kthread+0x10/0x10
Jan 25 23:41:26 debian kernel: [13816.639983]  ret_from_fork_asm+0x1b/0x30
Jan 25 23:41:26 debian kernel: [13816.639995]  </TASK>

 

 

 

Some details regarding the setup in place:

 

OS: Debian 12

 

Driver Version: 4.27.0-00006

 

Configure:

 

 

 

./configure --enable-kapi --enable-qat-lkcf --enable-icp-sriov=guest
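(For completeness, the configure line above is one step of the usual out-of-tree driver build; a rough sketch of the full flow, assuming the standard steps from the package README, is:)

./configure --enable-kapi --enable-qat-lkcf --enable-icp-sriov=guest
make -j"$(nproc)"    # build the kernel modules and user-space components
make install         # install the modules and supporting services (see the package README for details)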

 

 

 

 

lsmod | grep qat

 

 

 

root@debian:~$ lsmod | grep qat
qat_c62xvf             32768  2
intel_qat             401408  7 qat_c62xvf
uio                    28672  1 intel_qat

 

 

 

 

service qat_service status

 

 

 

root@debian:~$ service qat_service status
○ qat_service.service - LSB: modprobe the QAT modules, which loads dependant modules, before calling the user space utility to pass configuration parameters
     Loaded: loaded (/etc/init.d/qat_service; generated)
     Active: inactive (dead)
       Docs: man:systemd-sysv-generator(8)

 

 

 

 cat /proc/crypto | grep qat

 

 

 

root@debian:~$ cat /proc/crypto | grep qat
driver       : echainiv(qat_aes_cbc_hmac_sha256)
driver       : rfc3686(qat_aes_ctr)
driver       : pkcs1pad(qat-rsa,sha512)
driver       : qat-rsa
module       : intel_qat
driver       : qat_aes_gcm
module       : intel_qat
driver       : qat_aes_cbc_hmac_sha512
module       : intel_qat
driver       : qat_aes_cbc_hmac_sha256
module       : intel_qat
driver       : qat_aes_xts
module       : intel_qat
driver       : qat_aes_ctr
module       : intel_qat
driver       : qat_aes_cbc
module       : intel_qat
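For what it is worth, whether these QAT implementations are the ones the kernel actually selects for the tunnel depends on their registered priority relative to the software implementations; the priority field can be checked with, for example:

# Print the /proc/crypto entries (including priority) for the QAT AES-CBC + HMAC-SHA256 driver
grep -A 6 'driver.*qat_aes_cbc_hmac_sha256' /proc/crypto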

 

 

 

/etc/c6xxvf_dev0.conf

 

 

 

#  version: QAT.L.4.27.0-00006
################################################################
[GENERAL]
ServicesEnabled = cy;dc

ConfigVersion = 2

#Default values for number of concurrent requests*/
CyNumConcurrentSymRequests = 512
CyNumConcurrentAsymRequests = 64

#Statistics, valid values: 1,0
statsGeneral = 1
statsDh = 1
statsDrbg = 1
statsDsa = 1
statsEcc = 1
statsKeyGen = 1
statsDc = 1
statsLn = 1
statsPrime = 1
statsRsa = 1
statsSym = 1

##############################################
# Kernel Instances Section
##############################################
[KERNEL]
NumberCyInstances = 1
NumberDcInstances = 1

# Crypto - Kernel instance #0
Cy0Name = "IPSec0"
Cy0IsPolled = 0
Cy0CoreAffinity = 0

# Data Compression - Kernel instance #0
Dc0Name = "Dc1"
Dc0IsPolled = 0
# List of core affinities
Dc0CoreAffinity = 0

##############################################
# User Process Instance Section
##############################################
[SSL]
NumberCyInstances = 0
NumberDcInstances = 0
NumProcesses = 1
LimitDevAccess = 0

 

 

 

I look forward to any suggestions from the community on how to troubleshoot or further pinpoint the root cause of the crashes.

Ronny_G_Intel
Moderator

Hi SmithConnect,


The log entries you provided indicate a series of issues related to Intel QuickAssist Technology (QAT) and the network driver (vmxnet3).

The key messages include a network transmit queue timeout, a CPU stall detected by RCU (Read-Copy-Update), and a backtrace involving the QAT driver. 


1. Network Driver (vmxnet3) Issues

Transmit Queue Timeout: The message NETDEV WATCHDOG: CPU: 1: transmit queue 2 timed out indicates that the network driver (vmxnet3) is experiencing a transmit queue timeout. This can be caused by high network load, driver issues, or hardware problems.


2. RCU (Read-Copy-Update) Stalls

RCU Stalls: The message rcu_sched detected expedited stalls on CPUs/tasks indicates that the RCU subsystem detected a stall, which can be caused by long-running tasks or high CPU load.


3. Intel QAT Driver Issues

QAT Driver Backtrace: The backtrace involving the QAT driver (adf_response_handler_wq) suggests that there may be an issue with the QAT driver or its interaction with the system.

Issues 1 and 2 can both affect the functionality of QAT:


If the network driver is experiencing timeouts or other issues, it can lead to delays or failures in data transmission. Since Intel QAT often handles cryptographic operations for network traffic, any disruption in the network driver can directly impact the performance and reliability of QAT operations.

RCU stalls indicate that the system is experiencing high CPU load or long-running tasks, which can affect the overall system performance. Intel QAT relies on the CPU for processing cryptographic operations, and any CPU-related issues can degrade the performance of QAT. High CPU load can also lead to increased latency and reduced throughput for QAT operations.


I would recommend addressing the first two issues and then checking whether QAT still behaves the same.

If the issue persists after trying the recommendations above please share the following data:


1. icp_dump log files. To generate these, run the script located at $ICP_ROOT/quickassist/utilities/debug_tool/icp_dump.sh. This will create a tar file containing your full system setup, including configuration files (see the sketch after this list).


2. config.log file. It should be located in the $ICP_ROOT/ directory.


3. dmesg log file, same log that you just provided.
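For item 1, a rough sketch of generating the archive (assuming ICP_ROOT points at the driver source tree; check the script's own usage output for any required arguments):

cd "$ICP_ROOT/quickassist/utilities/debug_tool"
./icp_dump.sh    # creates a tar file containing the full system setup and configuration files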


Thanks,

Ronny G


SmithConnect
Beginner

Greetings Ronny,

 

While I appreciate it, the response provided does not give me any truly actionable steps.

 

"If the network driver is experiencing timeouts or other issues, it can lead to delays or failures in data transmission. Since Intel QAT often handles cryptographic operations for network traffic, any disruption in the network driver can directly impact the performance and reliability of QAT operations."

I can run the same amount of traffic through a tunnel with the QAT module uninstalled and I do not experience any CPU stalls or crashes. If the network driver were at fault, the problem should still persist in this alternate configuration.

 

"RCU stalls indicate that the system is experiencing high CPU load or long-running tasks, which can affect the overall system performance. Intel QAT relies on the CPU for processing cryptographic operations, and any CPU-related issues can degrade the performance of QAT. High CPU load can also lead to increased latency and reduced throughput for QAT operations"

CPU usage does increase, as would be expected while the CPU handles the interrupts required for QAT operations. However, the issue at hand is not increased latency or reduced throughput, but rather that the machine stops processing anything at all.

(attached screenshot: SmithConnect_1-1738204138282.png)

 

 

 

 

Ronny_G_Intel
Moderator

Hi SmithConnect,


Thank you for providing the icp_dump and config.log files. I will review them and get back to you as soon as possible.

I have one more question: Based on your previous response, can I assume that if the QAT module is uninstalled, the vmxnet3 timeout issues and RCU stalls do not occur?


Regards,

Ronny G


SmithConnect
Beginner

Greetings Ronny,

 

That assumption is correct. When the QAT module is not installed, there are no stability issues.

 

Regards,

SmithConnect
Beginner

Greetings Ronny,

 

Is there any additional information I can provide to continue aiding in your review of this issue?

 

Regards,

Ronny_G_Intel
Moderator

Hi SmithConnect,


I checked your configuration files and logs and couldn't really detect any particular issue.

Please confirm that you are mainly using VFs (virtual functions); see below:


There is 4 QAT acceleration device(s) in the system:

qat_dev0 - type: c6xxvf, inst_id: 0, node_id: 0, bsf: 0000:05:00.0, #accel: 1 #engines: 1 state: up

qat_dev1 - type: c6xxvf, inst_id: 1, node_id: 0, bsf: 0000:0c:00.0, #accel: 1 #engines: 1 state: up

qat_dev2 - type: c6xxvf, inst_id: 2, node_id: 0, bsf: 0000:14:00.0, #accel: 1 #engines: 1 state: up

qat_dev3 - type: c6xxvf, inst_id: 3, node_id: 0, bsf: 0000:1c:00.0, #accel: 1 #engines: 1 state: up
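(For reference, a device listing in this format is typically produced by the adf_ctl utility shipped with the driver package; that it was captured this way is an assumption on my part:)

adf_ctl status    # list all QAT devices with their type, BDF and state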


And confirm that intel_iommu is not enabled; your kernel command line is: BOOT_IMAGE=/boot/1.5-rolling-202501151018/vmlinuz boot=live rootdelay=5 noautologin net.ifnames=0 biosdevname=0 vyos-union=/boot/1.5-rolling-202501151018 console=tty0


I am running this issue by the QAT team and will get back to you as soon as possible.


Regards,

Ronny G



Ronny_G_Intel
Moderator

Hi SmithConnect,


When running Intel QAT in a virtual machine, you must have IOMMU (Input/Output Memory Management Unit) enabled by setting "iommu=on" in your system configuration, as IOMMU is crucial for properly managing memory access for the QAT device within a virtualized environment. 

Based on the configuration, it seems that you are operating within a VM. Intel IOMMU needs to be enabled on the host side.
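As a hypothetical example of what that looks like on a Linux/KVM host using GRUB (the parameters below are the usual ones, not taken from your setup; on an ESXi host the equivalent is enabling VT-d in the host BIOS/platform settings rather than a kernel parameter):

# /etc/default/grub on the host (hypothetical example; keep your existing options)
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> intel_iommu=on iommu=pt"

update-grub && reboot    # regenerate the GRUB configuration and reboot the host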


On the other hand, is this a new issue? Was this setup functioning properly before and then stopped, or is this a new setup where the issue has just been identified?

I reviewed the provided dmesg file, and it contains only six QAT references. I didn't notice any errors in this log. Do you have any logs that include errors?


Thanks,

Ronny G

