RHEL 8.8 KVM guest hangs when running QAT qzip

Karl3

Hello Intel Community, 

 

We have an HPE DL 380 Gen11 server with two Intel Xeon Platinum 8462y processors.

https://www.intel.com/content/www/us/en/products/sku/232383/intel-xeon-platinum-8462y-processor-60m-cache-2-80-ghz/specifications.html

 

We have been testing the Intel QAT feature using the qzip command on RHEL 8.8.

 

The qzip command is working well on the KVM host (hypervisor). We are able to compress/decompress a 3 TB database backup (8 backup files) in 10 mins with 8 parallel qzip commands. If we Ctrl-C or kill a qzip command it terminates immediately.

 

We are having issues on the KVM guest (virtual machine) on the same host. The qzip commands run successfully and give the same throughput as on the host; however, at the start and end of each compress/decompress the guest freezes for several seconds, multiple times. In addition, if we Ctrl-C or kill a qzip command, the guest freezes for 30 seconds or more, the qzip commands take some time to terminate, and we get "BUG: soft lockup" messages on the console.

 

Message from syslogd@machinev1 at Oct 15 15:20:22 ...
 kernel:watchdog: BUG: soft lockup - CPU#14 stuck for 24s! [qzip:71716]

Message from syslogd@machinev1 at Oct 15 15:20:22 ...
 kernel:watchdog: BUG: soft lockup - CPU#11 stuck for 32s! [qzip:71676]

Message from syslogd@machinev1 at Oct 15 15:20:29 ...
 kernel:watchdog: BUG: soft lockup - CPU#8 stuck for 32s! [qzip:74817]
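
For anyone triaging similar console messages, here is a minimal sketch that tallies soft-lockup reports per process name from kernel log text. The function name `count_soft_lockups` is hypothetical, and the pattern assumes the log lines have the same format as the messages above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print
# "<process>=<count>" for every process named in a soft-lockup report.
count_soft_lockups() {
    grep -o 'soft lockup - CPU#[0-9]* stuck for [0-9]*s! \[[^]:]*' \
        | sed 's/.*\[//' \
        | sort | uniq -c \
        | awk '{print $2 "=" $1}'
}

# Example usage inside the guest (not executed here):
#   journalctl -k --no-pager | count_soft_lockups
#   dmesg | count_soft_lockups
```

Run against the three messages above, this would attribute every lockup to qzip, which matches what the console shows.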

 

The host has 1 TB of memory. 

The guest has 440 GB of memory.

The guest has 16 CPUs.

The guest has 8 QAT virtual functions defined.

 

Has anyone had any experience with a similar configuration? Are there any known issues with this configuration? Any suggestions?

 

Any advice would be greatly appreciated.

 

Regards, Karl

 

 

10 Replies
IntelSupport

Hello Karl,

 

Greetings for the day!

 

This is regarding the case you have with us with the following details.

 

Case No.: 06019423

Product: Intel® QuickAssist Technology (Intel® QAT)

Issue description: QAT qzip execution causes server stability issues on a RHEL 8.8 KVM guest

 

We appreciate your contact with Intel Server Support. We kindly ask for some additional information to better assist you with your inquiry, and we will respond to you as soon as possible.

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.


IntelSupport

Hello Karl,

 

Greetings for the day!

 

This is regarding the case you have logged with us with the following details.

 

Thank you for contacting the Intel Server Support Team. We'd like to inquire about the version of the QAT driver you are currently using. Please confirm this information so that we can provide further assistance.

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.

 


Karl3

Hi Intel Support,

 

Thanks for your update. 

 

Is this the correct place to get the driver version?

 

[root@machina ~]# cat /sys/module/qat_4xxx/version
0.6.0
[root@machina ~]# cat /sys/module/intel_qat/version
0.6.0
[root@machina ~]#

 

If not, can you advise where I can check this?

 

Many Thanks, 

 

Karl

Karl3

Hi Intel Support,

 

I have been looking at this issue with Red Hat. They have said that there is a contention issue at the KVM host level caused by multiple VFs, and that the QAT device requires vIOMMU. In effect the device is passed through twice: once from host to guest, and again from guest to QAT (the user-space driver using vIOMMU). They advised that improving this substantially would require upstream OS work on nested paging.

 

They suggested enabling IOMMU pass-through (iommu=pt) on the guest and the host. While this has not eliminated the issue completely, it has improved it noticeably.
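
For readers trying the same change, a minimal sketch for checking whether iommu=pt is active on the running kernel. The function name `has_iommu_pt` is hypothetical, and the grubby command in the comment is the usual RHEL 8 way to add a kernel argument, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the kernel command line passed as $1
# contains the standalone argument iommu=pt.
has_iommu_pt() {
    case " $1 " in
        *" iommu=pt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel; run this on both the host and the guest.
if [ -r /proc/cmdline ]; then
    if has_iommu_pt "$(cat /proc/cmdline)"; then
        echo "iommu=pt is active"
    else
        echo "iommu=pt is not set"
        # Assumed way to add it persistently on RHEL 8 (as root, then reboot):
        #   grubby --update-kernel=ALL --args="iommu=pt"
    fi
fi
```

The change only takes effect after a reboot, so the check is worth repeating on each machine afterwards.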

 

They have also suggested using huge pages inside the guest for the QAT processes. Is this possible? I can see some instructions at the link below, but we don't have the usdm_drv kernel module on our host. I am not sure how to incorporate this into our installation with the software provided through the standard RHEL repositories.

https://github.com/intel/QATzip/blob/master/README.md#install-qatzip-as-root-user

 

Are you able to provide any assistance with configuring huge pages inside the guest for QAT processes?
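
As a starting point, here is a minimal sketch for checking whether any huge pages are reserved inside the guest at all. It only reads the generic kernel counters; whether the QAT stack from the RHEL repositories actually uses them is not established here. The function name `hugepages_total` and the page count in the comment are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the HugePages_Total value from
# /proc/meminfo-style text supplied on stdin.
hugepages_total() {
    awk '$1 == "HugePages_Total:" {print $2}'
}

# Report the current reservation inside the guest.
if [ -r /proc/meminfo ]; then
    echo "HugePages_Total: $(hugepages_total < /proc/meminfo)"
    grep -E '^(HugePages_Free|Hugepagesize):' /proc/meminfo || true
fi

# To reserve pages at runtime (as root; 1024 is an arbitrary example count):
#   sysctl -w vm.nr_hugepages=1024
```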

 

Regards, Karl

 

 

IntelSupport

Hello Karl,

 

Greetings for the day!

 

This is regarding the case you have with us with the following details.

 

The usdm_drv kernel module is part of the QAT driver installation process and is needed for QAT to work properly. To know the QAT driver version, we need to know first if the driver being used is the out-of-tree driver or the in-tree driver. What QAT driver is being used?

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.


Karl3

Hi Intel Support,

 

Thanks for responding to my last post. 

 

I am using the latest available driver which comes with the Red Hat repositories.

 

# rpm -qa | grep linux-firmware
linux-firmware-20230404-117.git2e92a49f.el8_8.noarch
#

 

How do I know whether I am using in-tree or out-of-tree drivers? 

Also, how do I check the driver version?

I have looked for the usdm_drv kernel module, but I see no sign of it on either the host or the VM.

 

# find / -name "*usdm_drv*"
# lsmod | grep usdm_drv
#

 

I followed this document regarding the installation of the drivers.

Ensuring that Intel® QuickAssist Technology stack is working correctly on RHEL

https://access.redhat.com/articles/6376901

 

Regards, Karl

 

 

 

 

IntelSupport

Hello Karl,

 

Greetings for the day!

 

We appreciate your patience throughout the case, and we kindly request you to review the details shared below and respond if you require further assistance.

 

Based on the description, the in-tree driver is being used. This driver doesn't use the usdm_drv module. There are two options: (1) try to increase the amount of memory used by the in-tree driver, or (2) switch to the out-of-tree driver and use huge pages.
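
To confirm which driver is loaded, a minimal sketch using modinfo can help. It relies on the `intree` field that modinfo reports as `Y` for modules built as part of the kernel; the function name `classify_module` is hypothetical, and the module names are the ones shown earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a module from the value of modinfo's
# "intree" field ("Y" for modules shipped in the kernel tree).
classify_module() {
    if [ "$1" = "Y" ]; then
        echo "in-tree"
    else
        echo "out-of-tree or unknown"
    fi
}

# Report origin and version for the QAT modules named in this thread.
for mod in intel_qat qat_4xxx; do
    if command -v modinfo >/dev/null 2>&1 && modinfo "$mod" >/dev/null 2>&1; then
        printf '%s: %s (version %s)\n' "$mod" \
            "$(classify_module "$(modinfo -F intree "$mod")")" \
            "$(modinfo -F version "$mod")"
    else
        echo "$mod: module not found"
    fi
done
```

The path printed by `modinfo -F filename` is a useful cross-check, since modules shipped with the kernel package live under the `kernel/` subdirectory of `/lib/modules/$(uname -r)`.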

 

For (1), the user guide for the in-tree driver can be consulted (refer to the Installation section): https://intel.github.io/quickassist/qatlib/index.html.

 

For (2), make sure the in-tree driver is uninstalled first, and then install the out-of-tree driver, which is available here: https://www.intel.com/content/www/us/en/download/765501/intel-quickassist-technology-driver-for-linux-hw-version-2-0.html

 

If the preference is to continue working with the in-tree driver (option 1), the same user guide referenced above has instructions for setting up the driver to work with QATzip. These instructions differ from those on the QATzip GitHub page because the GitHub instructions are intended for the out-of-tree QAT driver.

 

If, on the other hand, the preference is to start working with the out-of-tree driver (option 2), the Getting Started Guide and then the QATzip GitHub instructions can be followed to set up QAT. Just make sure the in-tree driver is uninstalled first.

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.


IntelSupport

Hello Karl,

 

Greetings for the day!

 

We are currently awaiting your response to assist with the case. Please respond to the community post if you have any questions or need assistance. Your response is valuable to us.

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.


IntelSupport

Hello Karl,


I hope you're having a great day!


We are eagerly anticipating your response to provide assistance with the case. If you have any questions or require support, please reply to the community post. Your input is highly appreciated.


Feel free to reach out to us for any additional help.

Thank you for choosing Intel products and services.


IntelSupport

Hello Karl,

 

Greetings for the day!

 

We regret to inform you that due to the lack of response from your end, we will be closing your case. If you still require support, we kindly request you to provide a response for the web post. By doing so, we will be able to either reopen your existing case or create a new one, ensuring that we can continue to provide you with the assistance you need.

 

Please don’t hesitate to contact us for any further assistance.

Thank you for using Intel products and services.

