Intel® QuickAssist Technology (Intel® QAT)

Using QAT 4xxx: my demo runs, but Ceph fails

yulongshi0816
Beginner

I ported the QAT compression code from Ceph version 17 into a demo program. When running it, I encountered the error: Error userStarMultiProcess(-1), switch to SW if permitted g_process.qz_init_status = QZ_NO_HW.

I resolved this issue in the demo by either:

  1. Setting export QAT_SECTION_NAME="SSL", which allowed the demo to run normally, or

  2. Changing the section name from SSL to SHIM in my 4xxx_dev0.conf file, which also made it work.
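Both workarounds come down to the same condition: the section name that QATzip asks for must exist in the device configuration. Here is a minimal Python sketch of that check. It assumes QATzip falls back to a section named SHIM when QAT_SECTION_NAME is unset, which is consistent with the two workarounds above but is not confirmed in this thread. The sample configuration is hypothetical and heavily trimmed, and the helper names are my own.

```python
import os
import re


def section_names(conf_text):
    """Return the [SECTION] names found in a QAT device config, in order."""
    return re.findall(r"^\s*\[([^\]]+)\]\s*$", conf_text, flags=re.MULTILINE)


def check_section(conf_text, env=None, default="SHIM"):
    """Return (wanted_section, is_present) for the given config text.

    The "SHIM" default is an assumption inferred from this thread.
    """
    env = os.environ if env is None else env
    wanted = env.get("QAT_SECTION_NAME", default)
    return wanted, wanted in section_names(conf_text)


# Hypothetical, trimmed config that only defines an [SSL] user section.
SAMPLE_CONF = """
[GENERAL]
ServicesEnabled = dc

[KERNEL]
NumberCyInstances = 0
NumberDcInstances = 0

[SSL]
NumberDcInstances = 1
NumProcesses = 1
"""

if __name__ == "__main__":
    # Variable unset: the assumed default section is looked up and is missing.
    print(check_section(SAMPLE_CONF, env={}))
    # Workaround 1: point the variable at the section that does exist.
    print(check_section(SAMPLE_CONF, env={"QAT_SECTION_NAME": "SSL"}))
```

To check a real device, pass the contents of the actual 4xxx_dev0.conf instead of SAMPLE_CONF.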

However, when I enable QAT compression in Ceph, it fails with the following errors:

ADF_UIO_PROXY err: get_bundle_from_dev_cached: failed to open uio dev /dev/uio4
[error] SalCtrl_ServiceInit() - : Failed to initialise all service instances
ADF_UIO_PROXY err: adf_user_subsystemInit: Failed to initialise Subservice SAL
[error] SalCtrl_ServiceEventStart() - : Private data is NULL
ADF_UIO_PROXY err: adf_user_subsystemStart: Failed to start Subservice SAL
ADF_UIO_PROXY err: get_bundle_from_dev_cached: failed to open uio dev /dev/uio12
[error] SalCtrl_ServiceInit() - : Failed to initialise all service instances
ADF_UIO_PROXY err: adf_user_subsystemInit: Failed to initialise Subservice SAL
[error] SalCtrl_ServiceEventStart() - : Private data is NULL
ADF_UIO_PROXY err: adf_user_subsystemStart: Failed to start Subservice SAL
[error] SalCtrl_AdfServicesStartedCheck() - : Sal Ctrl failed to start in given time

[error] do_userStart() - : Failed to start services

ADF_UIO_PROXY err: icp_adf_subsystemUnregister: Failed to shutdown subservice SAL.
ADF_UIO_PROXY err: icp_adf_subsystemUnregister: Failed to shutdown subservice SAL.
Error userStarMultiProcess(-1), switch to SW if permitted

 

5 Replies
DiegoV_Intel
Moderator

Hi,


Are you following any specific guide for this? If so, can you please share the link here?


I can take a look and see if I can provide any suggestions for you.


Regards,

Diego V.


yulongshi0816
Beginner

Thank you for your reply.

I followed the Getting Started Guide: https://www.intel.com/content/www/us/en/content-details/632506/intel-quickassist-technology-intel-qat-software-for-linux-getting-started-guide-hardware-version-2-0.html
and the README from QATzip 1.1.2:
https://github.com/intel/QATzip/blob/v1.1.2/README.md#build-intel-quickassist-technology-driver

I have now copied the 4xxx_dev0.conf from QATzip 1.1.2 into the /etc directory, and my program runs normally. With compression enabled in Ceph, uploading a single object works without problems. But when I use FIO to upload objects, I get the following slow-operation (timeout) messages:
-66> 2025-11-27T09:06:01.058+0800 7fa2b2bfb640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407702446s
-65> 2025-11-27T09:06:01.058+0800 7fa2b33fc640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407705784s
-64> 2025-11-27T09:06:01.058+0800 7fa2b77dc640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407700539s
-63> 2025-11-27T09:06:01.058+0800 7fa2b8fdf640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407549381s
-62> 2025-11-27T09:06:01.058+0800 7fa2b4bff640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407548904s
-61> 2025-11-27T09:06:01.059+0800 7fa2b43fe640 0 bluestore(/var/lib/ceph/osd/ceph-1) log_latency slow operation observed for compress@_do_alloc_write, latency = 6.407576561s
Then a segmentation fault occurs:

1: /usr/lib64/libc.so.6(+0x40ef0) [0x7fa2edbe2ef0]
 2: /usr/lib64/libc.so.6(+0x88280) [0x7fa2edc2a280]
 3: pthread_mutex_lock()
 4: qzInit()
 5: qzCompressCrcExt()
 6: qzCompressExt()
 7: qzCompress()
 8: (QatAccel::compress(ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list&, boost::optional<int>&)+0xce) [0x7fa2e6cb6c6e]
 9: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>&, BlueStore::WriteContext*)+0x4f8) [0x557ea26ee6e8]
 10: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0x2c5) [0x557ea2742055]
 11: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0x8e) [0x557ea2742bee]
 12: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1ba0) [0x557ea2746230]

 

DiegoV_Intel
Moderator

Hi,

I found this Ceph Configuration Guide for Intel QAT in the Ceph documentation. Perhaps you already found it too.

I haven't followed this setup before, but from the documentation I see some key elements that you should pay attention to. Let me list them below:

  1. The first step is to confirm that the Intel QAT out-of-tree driver and the Intel QATzip library are working properly. From your description, it looks like this is already done, but I'm including the information here in case you missed something.
    1. Follow the Getting Started Guide to install and test the Intel QAT out-of-tree driver. This step is successful if you can run the cpa_sample_code described in the guide.
    2. Once the sample code is working, set up the Intel QATzip library. Note that the latest version is currently 1.3.1, but you are using 1.1.2. Install the library from the master branch to get the latest version. This step is successful if you can run the run_perf_test.sh script referenced in the installation instructions.
      • Important: To set up the Intel QATzip library properly, you have to update the QAT configuration file using one of the sample files shipped with the library, because the [SHIM] section has to be added. This configuration will be updated again later when setting up Ceph.
  2. The next step, once the Intel QAT out-of-tree driver and the Intel QATzip library are up and running, is to configure Intel QAT in Ceph.
    1. The QAT configuration file for all devices should be updated with a new user process instance for [CEPH]. The Ceph guide above has an example to use.
    2. To support Intel QAT compression, Ceph must be built with the following options: ./do_cmake.sh -DWITH_QATDRV=ON -DWITH_QATZIP=ON -DWITH_SYSTEM_QATZIP=ON -DWITH_QATLIB=OFF
    3. The [CEPH] user process instance in the QAT configuration files must be specified through an environment variable: export QAT_SECTION_NAME=CEPH
    4. Update the ceph.conf file to enable Intel QAT compression. Refer to the Ceph guide above for details.

According to the Ceph documentation, this is the configuration required for Intel QAT compression with QATzip to work in Ceph. Can you please confirm that your setup follows these steps?

I can help review your QAT configuration files if you can provide them, as well as the ceph.conf file.

Please note my expertise and support coverage is specifically for Intel QAT components - namely the QAT driver and QATzip library. While I can help ensure these components are properly configured and functioning, any issues related to Ceph's integration, configuration, or behavior fall outside my support scope and would need to be addressed through Ceph community channels. That said, I'm happy to help verify that the QAT side of the integration is set up correctly, as this often resolves integration issues.

Regards,
Diego V.

yulongshi0816
Beginner

Thank you very much for your reply again.

Currently, I am using Ceph version 16. The guide for this version does not mention settings such as the out-of-tree driver; the build only requires adding -DWITH_QAT. In my tests, writing 4 MB objects with rados bench using a single thread causes no problems. However, as soon as I increase the number of threads, the core dump shown earlier occurs.

I have reviewed the Ceph code and found that during compression Ceph creates new temporary objects, which means it also creates new QAT sessions. Could this issue be caused by having too many sessions? I am not entirely sure, but I noticed that later versions of Ceph added a session pool to manage sessions.
My QAT configuration is attached. Thank you!

DiegoV_Intel
Moderator

Hi,

Thanks for the extra details.

What you just described sounds exactly like the reason for the failure: too many sessions being created simultaneously.

One thing you can try is a configuration optimized for multiple threads rather than one optimized for multiple processes.

Your current QAT configuration is optimized for multiple processes: it allows the maximum possible number of processes, but each one gets only a single data compression instance. Based on your description, your workload looks more like multiple threads within one or a few processes.

A sample for multiple thread optimization is available here: https://github.com/intel/QATzip/tree/master/config_file/4xxx/multiple_thread_opt. Give it a try with this configuration for all QAT devices.
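To see which way a given configuration leans, you can compare NumProcesses and NumberDcInstances in the user section. Here is a rough Python sketch of that comparison. The two embedded configurations are illustrative values I made up, not copies of the files in the QATzip repository, and the dc_layout and describe helpers are hypothetical names.

```python
import configparser


def dc_layout(conf_text, section="SHIM"):
    """Return (NumProcesses, NumberDcInstances) for one user section."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    sec = parser[section]
    return sec.getint("NumProcesses"), sec.getint("NumberDcInstances")


def describe(processes, dc_instances):
    """Classify a layout by which of the two counts dominates."""
    if processes > 1 and dc_instances <= 1:
        return "multi-process oriented"
    if processes <= 1 and dc_instances > 1:
        return "multi-thread oriented"
    return "mixed"


# Illustrative values only (assumption): many processes, one instance each.
PROCESS_OPT = """
[SHIM]
NumberCyInstances = 0
NumberDcInstances = 1
NumProcesses = 32
LimitDevAccess = 1
"""

# Illustrative values only (assumption): one process, many instances.
THREAD_OPT = """
[SHIM]
NumberCyInstances = 0
NumberDcInstances = 32
NumProcesses = 1
LimitDevAccess = 0
"""

if __name__ == "__main__":
    for name, text in (("process sample", PROCESS_OPT), ("thread sample", THREAD_OPT)):
        processes, instances = dc_layout(text)
        print(name, processes, instances, describe(processes, instances))
```

For Ceph, pass section="CEPH" (or whatever QAT_SECTION_NAME is set to) and the contents of the real device file.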

If this configuration still doesn't resolve the issue, I would recommend considering an upgrade to a newer Ceph version that includes session pool management. The Ceph documentation referenced above mentions specific configuration settings to define the maximum number of sessions. The fact that these settings are documented suggests that your current issue was common enough to motivate the addition of session handling at the Ceph level for QAT compression with QATzip.
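To illustrate what session pool management means in general, here is a small Python sketch of a bounded pool that reuses sessions across threads instead of creating one per request. This is not Ceph's or QATzip's code. SessionPool, its parameters, and the use of plain objects as stand-in sessions are all hypothetical.

```python
import queue
import threading
from contextlib import contextmanager


class SessionPool:
    """Hand out at most max_sessions reusable sessions to worker threads."""

    def __init__(self, factory, max_sessions):
        self._factory = factory                      # creates one new session
        self._idle = queue.LifoQueue()               # sessions waiting for reuse
        self._slots = threading.Semaphore(max_sessions)
        self._lock = threading.Lock()
        self.created = 0                             # sessions ever created

    @contextmanager
    def session(self):
        self._slots.acquire()                        # wait while all slots are busy
        try:
            try:
                sess = self._idle.get_nowait()       # reuse an idle session
            except queue.Empty:
                with self._lock:
                    self.created += 1
                sess = self._factory()               # none idle: create one
            try:
                yield sess
            finally:
                self._idle.put(sess)                 # return it for the next caller
        finally:
            self._slots.release()


def run_demo(workers=8, max_sessions=2, jobs_per_worker=50):
    """Run many short jobs from several threads; return sessions created."""
    pool = SessionPool(factory=object, max_sessions=max_sessions)

    def worker():
        for _ in range(jobs_per_worker):
            with pool.session():
                pass  # a real caller would compress one buffer here

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.created


if __name__ == "__main__":
    print("sessions created:", run_demo())
```

However many threads submit work, the number of sessions created never exceeds max_sessions, which is the kind of cap a maximum-sessions setting is meant to enforce.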

Hope this helps.

Regards,
Diego V.
