Intel® QuickAssist Technology (Intel® QAT)

Quick Assist QAT qzip decompress fails for a qzip compressed file

Karl3
Beginner

Hi Intel Community,

 

We have an HPE DL 380 Gen11 server with two Intel Xeon Platinum 8462y processors.

https://www.intel.com/content/www/us/en/products/sku/232383/intel-xeon-platinum-8462y-processor-60m-cache-2-80-ghz/specifications.html

 

We have been using the Intel Quick Assist (QAT) qzip (QAT zip) command to compress large backup files. 

 

We have been compressing 12 × 275 GB database backup files in parallel. All 12 compressions usually complete in about 24 minutes.

 

Occasionally (approximately 1 time in 20), the compress command creates a compressed file that qzip (QAT zip) cannot decompress. The compressed file appears to be a valid archive, because gzip (GNU zip) can decompress it and the result is identical to the original file.

 

The qzip (QAT zip) compress command is as follows:

 

qzip -k database.bak.5 -O gzipext -A deflate -L 6 > database.bak.5.qzip.comp.log 2> database.bak.5.qzip.comp.err

 

This creates a database.bak.5.gz file.

 

The qzip (QAT zip) decompress command is as follows:

qzip -d -k database.bak.5.gz -O gzipext -A deflate > database.bak.5.qzip.decomp.log 2> database.bak.5.qzip.decomp.err

 

This is the error generated by this decompress command:

doProcessBuffer:Decompression failed with error: -2
Process file error: -2

 

This error happens consistently for this database.bak.5.gz file. The decompress results in a truncated version of the original database.bak.5 file.

 

Note: We have successfully compressed and decompressed the exact same database.bak.5 file 19 times.
Note: The database.bak.5.gz file can be decompressed by gzip (GNU zip), which results in the exact same file as the original.
Note: We also have a RHEL 8.8 server where we see the same issue.
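
The gzip cross-check described in these notes can be scripted. The following is a minimal sketch, assuming Python's standard gzip module can read the file the same way GNU gzip does; the function names are illustrative and not part of QATzip. It streams both files in chunks, so a 275 GB backup is never held in memory.

```python
import gzip
import hashlib

CHUNK = 1024 * 1024  # read 1 MiB at a time so very large files never sit in memory


def sha256_of_stream(stream):
    """Hash a binary stream chunk by chunk and return the hex digest."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(CHUNK), b""):
        digest.update(block)
    return digest.hexdigest()


def gz_matches_original(gz_path, original_path):
    """True if software-decompressing gz_path yields exactly original_path's bytes."""
    with gzip.open(gz_path, "rb") as restored, open(original_path, "rb") as original:
        return sha256_of_stream(restored) == sha256_of_stream(original)
```

For example, gz_matches_original("database.bak.5.gz", "database.bak.5") would be expected to return True for the file described above, since gzip reproduces the original.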

 

The processor firmware is Server Platform Services (SPS) Firmware 6.0.4.75.0.
The RHEL kernel is 4.18.0-513.18.1.el8_9.x86_64.

The Quick Assist packages are:

# rpm -qa | grep linux-firmware
linux-firmware-20230824-120.git0e048b06.el8_9.noarch

# rpm -qa | grep qat | sort
qatlib-23.02.0-2.el8_8.x86_64
qatlib-service-23.02.0-2.el8_8.x86_64
qatzip-1.1.2-1.el8_8.x86_64

 

What does this error mean? Is this a known issue? Do you have any advice on resolving it?

 

Any advice would be greatly appreciated.

 

Regards, Karl

Ronny_G_Intel
Moderator

Hi Karl,

 

I am currently looking into this issue but I will need some extra time to do some more research about this error when executing the decompress command:

 

doProcessBuffer:Decompression failed with error: -2

Process file error: -2

 

I checked our internal records and found no previous reports, and the fact that it happens approximately 1 time in 20 attempts makes it very difficult to replicate and troubleshoot.

I assume that you are using QATzip (https://github.com/intel/QATzip) and that you followed the installation instructions and any applicable recommendations. Please confirm, and provide any other pertinent details.

 

I will get back to you as soon as possible.

 

Regards,

Ronny G

Karl3
Beginner

Hi Ronny, 

 

Thanks for responding to my post.

 

I used the qzip executable as provided through the Red Hat repositories.

 

I just installed and executed the qzip command like any other executable provided by Red Hat.

 

# yum install linux-firmware

# yum install qatlib qatzip

 

We have some compressed files that consistently fail with the error - however, the smallest file is 143 GB. I tried reproducing with smaller files but I couldn't reproduce the issue.

 

Regards, Karl

 

Ronny_G_Intel
Moderator

Hi Karl3,


Can you please check which version of QATzip you are using?
I know it may be difficult to test, but can you test separately with the latest QATzip from GitHub? Here is the URL: https://github.com/intel/QATzip

On the other hand, would you be willing to provide us with the file that cannot be decompressed? With Compress and Verify (CnV) this should not be possible, because we decompress each block after compressing it to ensure the data can be decompressed. It would also be interesting to see whether any CnV errors are being logged. We can check this with a command similar to: cat /sys/kernel/debug/qat_4xxx_0000\:09\:00.0/cnv_errors (update the BDF to match the QAT endpoint on your system).
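
Because the BDF differs per system, the check can be run against every QAT endpoint at once. This is a minimal sketch, assuming the debugfs layout matches the path quoted above (qat_4xxx_<BDF>/cnv_errors under /sys/kernel/debug); the function name is illustrative, and reading debugfs normally requires root.

```python
import glob
import os


def read_cnv_errors(debugfs_root="/sys/kernel/debug"):
    """Return {path: contents} for every qat_4xxx_*/cnv_errors file under debugfs_root."""
    pattern = os.path.join(debugfs_root, "qat_4xxx_*", "cnv_errors")
    found = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, "r") as handle:
            found[path] = handle.read()
    return found
```

An empty result means no cnv_errors file exists for any qat_4xxx device, which is itself worth reporting.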

Thanks,
Ronny G

Karl3
Beginner

Hi Ronny G, 

We have RHEL 8.8 and RHEL 8.9 installed and the QATzip version is v1.1.2 on both.

$ qzip -V
qzip v1.1.2
Copyright (C) 2021 Intel Corporation.

$

I have a case open with Red Hat. They have provided a copy of the RHEL 9.4.0 beta qatlib/qatlib-service/qatzip packages to install on RHEL8 to try out. qzip -V also returns v1.1.2 for this installation. These RHEL 9.4.0 beta packages also produce the same error. 

I have had a look at the QATzip instructions on GitHub. They look complex, and I don't have the expertise to work out what is applicable on a RHEL 8 system. I will ask Red Hat support to review them and see if they can provide any guidance. 

I checked the /sys/kernel/debug/qat_4xxx_0000* directories on the server and I don't see any cnv_errors files. Under that directory, I just see dev_cfg and config files.

Regarding the file that cannot be decompressed: is it possible for you to provide something that gathers diagnostics from the file, rather than us uploading the file? 
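
One kind of diagnostic that can be gathered without sharing the data is the gzip header itself. The sketch below parses only the first member header as defined in RFC 1952, including any extra subfields. Whether the gzipext output carries such a field, and which details Intel would actually need, is an assumption; the thread does not describe the format.

```python
import struct

FEXTRA = 0x04  # FLG bit 2 in RFC 1952: an extra field follows the fixed header


def read_gzip_header(path):
    """Parse the first gzip member header; return its fields and extra subfields."""
    with open(path, "rb") as handle:
        fixed = handle.read(10)
        if len(fixed) < 10 or fixed[:2] != b"\x1f\x8b":
            raise ValueError("not a gzip member header")
        # CM, FLG, MTIME (4 bytes, little endian), XFL, OS
        method, flags, mtime, xfl, os_id = struct.unpack("<BBIBB", fixed[2:])
        subfields = []
        if flags & FEXTRA:
            (xlen,) = struct.unpack("<H", handle.read(2))
            extra = handle.read(xlen)
            pos = 0
            # Each subfield is SI1, SI2, LEN (2 bytes), then LEN bytes of data.
            while pos + 4 <= len(extra):
                si1, si2, length = struct.unpack_from("<BBH", extra, pos)
                subfields.append((bytes([si1, si2]), extra[pos + 4:pos + 4 + length]))
                pos += 4 + length
    return {"method": method, "flags": flags, "mtime": mtime, "subfields": subfields}
```

Comparing this output for a file that fails and a file that decompresses correctly would show whether the two differ as early as the header.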

Regards, Karl

 

 

Ronny_G_Intel
Moderator

Hi Karl,


Thanks for the additional details. I understand that you don't want to try QATzip from GitHub.

Allow me a couple of days to do some more research and I will get back to you.


Thanks,

Ronny G


Karl3
Beginner

Hi Ronny G, 

 

Are you able to direct message me?

 

Regards, Karl

Karl3
Beginner

Hi Ronny G,

 

We can upload an example file that fails to decompress. The file will be 135 GB.

 

Please direct message me with secure SFTP credentials for this case so I can upload the file.

 

Regards, Karl

 

Ronny_G_Intel
Moderator

Hi Karl,

We can do email or web ticketing (similar to direct message).

I will send you a quick note to initiate the conversation.

 

Thanks,

Ronny 

 

Ronny_G_Intel
Moderator

This is great. Thank you.

Let me find the best way to receive your upload. I will get back to you soon.


Thanks,

Ronny G


Karl3
Beginner

Hi Ronny G, 

 

It has been a week now. Any news on these upload instructions?

 

Regards, Karl

Ronny_G_Intel
Moderator

Hi Karl,


I apologize for the delay.

I submitted an internal ticket for a temporary SFTP to be created for this purpose. The request is still not complete, but I am following up on it.


On the other hand, you mentioned that you have tested RHEL 8.8, RHEL 8.9, and also RHEL 9.4.0 beta. Are you testing on different hardware, or is it the same server platform?


In all cases you are using QATzip version v1.1.2. We still believe that testing the latest QATzip version, 1.2.0, from https://github.com/intel/QATzip/tree/QATzip_1.2.0_release may be very helpful in debugging this issue. Finally, you mentioned that this error happens consistently for the database.bak.5.gz file. Have you tried other files, and do you still get the error? I am trying to understand whether this error is data dependent, application related, or hardware related.


Thanks,

Ronny G 


Karl3
Beginner

Hi Ronny G,

 

We have two of these servers:

 

HPE DL 380 Gen11 server with two Intel Xeon Platinum 8462y processors.

https://www.intel.com/content/www/us/en/products/sku/232383/intel-xeon-platinum-8462y-processor-60m-cache-2-80-ghz/specifications.html

 

One is running RHEL 8.8 and one is running RHEL 8.9. The same issue has occurred on both servers.

 

We have 12 backup files, all with different data. To generate an error, I perform a parallel compress and then decompress of all 12 backup files. If we cycle this 20 times with the same files (240 compress and decompress operations), the decompress typically fails a couple of times. It is never consistently the same file that fails to decompress. The same files are being compressed and decompressed each time, so it doesn't appear to be data related to me. 
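
The cycle described here could be automated roughly as follows. This is a sketch under stated assumptions: the qzip flags are copied from the commands quoted earlier in this thread, qzip is assumed to exit non-zero when it prints "Process file error", and the helper names are illustrative. It does not clean up the .gz or restored files between rounds, because the thread does not say how qzip treats existing output files.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def compress_cmd(path):
    # Flags copied from the compress command quoted in this thread.
    return ["qzip", "-k", path, "-O", "gzipext", "-A", "deflate", "-L", "6"]


def decompress_cmd(gz_path):
    # Flags copied from the decompress command quoted in this thread.
    return ["qzip", "-d", "-k", gz_path, "-O", "gzipext", "-A", "deflate"]


def run_all(cmds, run=subprocess.run):
    """Run every command in parallel; return the commands that exited non-zero."""
    def one(cmd):
        return cmd, run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=len(cmds) or 1) as pool:
        return [cmd for cmd, rc in pool.map(one, cmds) if rc != 0]


def cycle(files, rounds=20, run=subprocess.run):
    """Compress all files in parallel, then decompress them, for `rounds` rounds."""
    failures = []
    for n in range(rounds):
        for cmd in run_all([compress_cmd(f) for f in files], run):
            failures.append((n, "compress", cmd))
        for cmd in run_all([decompress_cmd(f + ".gz") for f in files], run):
            failures.append((n, "decompress", cmd))
    return failures
```

Each entry in the returned list records the round, the phase, and the exact command that failed, which makes it easy to see that the failing file changes from round to round.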

 

Please direct message me when you have the SFTP details. The 135 GB file I upload will decompress to a 275 GB file via gunzip.

 

Regards, Karl

 

Ronny_G_Intel
Moderator

Hi Karl3,


Thank you for providing more details. I've just dispatched a brief email to you containing the SFTP instructions for uploading the file.


Regards,

Ronny G


Karl3
Beginner

Hi Ronny G,

 

Red Hat provided the latest upstream qat-service, qatzip (qzip 1.2) and qatlib from GitHub for me to test on RHEL 8.9. 

 

I tried to decompress one of the files that qzip 1.1.2 failed to decompress. I was able to decompress the file using the latest qzip 1.2. 

 

However, when I did the parallel compress and decompress with qzip 1.2, I got a similar issue. Two files failed to decompress with qzip 1.2, but I was able to decompress these files with gzip. 

 

The error message was similar. 

SW deComp fallback failure! compress fatal ERROR!
doProcessBuffer:Decompression failed with error: -2
Process file error: -2

 

It looks like the upstream versions of these packages have the same issue.

 

Regards, Karl

 

 

 

Ronny_G_Intel
Moderator

Hi Karl,


My apologies for the delay in responding to you. Coordinating with our IT department to establish an SFTP for external use took some time.


I sent you an email with the instructions for you to have access to our SFTP. Additionally, I have initiated a password reset request on your behalf. You should receive an email shortly with instructions on how to reset your password.


Once you have uploaded the files for debugging, kindly drop me a message to confirm.


Best regards,


Ronny_G_Intel
Moderator

Hi Karl,


I have the file that you uploaded to our SFTP and I am currently checking with an internal expert.

I hope this won't take long, and I will get back to you as soon as possible.


Regards,

Ronny G


Ronny_G_Intel
Moderator

Hi Karl,

 

We are looking at the data provided and have the following request:

  • Can you provide the first 192KB of the uncompressed data?
  • Can you provide the first 192KB or so of a working compressed file?

We can see unexpected behavior in the early parts of the file. These additional items will greatly assist in the debug effort.
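
The requested samples can be cut from the large files without loading them. This is a minimal sketch, assuming "192KB" means 192 × 1024 bytes; the function name and output file names are illustrative.

```python
def copy_prefix(src_path, dst_path, size=192 * 1024):
    """Copy the first `size` bytes of src_path into dst_path; return bytes written."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        data = src.read(size)
        dst.write(data)
        return len(data)
```

For example, copy_prefix("database.bak.5", "database.bak.5.first192k") would produce the uncompressed sample, and the same call on a .gz file that decompresses correctly would produce the compressed one.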

  • Can you verify the following statement that you made a while ago: "The database.bak.5.gz file can be decompressed by gzip (GNU zip) which results in the exact same file as the original"
  • Does this mean that you are able to decompress with gzip the compressed file (150+ GB) that you cannot decompress with qzip? This is a key question. If this is true, it means qatzip is not generating data that cannot be decompressed; rather, it is generating data that only our hardware is unable to decompress.
  • Can you try recreating the issue using QAT level 1 compression? There are several differences between level 1 and the higher levels, so knowing whether level 1 is also affected will give us a clearer picture. Here is more information about compression levels: https://github.com/intel/QATzip (refer to the Additional Information section).
  • Can you also confirm that CnV is enabled?

CnV is always enabled via the compression APIs (cpaDcCompressData(), cpaDcCompressData2(), cpaDcNsCompressData(), cpaDcDpEnqueueOp()).

 

Regards,

Ronny G

 
