Hi Intel Community,
We have an HPE DL380 Gen11 server with two Intel Xeon Platinum 8462Y processors.
We have been using the qzip (QAT zip) command from Intel QuickAssist Technology (QAT) to compress large backup files.
We compress 12 x 275 GB database backup files in parallel. All 12 compressions usually complete in about 24 minutes.
Occasionally (approximately 1 time in 20), the compress command creates a compressed file that cannot be decompressed by qzip. The compressed file appears to be a valid archive, because gzip (GNU zip) decompresses it to a file that is identical to the original.
The qzip (QAT zip) compress command is as follows:
qzip -k database.bak.5 -O gzipext -A deflate -L 6 > database.bak.5.qzip.comp.log 2> database.bak.5.qzip.comp.err
This creates a database.bak.5.gz file.
The qzip (QAT zip) decompress command is as follows:
qzip -d -k database.bak.5.gz -O gzipext -A deflate > database.bak.5.qzip.decomp.log 2> database.bak.5.qzip.decomp.err
This is the error generated by the decompress command:
doProcessBuffer:Decompression failed with error: -2
Process file error: -2
This error occurs consistently for this database.bak.5.gz file, and the decompression produces a truncated version of the original database.bak.5 file.
Note: We have successfully compressed and decompressed the exact same database.bak.5 file 19 other times.
Note: The database.bak.5.gz file can be decompressed by gzip (GNU zip), which produces exactly the same file as the original.
Note: We also have a RHEL 8.8 server where we see the same issue.
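The gzip cross-check mentioned in the notes above can be scripted roughly as follows. This is only a sketch: the function name and argument order are made up for illustration, and it assumes GNU gzip and cmp are installed. It streams the decompressed data straight into cmp, so no second 275 GB copy is written to disk.

```shell
# Stream-decompress a .gz file with GNU gzip and compare the result
# byte-for-byte with the original file.
# verify_gz_matches is an illustrative name, not part of any QAT package.
verify_gz_matches() {
    gz_file=$1      # e.g. database.bak.5.gz
    original=$2     # e.g. database.bak.5
    # The pipeline's exit status is cmp's: 0 only if both streams are identical.
    # If gzip stops early, cmp sees a short stream and reports a mismatch.
    gzip -dc "$gz_file" | cmp -s - "$original"
}
```

For example, verify_gz_matches database.bak.5.gz database.bak.5 && echo identical. Because only cmp's status is checked, a gzip warning issued after all data has been written would go unnoticed; running gzip -t on the file separately covers that case.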
The processor firmware is Server Platform Services (SPS) Firmware 6.0.4.75.0.
The RHEL kernel is 4.18.0-513.18.1.el8_9.x86_64.
The Quick Assist packages are:
# rpm -qa | grep linux-firmware
linux-firmware-20230824-120.git0e048b06.el8_9.noarch
# rpm -qa | grep qat | sort
qatlib-23.02.0-2.el8_8.x86_64
qatlib-service-23.02.0-2.el8_8.x86_64
qatzip-1.1.2-1.el8_8.x86_64
#
What does this error mean? Is this a known issue? Do you have any advice on resolving it?
Any advice would be greatly appreciated.
Regards, Karl
Hi Karl,
I am currently looking into this issue, but I will need some extra time to research this error from the decompress command:
doProcessBuffer:Decompression failed with error: -2
Process file error: -2
I checked our internal records and found no previous reports, and the fact that it happens approximately 1 time in 20 attempts makes it very difficult to replicate and troubleshoot.
I assume that you are using QATzip (https://github.com/intel/QATzip) and that you followed the installation instructions and any applicable recommendations. Please confirm, and provide any other pertinent details.
I will get back to you as soon as possible.
Regards,
Ronny G
Hi Ronny,
Thanks for responding to my post.
I used the qzip executable as provided through the Red Hat repositories.
I just installed and executed the qzip command like any other executable provided by Red Hat.
# yum install linux-firmware
# yum install qatlib qatzip
We have some compressed files that consistently fail with the error; however, the smallest of them is 143 GB. I tried to reproduce the issue with smaller files but could not.
Regards, Karl
Hi Karl3,
Can you please check which version of QATzip you are using?
I know it may be difficult to test, but can you test separately with the latest QATzip from GitHub? Here is the URL: https://github.com/intel/QATzip
On the other hand, would you be willing to provide us with the file that cannot be decompressed? With Compress and Verify (CnV), this should not be possible, because each block is decompressed after it is compressed to ensure that the data can be decompressed. It would also be interesting to see whether any CnV errors are being logged. We can check this with a command similar to: cat /sys/kernel/debug/qat_4xxx_0000\:09\:00.0/cnv_errors (update the BDF to match the QAT endpoint on your system).
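If the BDF is not known in advance, a small loop can look for the counter under every QAT debugfs directory. This is a rough sketch: the function name is made up, the base directory is a parameter only so the function can be tried against a test tree, and on a real system it assumes root access with debugfs mounted at /sys/kernel/debug.

```shell
# Print the cnv_errors file of every QAT device found under debugfs.
# show_cnv_errors is an illustrative name; the optional argument overrides
# the debugfs base directory (default: /sys/kernel/debug).
show_cnv_errors() {
    base=${1:-/sys/kernel/debug}
    found=0
    for f in "$base"/qat_*/cnv_errors; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        found=1
        echo "== $f"
        cat "$f"
    done
    if [ "$found" -eq 0 ]; then
        echo "no cnv_errors file found under $base/qat_*" >&2
        return 1
    fi
}
```

Running show_cnv_errors with no argument checks all endpoints at once. A non-zero return means that no device directory exposes a cnv_errors file.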
Thanks,
Ronny G
Hi Ronny G,
We have RHEL 8.8 and RHEL 8.9 installed and the QATzip version is v1.1.2 on both.
$ qzip -V
qzip v1.1.2
Copyright (C) 2021 Intel Corporation.
$
I have a case open with Red Hat. They have provided a copy of the RHEL 9.4.0 beta qatlib/qatlib-service/qatzip packages to install on RHEL8 to try out. qzip -V also returns v1.1.2 for this installation. These RHEL 9.4.0 beta packages also produce the same error.
I have had a look at the QATzip instructions on GitHub. They look complex, and I don't have the expertise to work out what is applicable on a RHEL 8 system. I will ask Red Hat support to review them and see if they can provide any guidance.
I checked the /sys/kernel/debug/qat_4xxx_0000* directories on the server and I don't see any cnv_errors files. Under those directories, I just see dev_cfg and config files.
Regarding the file that cannot be decompressed: is it possible for you to provide something that gathers diagnostics from the file, rather than us uploading the file?
Regards, Karl
Hi Karl,
Thanks for the additional details. I understand that you don't want to try QATzip from GitHub.
Allow me a couple of days to do some more research and I will get back to you.
Thanks,
Ronny G
Hi Ronny G,
Are you able to direct message me?
Regards, Karl
Hi Ronny G,
We can upload an example file that fails to decompress. The file will be 135 GB.
Please direct message me with secure SFTP credentials for this case so I can upload the file.
Regards, Karl
Hi Karl,
We can do email or web ticketing (similar to direct message).
I will send you a quick note to initiate the conversation.
Thanks,
Ronny
This is great. Thank you.
Let me find the best way to receive your upload. I will get back to you soon.
Thanks,
Ronny G
Hi Ronny G,
It has been a week now. Any news on these upload instructions?
Regards, Karl
Hi Karl,
I apologize for the delay.
I submitted an internal ticket for a temporary SFTP site to be created for this purpose. The request is still not complete, but I am following up on it.
On the other hand, you mentioned that you have tested RHEL 8.8, RHEL 8.9, and the RHEL 9.4.0 beta packages. Are you testing on different hardware, or is it the same server platform?
In all cases you are using QATzip v1.1.2. We still believe that testing the latest QATzip version 1.2.0 from https://github.com/intel/QATzip/tree/QATzip_1.2.0_release may be very helpful in debugging this issue. Finally, you mentioned that this error happens consistently for the database.bak.5.gz file. Have you tried other files, and do you still get the error? I am trying to understand whether this error is data dependent, application related, or hardware related.
Thanks,
Ronny G
Hi Ronny G,
We have two of these servers:
HPE DL380 Gen11 server with two Intel Xeon Platinum 8462Y processors.
One is running RHEL 8.8 and one is running RHEL 8.9. The same issue has occurred on both servers.
We have 12 backup files all with different data. To generate an error I perform a parallel compress and then decompress of all 12 backup files. If we cycle this 20 times with the same files (240 compress and decompress operations) the decompress typically fails a couple of times. It is never consistently the same file which fails to decompress. The same files are being compressed and decompressed each time so it doesn't appear to be data related to me.
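The cycle described above could be automated along these lines. This is a sketch under several assumptions: the QZIP variable, the scratch-directory layout, and the function names are invented for illustration; the qzip options are the ones from the first post; and it assumes that qzip -d -k writes the decompressed file next to the .gz file in the current directory.

```shell
#!/bin/sh
# Reproduction sketch: compress, decompress and compare files in parallel.
# QZIP, the scratch directory and the function names are assumptions.
QZIP=${QZIP:-qzip}

# Round-trip one file. Prints a FAIL line and returns non-zero on error.
roundtrip_one() {
    src=$1          # absolute path to the original file
    scratch=$2      # empty, existing directory for this cycle's output
    name=$(basename "$src")
    if ! "$QZIP" -k "$src" -O gzipext -A deflate -L 6 >/dev/null 2>&1; then
        echo "FAIL compress $src"; return 1
    fi
    mv "$src.gz" "$scratch/$name.gz" || return 1
    # Decompress inside the scratch directory so the original is untouched.
    if ! (cd "$scratch" && "$QZIP" -d -k "$name.gz" -O gzipext -A deflate) \
            >/dev/null 2>"$scratch/$name.err"; then
        echo "FAIL decompress $src"; return 1
    fi
    if ! cmp -s "$src" "$scratch/$name"; then
        echo "FAIL compare $src"; return 1
    fi
}

# Run one cycle over all given files in parallel.
# Returns non-zero if any file failed its round trip.
run_cycle() {
    scratch=$1; shift
    pids=""
    for f in "$@"; do
        roundtrip_one "$f" "$scratch" &
        pids="$pids $!"
    done
    rc=0
    for p in $pids; do
        wait "$p" || rc=1
    done
    return "$rc"
}
```

Calling run_cycle 20 times, each time with a fresh scratch directory and the 12 backup files as arguments, corresponds to the 240 compress and decompress operations. A failing .gz file and its .err log stay in that cycle's scratch directory for later inspection.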
Please direct message me when you have the SFTP details. The 135 GB file I upload will decompress to a 275 GB file via gunzip.
Regards, Karl
Hi Karl3,
Thank you for providing more details. I've just dispatched a brief email to you containing the SFTP instructions for uploading the file.
Regards,
Ronny G
Hi Ronny G,
Red Hat provided the latest upstream qat-service, qatzip (qzip 1.2) and qatlib from GitHub for me to test on RHEL 8.9.
I tried to decompress one of the files that qzip 1.1.2 failed to decompress, and I was able to decompress it using the latest qzip 1.2.
However, when I ran the parallel compress and decompress with qzip 1.2, I got a similar issue. Two files failed to decompress with qzip 1.2, but I was able to decompress these files with gzip.
The error message was similar.
SW deComp fallback failure! compress fatal ERROR!
doProcessBuffer:Decompression failed with error: -2
Process file error: -2
It looks like the upstream versions of these packages have the same issue.
Regards, Karl
Hi Karl,
My apologies for the delay in responding to you. Coordinating with our IT department to establish an SFTP for external use took some time.
I sent you an email with the instructions for you to have access to our SFTP. Additionally, I have initiated a password reset request on your behalf. You should receive an email shortly with instructions on how to reset your password.
Once you have uploaded the files for debugging, kindly drop me a message to confirm.
Best regards,
Hi Karl,
I have the file that you uploaded to our SFTP site and I am currently checking it with an internal expert.
I hope this won't take long, and I will get back to you as soon as possible.
Regards,
Ronny G
Hi Karl,
We are looking at the data provided and have the following request:
- Can you provide the first 192KB of uncompressed data?
- Can you provide the first 192KB or so of a working compressed file?
We can see an unexpected behavior with early parts of the file. These additional items will greatly assist in the debug effort.
- Can you verify the following statement that you made a while ago: "The database.bak.5.gz file can be decompressed by gzip (GNU zip) which results in the exact same file as the original"
- Does this mean that you are able to decompress with gzip the compressed file (150+ GB) that you cannot decompress with qzip? This is the key question. If so, qatzip is not generating data that cannot be decompressed; rather, it is generating data that only our hardware is unable to decompress.
- Can you try to recreate the issue using QAT level 1 compression? There are several differences between level 1 and the higher levels, so understanding whether level 1 is also affected will give us a clearer picture. More information about compression levels is available at https://github.com/intel/QATzip (refer to the Additional Information section).
- Can you confirm also that CnV is enabled?
CnV is always enabled via the compression APIs (cpaDcCompressData(), cpaDcCompressData2(), cpaDcNsCompressData(), cpaDcDpEnqueueOp()).
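For the first two requests above, the leading part of a file can be cut out with dd. This is a minimal sketch: the function name is made up, and it assumes that 192KB means 192 x 1024 bytes.

```shell
# Copy the first 192 KiB (192 * 1024 = 196608 bytes) of a file to a new file.
# first_192k is an illustrative name; a shorter input is copied whole.
first_192k() {
    dd if="$1" of="$2" bs=1024 count=192 2>/dev/null
}
```

For example, first_192k database.bak.5 database.bak.5.head for the uncompressed data, and the same call on a .gz file that decompresses correctly; the output file names here are placeholders. For the level 1 retest, the compress command from the first post would presumably change only -L 6 to -L 1.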
Regards,
Ronny G
Hi Karl,
I believe we have identified the root cause of the issue as a software decompression bug within the qzip utility. The problem arises when QZ_DATA_ERROR is returned after several successful executions of the doProcessBuffer() function, which leads to an incorrect bytes_processed count. Consequently, the data passed to the qzDecompress call does not start with a valid gzip header, and the call fails with the reported error. It's important to note that this bug is not related to hardware compression/decompression and does not pose a risk of user data loss.
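To illustrate what "no valid gzip header" means here (this is not the qzip source, only a sketch with a made-up function name): per RFC 1952, a gzip member starts with the bytes 1f 8b, followed by 08 for the deflate method. A decoder that resumes at a wrong offset lands in the middle of compressed data and does not find those bytes.

```shell
# Succeed if the three bytes at OFFSET in FILE are 1f 8b 08, i.e. the gzip
# magic number followed by the deflate method byte (RFC 1952).
# has_gzip_header is an illustrative name; OFFSET defaults to 0.
has_gzip_header() {
    file=$1
    offset=${2:-0}
    magic=$(dd if="$file" bs=1 skip="$offset" count=3 2>/dev/null \
        | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b08" ]
}
```

For example, has_gzip_header database.bak.5.gz succeeds at offset 0 of any gzip file, while the same check at an offset that does not fall on a member boundary would normally fail, which is the situation the incorrect bytes_processed count creates.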
Efforts are underway to prepare a new software release that will resolve this issue. The release is tentatively scheduled for Q4 this year, although this timeline may be subject to adjustments.
Please let me know if you are okay with this and if this case can be closed at this point.
Regards,
Ronny G