We are focusing on the decompression functionality provided by the bzip2 library. After integrating the IPP features into the original bzip2, we saw a 20% performance boost on OSX and 30% on Linux in our decompression benchmarks. Great job, guys!
However, we have found several issues with the current bzip2 IPP implementation. Please find a brief description below.
Below I refer to the bzip2 optimization patches with IPP functions provided in the following locations in the installed IPP library:
- compilers_and_libraries_2019.0.117/mac/ipp/examples/components_and_examples_osx_ps.tgz:components/interfaces/ipp_bzip2/
- compilers_and_libraries_2019.1.144/linux/ipp/examples/components_and_examples_lin_iss.tgz:components/interfaces/ipp_bzip2/
The problems:
- Patched makefiles are incorrect
On Linux, the makefiles Makefile and Makefile-libbz2_so, once patched with the provided bzip2-1.0.6.linux.patch, reset the IPP_LDFLAGS and CFLAGS variables, although the provided readme.html instructs you to set these to point to the Intel IPP library locations before building. Even with both variables set correctly, building bzip2 with these makefiles produces a result with no IPP features included. To fix this, change the patched makefiles as follows:
#IPP_LDFLAGS=
CFLAGS+=-Wall -Winline -O2 -g $(BIGFILES)
After doing this, everything works as expected.
The same applies to the Makefile on OSX.
- Inconsistency with the behaviour of the original BZ2_bzDecompress()
BZ2_bzDecompress() in the patched library returns BZ_DATA_ERROR if the provided output buffer is smaller than the plaintext that has been decompressed so far in the current run. The original behaviour is to fill the output buffer with as much plaintext as fits and return BZ_OK in this case. Note also that BZ_DATA_ERROR is meant to indicate data integrity errors detected in the compressed stream. To fix this, uncomment the line
if ( 0 == strm->avail_out ) return BZ_OK;
in function copy_from_int_d_buf() in the patched bzlib.c.
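The expected behaviour described above can be illustrated with a small self-contained model. This is only a sketch: the types `demo_stream` and `demo_pending`, the function `demo_copy_out()` and the `DEMO_*` codes are hypothetical names invented for the illustration, not the code of copy_from_int_d_buf() or the real bz_stream. It shows the contract the original library follows: copy as much plaintext as fits, and treat a full output buffer as a normal condition rather than an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch (not the real BZ_* values). */
enum { DEMO_OK = 0, DEMO_STREAM_END = 4 };

/* Hypothetical, simplified stand-in for the output side of a stream. */
typedef struct {
    char        *next_out;   /* where the next output byte goes */
    unsigned int avail_out;  /* free space left in the caller's buffer */
} demo_stream;

/* Hypothetical internal buffer holding plaintext that is already decoded. */
typedef struct {
    const char *data;
    size_t      len;   /* total decoded bytes */
    size_t      pos;   /* bytes already handed to the caller */
} demo_pending;

/* Copy as much pending plaintext as fits into the caller's buffer.
   A full output buffer is not an error: return DEMO_OK so that the
   caller can drain the buffer and call again. */
static int demo_copy_out(demo_stream *strm, demo_pending *p)
{
    assert(p->pos <= p->len);

    size_t left = p->len - p->pos;
    size_t n = left < strm->avail_out ? left : strm->avail_out;

    memcpy(strm->next_out, p->data + p->pos, n);
    strm->next_out  += n;
    strm->avail_out -= (unsigned int)n;
    p->pos          += n;

    return (p->pos == p->len) ? DEMO_STREAM_END : DEMO_OK;
}
```

A caller would reset `next_out` and `avail_out` after consuming each filled buffer and keep calling until the end-of-stream code is returned, which is the same loop shape commonly used with BZ2_bzDecompress().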
- Crash due to "pointer being freed was not allocated" in BZ2_bzDecompressEnd()
With the original bzip2 library, calling BZ2_bzDecompressEnd() on an object initialized by BZ2_bzDecompressInit(), without any decompression happening in between, is perfectly legal, as the former is meant to clean up all resources allocated by the latter. Doing so with the patched library leads to a crash. The problem lies in the patched BZ2_bzDecompressEnd(), which attempts to
BZFREE( ((int_d_state*)strm->state)->mt_states );
where mt_states is in fact non-NULL, yet in this case it has been allocated by neither BZ2_bzDecompressInit() nor BZ2_bzDecompress(). To fix this, explicitly set
s->mt_states = NULL;
in BZ2_bzDecompressInit(), and change the resource-freeing logic in BZ2_bzDecompressEnd() to the following:
int BZ_API(BZ2_bzDecompressEnd) ( bz_stream *strm )
{
#if defined(WITH_IPP)
   if ( NULL == strm ) return BZ_PARAM_ERROR;
   if ( NULL == strm->state ) return BZ_PARAM_ERROR;
   if ( NULL != ((int_d_state*)strm->state)->mt_states )
      BZFREE( ((int_d_state*)strm->state)->mt_states );
   BZFREE( strm->state );
   strm->state = NULL;
#else
...
Note that the excerpt above, unlike the IPP-patched version, does not return BZ_PARAM_ERROR when mt_states is NULL, because mt_states can legitimately be NULL if, for any reason, no real decompression has happened between the BZ2_bzDecompressInit() and BZ2_bzDecompressEnd() calls. One such reason is the trivial error case where the input buffer contains data that is not bzip2-formatted, and BZ2_bzDecompress() rightly fails to process it and returns BZ_DATA_ERROR.
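The init/end pairing described in this item can be sketched in isolation. The following is a minimal model under stated assumptions: `model_stream`, `model_state`, the `model_*` functions and the `MODEL_*` codes are hypothetical names made up for this example, and plain malloc()/free() stand in for the library's allocator macros. It only demonstrates the pattern of initializing a lazily allocated pointer to NULL and guarding its release, so that end-after-init is safe with or without decoding in between.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch (not the real BZ_* values). */
enum { MODEL_OK = 0, MODEL_PARAM_ERROR = -2, MODEL_MEM_ERROR = -3 };

/* Hypothetical decoder state: mt_states is allocated lazily, only once
   decoding actually starts. */
typedef struct {
    void *mt_states;
} model_state;

typedef struct {
    model_state *state;
} model_stream;

static int model_init(model_stream *strm)
{
    if (strm == NULL) return MODEL_PARAM_ERROR;

    model_state *s = malloc(sizeof *s);
    if (s == NULL) return MODEL_MEM_ERROR;

    s->mt_states = NULL;   /* explicit initialization: nothing allocated yet */
    strm->state = s;
    return MODEL_OK;
}

/* Stands in for the first real decompression call, which allocates
   the per-thread state. */
static int model_start_decoding(model_stream *strm, size_t n)
{
    if (strm == NULL || strm->state == NULL || n == 0) return MODEL_PARAM_ERROR;
    assert(strm->state->mt_states == NULL);   /* allocate at most once */

    strm->state->mt_states = malloc(n);
    return strm->state->mt_states != NULL ? MODEL_OK : MODEL_MEM_ERROR;
}

static int model_end(model_stream *strm)
{
    if (strm == NULL) return MODEL_PARAM_ERROR;
    if (strm->state == NULL) return MODEL_PARAM_ERROR;

    /* A NULL mt_states is not an error: decoding may never have started. */
    if (strm->state->mt_states != NULL)
        free(strm->state->mt_states);

    free(strm->state);
    strm->state = NULL;
    return MODEL_OK;
}
```

With this shape, calling `model_end()` directly after `model_init()` releases only what was allocated, and a second `model_end()` on the same stream reports a parameter error instead of freeing twice.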
Thanks, Mihhail. We will check and fix these issues in one of the next updates/releases and will keep you updated on this matter.
Thank you!
Meanwhile, we have found that the patched bzip2 fails to decompress some files produced with the lbzip2 utility. I have attached two .bz2 files produced by lbzip2-compressing a machine-generated Apache log cut to different lengths, as follows:
head -10342 apache.log | /usr/local/bin/lbzip2 -c > apache_ipp_ok.lbzip2_.bz2
head -10343 apache.log | /usr/local/bin/lbzip2 -c > apache_ipp_err.lbzip2_.bz2
Now, on OSX:
$ ./bzip2-ipp -k -c -d apache_ipp_ok.lbzip2_.bz2 | wc -l
10342
$ ./bzip2-ipp -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
bzip2-ipp: PANIC -- internal consistency error:
decompress:unexpected error
This is a BUG. Please report it to me at:
jseward@bzip.org
Input file = apache_ipp_err.lbzip2_.bz2, output file = (stdout)
0
$ /usr/bin/bzip2 -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
10343
and on Linux:
$ ./bzip2-ipp -k -c -d apache_ipp_ok.lbzip2_.bz2 | wc -l
10342
$ ./bzip2-ipp -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
bzip2-ipp: Caught a SIGSEGV or SIGBUS whilst decompressing.
....
0
$ /bin/bzip2 -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
10343
As far as we were able to trace, when decompressing apache_ipp_err.lbzip2_.bz2, the function ippsDecodeBlock_BZ2_16u8u() returns -2 ("Unknown/unspecified error") inside decode_block(), which in turn returns 'false' to the calling function BZ2_bzDecompress().
Thank you, we will investigate the problems in the near future.
In the meantime, may I ask why you chose the bzip2 algorithm? I am simply interested in understanding why people choose one algorithm over another.
Pavel
Hi Pavel!
Glad to hear that things are moving fast, since most of our clients are also Intel clients. We have just tested the IPP patch for zlib; the integration went very smoothly, and we observed an increase in decompression performance of nearly 30% on OSX and 40% on Linux with Intel CPUs. This is awesome! So we look forward to the bzip2 IPP patch becoming as usable as the zlib one :)
Regarding bzip2 as an algorithm, it is a given for us, as many of our clients at SpectX use it. Although bzip2 is far more CPU-intensive than other widespread compression algorithms, it provides a very high compression ratio and, most valuable in our case, it allows for parallel processing. Our raw log analytics product allows users to skip the import/ingestion step and query their data in a distributed manner, in its original location and format. The product's processing performance is said to be pretty insane (on average, 350 MB/sec per CPU core for uncompressed data). That is, the overall throughput depends on how many cores users are willing to allocate to it. This is why your work at Intel on speeding up decompression is critical for us.
Have a good weekend,
Mihhail
Hi Mihhail,
Thank you for your findings! We have fixed the problem with the failing decompression of your bz2 file, and we will correct the patch file and the readme document according to your suggestions 1 to 3.
The fixes will be available in the next IPP update, planned for the first half of next year.
Regards,
Sergey
Hello!
I checked today, and Update 2 (versions 2019.2.184 for OSX and 2019.2.187 for Linux) is available for download.
Does it contain the fixes yet?
Its release notes again say only that the bzip2 patch has been added and do not mention any fixes.
Best regards,
Mihhail
Hi Mihhail,
The 2019u2 release was a technical update to fix problems in the installer programs. The new bzip2 patch file and release notes will appear in 2019 Update 3.
Regards,
Sergey
Hi Mihhail,
I'm working on optimizing bzip2 performance with the Intel IPP libraries on Android. I have finished porting and building the patch, and I want to characterize the performance of bzip2 with and without the IPP optimizations. Could you please share which decompression benchmarks you used?
Also, bzip2 -v <filename> prints the compression speed.
Is there a similar option for decompression, such as bzip2 -d -v? How did you measure decompression time with bzip2? Any help would be highly appreciated.
Thanks in advance!
Hi,
I am afraid I can help you with neither the bzip2 tool nor the benchmarks. As a matter of fact, we do not use the bzip2 utility as such; we use the IPP-patched bzip2 library in our products. The benchmarks we run are our own proprietary ones.
Best regards,
Mihhail
According to the readme ($IPPROOT/components/components/interfaces/ipp_bzip2/readme.html), to build the 32-bit version of libbz2.so we should run:
export CFLAGS="-m32 -DWITH_IPP -I$IPPROOT/include"
export IPP_LDFLAGS="$IPPROOT/lib/ia32/libippdc.a $IPPROOT/lib/ia32/libipps.a $IPPROOT/lib/ia32/libippcore.a"
make -f Makefile-libbz2_so
So, to build the 64-bit shared library, I used the following commands:
source $IPPROOT/bin/ippvars.sh intel64
patch -p1 < $IPPROOT/components/components/interfaces/ipp_bzip2/bzip2-1.0.8.patch.bin
export CFLAGS="-m64 -DWITH_IPP -I$IPPROOT/include"
export IPP_LDFLAGS="$IPPROOT/lib/intel64/libippdc.a $IPPROOT/lib/intel64/libipps.a $IPPROOT/lib/intel64/libippcore.a"
make -f Makefile-libbz2_so
The above commands built libbz2.so successfully, but without IPP.
Thanks to Mihhail's post, I managed to build libbz2.so with IPP.