Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Intel® IPP 2019 Update 1 - bzip2 optimization patch has problems

Mihhail
Novice

We are focusing on the decompression functionality provided by the bzip2 library. After integrating the IPP features into the original bzip2, we saw a 20% performance boost on OSX and 30% on Linux in our decompression benchmarks. Great job, guys!

However, we have found several issues with the current bzip2 IPP implementation. Please find a brief description below.


Below, I refer to the bzip2 optimization patches with IPP functions provided in the following locations in the installed IPP library:

  • compilers_and_libraries_2019.0.117/mac/ipp/examples/components_and_examples_osx_ps.tgz:components/interfaces/ipp_bzip2/
  • compilers_and_libraries_2019.1.144/linux/ipp/examples/components_and_examples_lin_iss.tgz:components/interfaces/ipp_bzip2/

The problems:

  1. Patched makefiles are incorrect
    On Linux, the makefiles Makefile and Makefile-libbz2_so, once patched with the provided bzip2-1.0.6.linux.patch, reset the IPP_LDFLAGS and CFLAGS variables, although the provided readme.html instructs you to set them to point to the Intel IPP library locations before building. Because an assignment inside a makefile overrides a value inherited from the environment, building bzip2 with these makefiles produces a result with no IPP features included, even with both variables set correctly. To fix this, change the patched makefiles as follows:
    #IPP_LDFLAGS=
    CFLAGS+=-Wall -Winline -O2 -g $(BIGFILES)
    

    After this change, everything works as expected.
    The same applies to the Makefile on OSX.

  2. Inconsistency with the behaviour of the original BZ2_bzDecompress()
    BZ2_bzDecompress() in the patched library returns BZ_DATA_ERROR if the provided output buffer is smaller than the plaintext decompressed so far by the current call. The original behaviour in this case is to fill the output buffer with as much plaintext as fits and return BZ_OK. Note also that BZ_DATA_ERROR is meant to indicate data integrity errors detected in the compressed stream. To fix this, one needs to uncomment the line

    if ( 0 == strm->avail_out ) return BZ_OK;

    in the function copy_from_int_d_buf() in the patched bzlib.c.

  3. Crash with "pointer being freed was not allocated" in BZ2_bzDecompressEnd()
    With the original bzip2 library, calling BZ2_bzDecompressEnd() on an object initialized by BZ2_bzDecompressInit(), without any decompression happening in between, is perfectly legal, as the former is meant to clean up all resources allocated by the latter. Doing so with the patched library leads to a crash. The problem lies in the patched BZ2_bzDecompressEnd(), which attempts to

    BZFREE( ((int_d_state*)strm->state)->mt_states );
    

    where mt_states is in fact non-NULL, even though in this case it has been allocated neither by BZ2_bzDecompressInit() nor by BZ2_bzDecompress(). To fix this, one needs to explicitly set

    s->mt_states    =  NULL;
    

    in BZ2_bzDecompressInit() and change the resource-freeing logic in BZ2_bzDecompressEnd() to the following:

    int BZ_API(BZ2_bzDecompressEnd)  ( bz_stream *strm )
    {
    #if defined(WITH_IPP)
        if ( NULL == strm )                                   return BZ_PARAM_ERROR;
        if ( NULL == strm->state )                            return BZ_PARAM_ERROR;
        if ( NULL != ((int_d_state*)strm->state)->mt_states ) 
            BZFREE( ((int_d_state*)strm->state)->mt_states );
        BZFREE( strm->state );
        strm->state = NULL;
    #else
    ...
    
    

    Note that the excerpt above, unlike the version patched with IPP, does not return BZ_PARAM_ERROR if mt_states is NULL, since it can legitimately be NULL when no real decompression has happened between the BZ2_bzDecompressInit() and BZ2_bzDecompressEnd() calls. One such case is the trivial error case in which the input buffer contains data that is not in bzip2 format, and BZ2_bzDecompress() rightly fails to process it and returns BZ_DATA_ERROR.
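
The behaviour expected in items 2 and 3 can be sketched with a small self-contained model. This is only an illustration under stated assumptions: demo_stream, demo_state, demo_init(), demo_drain() and demo_end() are hypothetical stand-ins and not the definitions from the patched bzlib.c, and the DEMO_* codes merely mirror the values of BZ_OK, BZ_PARAM_ERROR and BZ_MEM_ERROR.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model; none of these names come from the patched bzlib.c. */
enum { DEMO_OK = 0, DEMO_PARAM_ERROR = -2, DEMO_MEM_ERROR = -3 };

typedef struct {
    void       *mt_states;    /* optional buffer, only allocated by real work */
    const char *pending;      /* decoded bytes not yet handed to the caller */
    size_t      pending_len;
} demo_state;

typedef struct {
    char       *next_out;     /* where the next output byte goes */
    size_t      avail_out;    /* free space left in the output buffer */
    demo_state *state;
} demo_stream;

/* Item 3: every pointer that the cleanup may free starts out as NULL. */
int demo_init(demo_stream *strm)
{
    if (strm == NULL) return DEMO_PARAM_ERROR;
    demo_state *s = malloc(sizeof *s);
    if (s == NULL) return DEMO_MEM_ERROR;
    s->mt_states   = NULL;
    s->pending     = NULL;
    s->pending_len = 0;
    strm->state = s;
    return DEMO_OK;
}

/* Item 2: hand over as much as fits; a full output buffer is not an error. */
int demo_drain(demo_stream *strm)
{
    assert(strm != NULL && strm->state != NULL);
    demo_state *s = strm->state;
    size_t n = s->pending_len < strm->avail_out ? s->pending_len
                                                : strm->avail_out;
    if (n > 0) {
        memcpy(strm->next_out, s->pending, n);
        strm->next_out  += n;
        strm->avail_out -= n;
        s->pending      += n;
        s->pending_len  -= n;
    }
    return DEMO_OK;  /* caller supplies more output space and calls again */
}

/* Item 3: a NULL mt_states is a valid state, not a parameter error. */
int demo_end(demo_stream *strm)
{
    if (strm == NULL || strm->state == NULL) return DEMO_PARAM_ERROR;
    if (strm->state->mt_states != NULL)
        free(strm->state->mt_states);
    free(strm->state);
    strm->state = NULL;
    return DEMO_OK;
}
```

In this model, demo_init() followed directly by demo_end() returns DEMO_OK, and draining 9 pending bytes into a 4-byte output buffer returns DEMO_OK with 5 bytes left for the next call.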

10 Replies
Gennady_F_Intel
Moderator

Thanks, Mihhail. We will check these issues, fix them in one of the next updates/releases, and keep you updated on this matter.

Mihhail
Novice

Thank you!
Meanwhile, we have found that the patched bzip2 fails to decompress some files produced with the lbzip2 utility. I have attached two .bz2 files produced by compressing a machine-generated Apache log of different lengths with lbzip2, as follows:

head -10342 apache.log | /usr/local/bin/lbzip2 -c > apache_ipp_ok.lbzip2_.bz2 
head -10343 apache.log | /usr/local/bin/lbzip2 -c > apache_ipp_err.lbzip2_.bz2

Now, on OSX:

$ ./bzip2-ipp -k -c -d apache_ipp_ok.lbzip2_.bz2 | wc -l
   10342
$ ./bzip2-ipp -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l

bzip2-ipp: PANIC -- internal consistency error:
	decompress:unexpected error
	This is a BUG.  Please report it to me at:
	jseward@bzip.org
	Input file = apache_ipp_err.lbzip2_.bz2, output file = (stdout)
   0
$ /usr/bin/bzip2 -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
   10343

and on Linux:

$ ./bzip2-ipp -k -c -d apache_ipp_ok.lbzip2_.bz2 | wc -l
10342
$ ./bzip2-ipp -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l

bzip2-ipp: Caught a SIGSEGV or SIGBUS whilst decompressing.
  ....
0
$ /bin/bzip2 -k -c -d apache_ipp_err.lbzip2_.bz2 | wc -l
10343

As far as we were able to trace, when decompressing apache_ipp_err.lbzip2_.bz2, the function ippsDecodeBlock_BZ2_16u8u() returns -2 ("Unknown/unspecified error") inside decode_block(), which in turn returns 'false' to its caller, BZ2_bzDecompress().

Pavel_B_Intel1
Employee

Thank you, we will investigate the problems shortly.

In the meantime, may I ask why you chose the bzip2 algorithm? I am simply interested in understanding why people choose one algorithm over another.

Pavel

Mihhail
Novice

Hi Pavel!

Glad to hear that things are moving fast, since most of our clients are also Intel clients. We have just tested the IPP patch for zlib; the integration went very smoothly, and we observed an increase in decompression performance of nearly 30% on OSX and 40% on Linux with Intel CPUs. This is awesome! So we look forward to the bzip2 IPP patch becoming as usable as the zlib one :)

As for bzip2 as an algorithm, it is a given for us, as many of our clients at SpectX use it. Although bzip2 is far more CPU-intensive than other widespread compression algorithms, it provides a very high compression ratio and, most valuable in our case, it allows for parallel processing. Our raw log analytics product lets users skip the import/ingestion step and query their data in a distributed manner in its original location and format. The product's processing performance is said to be pretty insane (on average, 350 MB/sec per CPU core for uncompressed data). That is, the overall throughput depends on how many cores they are willing to allocate to it. This is why your work at Intel on speeding up decompression is critical for us.

Bon weekend,
Mihhail

Sergey_K_Intel
Employee

Hi Mihhail,

Thank you for your findings! We have fixed the problem with the failing decompression of your bz2 file, and we will correct the patch file and the readme document according to your suggestions 1 to 3.

The fixes will be available in the next IPP update, planned for the first half of next year.

Regards,
Sergey

Mihhail
Novice

Hello!

I checked today, and Update 2 (versions 2019.2.184 for OSX and 2019.2.187 for Linux) is available for download.
Does it contain the fixes yet?

Its release notes again say only that the bzip2 patch has been added and do not mention any fixes.

Best regards,
Mihhail

Sergey_K_Intel
Employee

Hi Mihhail,

The 2019u2 release was a technical update to fix problems in the installer programs. The new bzip2 patch file and release notes will appear in 2019 Update 3.

Regards,
Sergey

ShaliniSal_B_Intel

Hi Mihhail,

I'm working on optimizing bzip2 performance with the Intel IPP libraries on Android. I have finished porting the patch and building it. I want to characterize the performance of bzip2 with and without the IPP optimizations. Can you please share which decompression benchmarks you used?

Also, bzip2 -v <filename> prints the compression speed. 

Is there a similar option for decompression, such as bzip2 -d -v? How did you measure decompression time with bzip2? Any help would be highly appreciated.

Thanks In Advance!!

Mihhail
Novice

Hi,

I am afraid I cannot help you with either the bzip2 tool or the benchmarks. As a matter of fact, we do not use the bzip2 utility as such; we use the IPP-patched bzip2 library in our products. The benchmarks we run are proprietary.

Best regards,
Mihhail

Valentyn
Novice

According to the readme ($IPPROOT/components/components/interfaces/ipp_bzip2/readme.html), to build the 32-bit version of libbz2.so we should run:

export CFLAGS="-m32 -DWITH_IPP -I$IPPROOT/include"
export IPP_LDFLAGS="$IPPROOT/lib/ia32/libippdc.a $IPPROOT/lib/ia32/libipps.a $IPPROOT/lib/ia32/libippcore.a"
make -f Makefile-libbz2_so

So, to build the 64-bit shared library, I used the following commands:

source $IPPROOT/bin/ippvars.sh intel64
patch -p1 < $IPPROOT/components/components/interfaces/ipp_bzip2/bzip2-1.0.8.patch.bin
export CFLAGS="-m64 -DWITH_IPP -I$IPPROOT/include"
export IPP_LDFLAGS="$IPPROOT/lib/intel64/libippdc.a $IPPROOT/lib/intel64/libipps.a $IPPROOT/lib/intel64/libippcore.a"
make -f Makefile-libbz2_so

The above commands built libbz2.so successfully, but without IPP.

Thanks to Mihhail's post, I managed to build libbz2.so with IPP.
