Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Integrating IPPI into jasper

dermisch
Beginner
I'm currently working on an ECG compression scheme that uses JPEG2000 image compression as a basis.

I've implemented the application with the jasper library. It works, but I have performance problems. After profiling I found that in my case the largest share of time in a single function (~40 %) is spent on encoding the code blocks (rather than on the wavelet transform, as I had assumed before profiling).

The IPPI provides some entropy encoding/decoding functions, but they aren't as easy to integrate as I thought. If someone has experience with integrating the IPPI entropy encoder into jasper, a few lines on it would be very welcome.

regards
Mischan Gholizadeh Toosarani

Message Edited by dermisch on 05-10-2005 09:57 AM

4 Replies
Mikhail_Kulikov__Int
New Contributor I

Hi,

You may start by downloading the JPEG 2000 codec sample from this link:
<http://developer.intel.com/software/products/ipp/jpeg.htm>
(It's placed in the common JPEG package w_ipp-sample-jpeg_*.zip, but the JPEG 2000 code is contained in the path .ipp_samplejpegjpeg2000).

Regarding Jasper, there are some general performance bottlenecks in its design (image rows that are not 16- or 32-byte aligned, etc.). Some functionality is encapsulated in such a way that this is hard to overcome, because the neighboring interfaces would need serious changes. But it's a unique reference code, and many thanks to the authors and the committee for developing and publishing it.
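To illustrate the row-alignment point, here is a minimal sketch of how a padded row stride can be computed so that every image row starts on a 16- or 32-byte boundary (given a suitably aligned base pointer). The helper name aligned_stride is hypothetical; it is not part of Jasper or IPP.

```c
#include <stddef.h>

/* Hypothetical helper: round a row of `width` elements of `elem_size`
 * bytes up to a multiple of `align` bytes. `align` must be a power of
 * two (for example 16 or 32). The result is the stride in bytes. */
static size_t aligned_stride(size_t width, size_t elem_size, size_t align)
{
    size_t row_bytes = width * elem_size;
    return (row_bytes + align - 1) & ~(align - 1);
}
```

For example, a row of 100 32-bit samples occupies 400 bytes, which would be padded to 416 bytes for 32-byte alignment.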

The functionality similar to ippiDecodeCodeblock is concentrated in Jasper's jpc_dec_decodecblk, but the Jasper version takes four complex structures as arguments. Do you have good documentation on the meaning of their fields (these are sometimes ringing and doubling substructures)?
If you trace the code, you find that only a restricted subset of their fields is used in the body of jpc_dec_decodecblk (and its sub-calls), but who can produce a complete list of those fields? Perhaps if Jasper cared about that, it would list the parameters explicitly, right in the function declaration. So it looks like a good tool for doing the entire job, not for substituting or extracting some relatively small part (especially to optimize it). It may be possible, but it is certainly not a couple of lines of code that can easily be written here.
Of course, the specific question regarding Jasper may be answered here, if somebody here knows the answer.

The IPP library functionality is flexible enough and in most cases doesn't restrict the application area, but some coding options of the IPP codec SAMPLE are restricted because the high-level (non-IPP) code for them is missing (read readme.htm for this sample).

But if you have any functionality requests (for IPP functionality as well as for the so-called IPP samples), you may submit them through the support site http://premier.intel.com.
For the evaluation version you can also obtain technical support there, but you need to register an account first, starting from https://registrationcenter.intel.com/EvalCenter/EvalForm.aspx?ProductID=263.

Also keep track of upcoming IPP releases; they promise development in the JPEG 2000 area:
<http://softwareforums.intel.com/ids/board/message?board.id=IPP&message.id=1904&highlight=%22jpeg2000%22#M1904>

And, certainly, in most cases the IPP JPEG 2000 sample provides faster encoding/decoding than the reference one.

WBR,
Mikhail

dermisch
Beginner
Thank you for that lengthy message.

I wasn't too specific in my first post...

I've already read the code in the JP2 example and started the integration before I posted. I'm also aware of the role the jasper implementation has inside the JP2 community. I can't really use IPP's sample source for my implementation, since I have quite special input data that is not supported by IPP at the moment (for instance grey at 12-bit and higher resolution per component, with a minimum of 2 components) but that is no problem in jasper.

I started with "jpc_enc_enccblk" because according to my profiling most performance can be gained there (inside it, of course, the three encoding passes take most of the time). After going through what IPPI has to offer in that area, I had the idea that I could replace "jpc_enc_enccblk" with an implementation built around the "EncodeLoadCodeBlock" functionality.

Now I can tell you that the ringing and doubling substructures you are talking about are the single biggest problem of my integration. Also, some data structures are inconsistently initialized. Currently I'm documenting what exactly they do (which is quite obvious after reading the JP2 documentation) and when they are changed (the hard part)!

The question also arises whether integration is possible at all, since data structures get updated in the encoding passes too (not only the codestream, pass length and distortion, which I can get with or after "EncodeStoreBits").

My questions now are:

What do you mean by "maybe jasper lists its parameters explicitly"? I think this part of the software shows how bad side-effect structured programming is for understanding code. You have no idea what happens to the parameters from reading the function header.

-----

You wrote:
<<<<
So it looks like a good tool for doing the entire job, not for substituting or extracting some relatively small part (especially to optimize it). It may be possible, but it is certainly not a couple of lines of code that can easily be written here.
>>>>

By that, do you mean I should do it the way I described above?

----

Should both encoders (IPPI, jasper) produce the same result (given that they get the right/same parameters and code-block data), so that I can set up some kind of automated test?

----

Also, I'm still in the experimenting phase, so many of the errors I get are surely a result of misusing the interfaces of both libs (e.g. what could cause IPP to compute 91 coding passes where jasper computes 25 for exactly the same codeblock?).

Thanks a lot
regards Mischan Gholizadeh Toosarani
Mikhail_Kulikov__Int
New Contributor I
Hi,
yes, "side-effect structured programming" is a suitable term for some of Jasper's interfaces, and that is one part of what I tried to express. The other part is only an assumption: that providing a strong interface model was not an actual goal of Jasper's development.
I hope the following answer helps with the issue from your experimental code (I mean the difference in the number of passes between Jasper and IPP).
First of all, you need to use the Complement function from IPPI (ippiComplement) to convert the code-block data into direct format (Jasper does this a LOT of times while accessing a specific bit-plane in each pass; look at "v = (abs(*(dp)) & (one)) ? 1 : 0;"). This matters for calculating the number of passes correctly.
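As an illustration of what the conversion to direct format means, here is a plain-C sketch, assuming that "direct format" is sign-magnitude with the sign in bit 31 and the magnitude in bits 30..0. The function to_sign_magnitude is a hypothetical stand-in written for this post; it is not the IPP function and says nothing about how IPP implements it.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in: convert two's-complement samples to
 * sign-magnitude words (sign in bit 31, magnitude in bits 30..0).
 * INT32_MIN has no 31-bit magnitude, so its magnitude is lost. */
static void to_sign_magnitude(const int32_t *src, uint32_t *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int32_t v = src[i];
        if (v < 0) {
            /* Unsigned negation is well defined for every input. */
            uint32_t mag = 0u - (uint32_t)v;
            dst[i] = 0x80000000u | mag;
        } else {
            dst[i] = (uint32_t)v;
        }
    }
}
```

Without this step, a small negative value such as -5 is stored as 0xFFFFFFFB, so all of its upper bit-planes look significant to a coder that expects sign-magnitude input.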
After this, if you still get a different number of passes from Jasper and from IPP, it means that something is wrong with the placement of the non-zero bit-planes (I mean the shift) and/or with the magnBits parameter passed to the EncodeCodeblock function.
First of all, EncodeCodeblock just counts the most significant zero bit-planes (MSZBP), starting from bit 31. Then it sets *sfBits = 31 - (this number of MSZBP) and performs passes, starting with a cleanup pass for bit-plane *sfBits - 1 and continuing down to bit-plane 31 - magnBits (zero-based indexing):
*sfBits - 1 : 1 pass : cleanup
*sfBits - 2 : 3 passes : sign. propag. & magn. refin. & cleanup
*sfBits - 3 : 3 passes : sign. propag. & magn. refin. & cleanup
...
31 - magnBits: 3 passes : sign. propag. & magn. refin. & cleanup
So you can derive the number of coding passes from this (*sfBits is returned) and compare it with Jasper's to find what actually differs. Then decide whether you need to correct magnBits, and/or shift the data before code-block encoding, and/or correct the calculation of the "number of zero bits" written to the stream. Some combination must help, provided the non-zero bit-planes are actually the same, of course.
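The pass schedule described above can be sketched in plain C as follows. Both helpers are hypothetical and written only for this comparison; they are not IPP or Jasper functions. The pass count follows directly from the schedule (one cleanup pass for the top plane, three passes for each plane below it). The zero-plane count assumes sign-magnitude input and that the planes are counted over the 31 magnitude bits below the sign bit, which is one reading of "starting from bit 31".

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: count the most significant zero bit-planes
 * (MSZBP) over the magnitude bits 30..0 of sign-magnitude samples.
 * Assumption: the sign bit (bit 31) is not counted as a plane. */
static int count_mszbp(const uint32_t *blk, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc |= blk[i] & 0x7FFFFFFFu;   /* OR of all magnitudes */
    int zeros = 0;
    for (int bit = 30; bit >= 0 && !(acc & (1u << bit)); --bit)
        ++zeros;
    return zeros;
}

/* Hypothetical helper: number of coding passes implied by the schedule,
 * i.e. bit-planes (sf_bits - 1) down to (31 - magn_bits): one cleanup
 * pass for the first plane, three passes for every further plane. */
static int expected_passes(int sf_bits, int magn_bits)
{
    int planes = sf_bits + magn_bits - 31;
    return planes > 0 ? 3 * planes - 2 : 0;
}
```

Under these assumptions, sf_bits = 31 with magn_bits = 31 gives 31 planes and 3 * 31 - 2 = 91 passes, while sf_bits = 9 with magn_bits = 31 gives 9 planes and 25 passes. That matches the 91 versus 25 figures reported earlier in the thread and would be consistent with unconverted negative values filling the upper bit-planes.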
Note that the dynamic range of a particular WT component and the quantization step exponent are not present in the IPP interface directly, so you need to take care of calculating magnBits correctly, and/or bit-shifting the data before code-block encoding, and/or correctly calculating the "number of zero bits" written to the stream, including the adjustment for components.
I hope this helps to solve your problem.
And start with WT53 mode without quantization to make debugging easier.
WBR,
Mikhail
dermisch
Beginner
I'm already using ippiComplement, as I had also found (in addition to it being "the right thing") the obvious performance gain you mentioned.

For the rest of the message I can only say: THANK YOU.
This should be added to the documentation (although it may be possible to combine the knowledge about JP2 with IPPI yourself, such illustrative details often help you find your way out of wrong thinking). I'll try to combine your helpful message with what I've done so far.

Also, I have now realized that I should have read the jpeg2000 sample thoroughly...

Regards,
Mischan

Message Edited by dermisch on 05-23-2005 01:00 AM

Message Edited by dermisch on 05-25-2005 02:58 AM
