The documentation for ippsDecodeLZ4_8u is straightforward, but the sample code on the page increases the output buffer size by 33 bytes.
The requirement to add 33 bytes beyond what is necessary to hold the decompressed data is not documented for this parameter.
Are the 33 bytes mandatory? If yes, that makes the function (and therefore LZ4) unusable in many scenarios; for example, I decode into a buffer I don't allocate, so I don't control its size. If I have to allocate an intermediate buffer, copy the data, and free the buffer, that certainly negates any performance gain you may achieve by having those 33 bytes.
My concise question is: are the extra 33 bytes necessary?
If so, you should provide a new ippsDecodeLZ4Safe_8u function that never overruns the destination buffer defined by the decompressed data size.
Thanks,
Axel
Hello,
Thank you for reporting this issue. You are correct that the extra bytes are required. This is because of internal SIMD optimizations in our implementation. The requirement is currently not documented in the API reference, which is a gap. We are tracking your feedback and will address the documentation in a future release.
Thanks,
Chao
Again, the +33 bytes requirement makes the function unusable in many scenarios.
Allocating an intermediate buffer and copying the data is certain to negate whatever performance gain you get from those 33 bytes.
Imagine you are decompressing 500 megabytes: how much performance do you gain by decompressing the last few bytes with one last SIMD instruction that sometimes overruns, instead of with normal instructions? Nothing.
If the destination buffer happens to end at a page boundary and the next page is not mapped, the decompression function can fault, which is unacceptable. Again, allocating extra bytes just to accommodate this behavior is often not an option.
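For reference, the intermediate-buffer workaround described here could look roughly like the sketch below. It does not call IPP: `decode_fn` is a hypothetical function-pointer type with a (source, source length, destination, in/out destination length) shape, and `fake_overrunning_decode` is a made-up stand-in that deliberately scribbles 33 bytes past the decoded data. The helper name `decode_via_scratch`, the constant `DECODE_PAD`, and the return codes are also assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Slack taken from the sample code discussed in this thread. */
enum { DECODE_PAD = 33 };

/* Hypothetical decoder shape: *dst_len is the buffer size on entry and the
 * decoded size on return; 0 means success. This is an assumption, not IPP. */
typedef int (*decode_fn)(const unsigned char *src, int src_len,
                         unsigned char *dst, int *dst_len);

/* Stand-in decoder for illustration: "decodes" by copying the input, then
 * writes DECODE_PAD junk bytes past the end, like an overrunning decoder. */
int fake_overrunning_decode(const unsigned char *src, int src_len,
                            unsigned char *dst, int *dst_len)
{
    if (*dst_len < src_len + DECODE_PAD) return -1;
    memcpy(dst, src, (size_t)src_len);
    memset(dst + src_len, 0xEE, DECODE_PAD);
    *dst_len = src_len;
    return 0;
}

/* Decode into a padded scratch buffer, then copy only the real output into
 * the caller's exactly-sized buffer. Costs one malloc, one memcpy, one free. */
int decode_via_scratch(decode_fn decode, const unsigned char *src, int src_len,
                       unsigned char *dst, int dst_capacity, int *out_len)
{
    assert(decode && src && dst && out_len);
    if (dst_capacity < 0 || dst_capacity > INT_MAX - DECODE_PAD) return -1;

    int produced = dst_capacity + DECODE_PAD;
    unsigned char *scratch = malloc((size_t)produced);
    if (!scratch) return -1;

    int status = decode(src, src_len, scratch, &produced);
    if (status == 0) {
        if (produced <= dst_capacity) {
            memcpy(dst, scratch, (size_t)produced);
            *out_len = produced;
        } else {
            status = -2; /* decoded data does not fit the caller's buffer */
        }
    }
    free(scratch);
    return status;
}
```

The extra allocation and the full copy of the output are exactly the costs objected to above; the sketch only makes them visible.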
The fix is to make sure the function never overruns, not to document that it does.
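To illustrate what "never overruns" can mean in practice, here is a minimal sketch of a copy routine that uses wide, rounded-up block copies only while there is enough room, and falls back to an exact copy near the end of the buffer. This is a generic technique, not a description of how IPP is or would be implemented; the name `bounded_copy`, the block width `WIDE`, and the return codes are assumptions chosen for the example.

```c
#include <stddef.h>
#include <string.h>

/* Width of one "wide" block; stands in for a SIMD register width. */
enum { WIDE = 16 };

/* Copy exactly len meaningful bytes from src to dst.
 * dst_room / src_room are the bytes that may legally be written / read.
 * Fast path: copy whole WIDE-byte blocks, which may touch up to WIDE-1
 * bytes beyond len, but only when both buffers have room for that.
 * Slow path: copy exactly len bytes, so nothing past the end is touched.
 * Returns 0 on success, -1 if len does not fit. */
int bounded_copy(unsigned char *dst, size_t dst_room,
                 const unsigned char *src, size_t src_room, size_t len)
{
    if (len > dst_room || len > src_room) return -1;

    size_t rounded = (len + WIDE - 1) / WIDE * WIDE;
    if (rounded <= dst_room && rounded <= src_room) {
        for (size_t i = 0; i < rounded; i += WIDE)
            memcpy(dst + i, src + i, WIDE); /* block copy, may overshoot len */
    } else {
        memcpy(dst, src, len);              /* exact copy near the end */
    }
    return 0;
}
```

Only the final block ever takes the slow path, so for a large output the cost of the bounds check is confined to the last few bytes.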