Issue using method ippiDecodeExpGolombOne_H264_1u16s on different CPUs

ch_desjardins · ‎08-04-2011

Hi,

We are using the ippiDecodeExpGolombOne_H264_1u16s(...) method when parsing H.264 streams in our product. It seems that a recent update to Intel IPP 7.0 Update 4 breaks the behavior of this method on some CPUs.

We observed that, for example, the method behaves correctly on an Intel Xeon E5507 CPU, while the exact same code on the same stream returns a different value on both the Intel Core 2 X6800 and Quad Q6600 CPUs. The ippiDecodeExpGolombOne_H264_1u32s(...) method behaves correctly on all CPUs.

The Intel IPP libraries are linked statically using the *_l.lib (single-threaded) versions of the libraries. The libraries are from 7.0 Update 4, installed with w_ipp_7.0.4.196.exe.

I have a sample project (source is attached) in VS2010 clearly illustrating the differences when compiled in 'Debug' mode and then run on the different machines/CPUs (result of the method call is different for the same input stream). You can view attached snapshots for details on the output of this sample program on different CPUs.

We noticed that when this sample project is compiled in 'Release', it now works properly on all CPUs. However, calling the ippiDecodeExpGolombOne_H264_1u16s(...) method from our product compiled in 'Release' does not work. We suspect some linking/optimization flag during the build could explain this discrepancy with the 'sample project', but this is actually hard to identify.

Can you reproduce this issue? Do you have any additional information about what could be happening? We will probably resort to using the *_32s(...) version of the method, but we are wondering whether the same issue could occur with it in some cases.

I can provide additional information if needed - let me know.

Best regards,
Charles

ch_desjardins · ‎08-15-2011

* BUMP *

We would definitely expect these libraries to display the same behavior no matter what CPU platform the code is running on.

By the way, the _32s version of the method performs correctly on all CPUs tested. We suspect some sort of 'overflow' issue on some platforms...
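
To help narrow this down, here is a minimal sketch of a portable reference decoder that could be run next to the IPP call on each machine to see which result is the expected one. It follows the standard H.264 ue(v)/se(v) Exp-Golomb mapping. The BitReader type and the ref_decode_ue, ref_decode_se and fits_in_16s names are hypothetical helpers written for this post, not part of IPP, and the byte-wise MSB-first reader is an assumption that does not mirror IPP's own bitstream pointer/offset convention.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical MSB-first bit reader over a byte buffer (not an IPP type). */
typedef struct {
    const uint8_t *data;   /* input bytes */
    size_t size_bits;      /* number of valid bits in data */
    size_t pos;            /* current bit position */
} BitReader;

/* Reads one bit; returns 0 on success, -1 when the buffer is exhausted. */
static int read_bit(BitReader *br, uint32_t *bit)
{
    if (br->pos >= br->size_bits)
        return -1;
    *bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
    br->pos++;
    return 0;
}

/* ue(v): count leading zero bits, skip the 1 marker, then read that many
 * suffix bits. codeNum = 2^zeros - 1 + suffix. */
static int ref_decode_ue(BitReader *br, uint32_t *out)
{
    unsigned zeros = 0;
    uint32_t bit = 0, suffix = 0;
    unsigned i;

    for (;;) {
        if (read_bit(br, &bit) != 0)
            return -1;
        if (bit)
            break;
        if (++zeros > 31)          /* longer prefixes cannot fit in 32 bits */
            return -1;
    }
    for (i = 0; i < zeros; i++) {
        if (read_bit(br, &bit) != 0)
            return -1;
        suffix = (suffix << 1) | bit;
    }
    *out = ((1u << zeros) - 1u) + suffix;
    return 0;
}

/* se(v): codeNum k maps to +ceil(k/2) when k is odd, -ceil(k/2) when even. */
static int ref_decode_se(BitReader *br, int32_t *out)
{
    uint32_t k;
    int64_t mag;

    if (ref_decode_ue(br, &k) != 0)
        return -1;
    mag = ((int64_t)k + 1) / 2;
    *out = (k & 1u) ? (int32_t)mag : (int32_t)(-mag);
    return 0;
}

/* Returns 1 if the decoded value is representable as a signed 16-bit value. */
static int fits_in_16s(int32_t v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

Decoding into 32 bits first and then checking fits_in_16s on the result would show whether the streams that fail with the _16s variant actually contain values outside the 16-bit range, or whether the mismatch also appears for small values.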

Thanks.
Charles

vcherepa · ‎09-01-2011

Good day,
it would be very useful if you attached an exe file built on your side, so that we could look into the asm code.
Our preliminary guess is that this strange function behaviour is caused by a stack problem on the application side. But to confirm this, we need your exe file.

Best regards,
Victor

ch_desjardins · ‎09-01-2011

Hi Victor,

Here is the executable for the test application I provided earlier. It was compiled in 'Debug' using Visual Studio 2010. The problem does not occur with this test app when compiled in 'Release'. However, in our product compiled in 'Release', we can reproduce the error. Thus, we suspect that some optimization flag or compilation switch could be causing the issue, but this is quite hard to identify...

Let me know if you need additional information.

Best regards,
Charles