Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.
6758 Discussions

Performance testing and scalability analysis of the Intel IPP implementations of the debayer algorithms AN3 and AHD

Intel_C_21
Beginner
632 Views

Hello!

I am doing postdoctoral research on assessing the quality and performance of various demosaicing algorithms. I'm interested in evaluating the performance of the IPP ippiCFAToRGB() and ippiDemosaicAHD() demosaicing functions, and would therefore like to know whether and how these two IPP functions exploit thread-level parallelism. I would like to run scalability tests with various numbers of threads (1, 2, 4, 8) and measure how much speedup can be achieved on a given platform.

First of all, I checked whether both functions are listed in ThreadedFunctionsList.txt. Only ippiCFAToRGB() is listed there, which probably means that there is no threading support for ippiDemosaicAHD(). Is that so?

Is there a way to configure the number of threads used by Intel IPP, and if so, how? I found that IPP threading can be controlled with ippSetNumThreads(n) and ippGetNumThreads(). It is unclear to me whether these two functions still work, because the multithreaded version of the library is deprecated in the new releases. My test results show no effect when I try to disable internal parallelization with ippSetNumThreads(1). There is also no improvement when the number of threads is set to 2, 4, etc.

Is it possible to compile the project without optimizations (with internal threading turned off), and then with specific CPU optimizations and internal threading turned on?

I have both the single-threaded and the multi-threaded libraries installed, and I tried to compile my test with each of them. I didn't see any improvement in the execution times obtained with these versions. Furthermore, when I compiled the project with the multi-threaded library, ippiCFAToRGB() did not reconstruct the image correctly. Only half of the image is reconstructed (the upper part contains the successfully reconstructed image and the bottom part is black). I suppose that this is a bug. With the single-threaded library everything is OK. With ippiDemosaicAHD() I don't have these problems; the raw image file is reconstructed properly.

I am using ippIP AVX (e9), 2017.0.1 (r53196) since Oct 4 2016 running on Intel core i5-2400 under Win64.

I hope to receive your assistance. Thanks in advance.

Sincerely yours, Iva

0 Kudos
3 Replies
Andrey_B_Intel
Employee
632 Views

Hi Iva.

How do you link your app with IPP?

Actually, ippiCFAToRGB() has its own threading, but ippiDemosaicAHD() does not. You can try to split the image into tiles manually and call ippiCFAToRGB or ippiDemosaicAHD on each tile.
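
As a rough illustration of the manual tiling suggested above, here is a minimal sketch in plain C that splits an image into horizontal bands and runs a callback on each band. The names Band, split_into_bands, band_fn and for_each_band are hypothetical helpers, not part of Intel IPP. Bands start on even rows so that each one begins on the same row of the 2x2 Bayer pattern. The sketch assumes the callback would call the IPP function with the band as the source ROI and the full image size, so that rows next to a band edge are still available for interpolation; please check the IPP reference for the exact parameters. The OpenMP pragma is only active when the code is built with OpenMP enabled.

```c
#include <stddef.h>

/* Hypothetical helper type, not part of Intel IPP. */
typedef struct {
    int y;       /* first row of the band */
    int height;  /* number of rows in the band */
} Band;

/* Split `rows` image rows into at most `max_bands` horizontal bands.
 * Every band starts on an even row, so all bands begin on the same
 * row of the 2x2 Bayer pattern. Returns the number of bands written. */
int split_into_bands(int rows, int max_bands, Band *bands)
{
    if (rows <= 0 || max_bands <= 0) return 0;

    int pairs = (rows + 1) / 2;                    /* 2-row groups; the last may be short */
    int n = max_bands < pairs ? max_bands : pairs; /* never create empty bands */
    int base = pairs / n;
    int extra = pairs % n;                         /* first `extra` bands get one more group */
    int y = 0;

    for (int i = 0; i < n; ++i) {
        int h = 2 * (base + (i < extra ? 1 : 0));
        if (y + h > rows) h = rows - y;            /* clip the last band for odd heights */
        bands[i].y = y;
        bands[i].height = h;
        y += h;
    }
    return n;
}

/* Hypothetical callback: process rows [y, y + height) of the image. */
typedef void (*band_fn)(int y, int height, void *ctx);

/* Run `fn` once per band; bands run in parallel when OpenMP is enabled.
 * `fn` must be safe to call concurrently on different bands. */
void for_each_band(int rows, int threads, band_fn fn, void *ctx)
{
    Band bands[64];
    if (threads > 64) threads = 64;
    int n = split_into_bands(rows, threads, bands);

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int i = 0; i < n; ++i)
        fn(bands[i].y, bands[i].height, ctx);
}
```

With this layout each band writes to its own rows of the destination image, so the bands do not need any synchronization among themselves.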

0 Kudos
Jonghak_K_Intel
Employee
632 Views

Hi Iva,

Have you considered using VTune Amplifier to profile and analyze your test application?

VTune Amplifier clearly shows how the threads worked during the tasks and lets you analyze and optimize performance at a much deeper level.

VTune : https://software.intel.com/en-us/intel-vtune-amplifier-xe

Additionally, to avoid issues with performance and interoperability with other threading models, Intel IPP's internal threading libraries have been deprecated ( https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-threading-openmp-faq ).

So we encourage you to try external threading such as TBB or OpenMP at the application level.
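
For the scalability test itself (1, 2, 4, 8 threads), the bookkeeping could look something like the sketch below. ScalingPoint and compute_scaling are hypothetical names, not an IPP or OpenMP API. The sketch assumes that the wall-clock time for each thread count has already been measured at the application level, for example by setting the thread count with omp_set_num_threads() and timing the run with omp_get_wtime(). It then derives the speedup relative to the single-thread run and the parallel efficiency (speedup divided by the number of threads).

```c
#include <stddef.h>

/* One measurement in a thread-scaling experiment (hypothetical type). */
typedef struct {
    int    threads;     /* number of threads used for this run */
    double seconds;     /* measured wall-clock time of the run */
    double speedup;     /* filled in: t(1 thread) / t(n threads) */
    double efficiency;  /* filled in: speedup / threads */
} ScalingPoint;

/* Fill in speedup and efficiency for every entry, using the entry with
 * threads == 1 as the baseline. Returns 0 on success, or -1 if there is
 * no usable single-thread baseline or an entry has invalid values. */
int compute_scaling(ScalingPoint *pts, size_t count)
{
    double baseline = 0.0;

    for (size_t i = 0; i < count; ++i) {
        if (pts[i].threads == 1) {
            baseline = pts[i].seconds;
            break;
        }
    }
    if (baseline <= 0.0) return -1;

    for (size_t i = 0; i < count; ++i) {
        if (pts[i].threads <= 0 || pts[i].seconds <= 0.0) return -1;
        pts[i].speedup = baseline / pts[i].seconds;
        pts[i].efficiency = pts[i].speedup / pts[i].threads;
    }
    return 0;
}
```

An efficiency close to 1.0 means the run scales almost linearly with the thread count; values well below 1.0 usually point to memory bandwidth limits or threading overhead, which is where a profiler can help.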

0 Kudos