Intel® Integrated Performance Primitives

Further Optimizing 256x256 Tiling and Multithreading (OpenMP) for Large Image Processing with Intel

klay1252
Beginner

Hello Intel IPP experts,

I am currently working on processing large images using Intel Integrated Performance Primitives (IPP) with a 256x256 tiling approach. This method has proven to be faster than processing the entire image at once. In addition to this, I have implemented multithreading using OpenMP to parallelize the processing of tiles.

While the performance is already improved, I am looking for ways to optimize it even further. Specifically, I would appreciate any advice or best practices on how to achieve better performance in the following areas:

  1. Optimizing cache usage: Are there specific techniques or IPP functions that can help optimize cache utilization when processing multiple 256x256 tiles, especially in a multithreaded environment?
  2. Reducing tile processing overhead: Are there advanced techniques to minimize overhead related to managing tile boundaries or transitions between tiles, even when using multithreading?
  3. Optimizing OpenMP parallelization: Are there specific ways to fine-tune OpenMP usage or IPP functions to further enhance parallel performance when processing image tiles?
  4. Additional IPP optimizations: Any other suggestions for maximizing the performance of large image processing, specifically when handling image tiles in combination with multithreading?

If there are any C++ code examples or references to further optimize this approach with IPP and OpenMP, that would be very helpful.

Thanks in advance for your insights and suggestions!

1 Solution
Ruqiu_C_Intel
Moderator

In general, IPP functions are thread-safe but run single-threaded. Users manage their own threads running in parallel, which means users are responsible for OpenMP scheduling.


Several years ago, IPP introduced Integration Wrappers, which were designed to improve the user experience of threading and tiling with Intel IPP functions. The Integration Wrappers documentation and examples are located in the IPP install folder under ./components/interfaces/iw/. Hopefully this helps.
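
As a rough illustration of user-managed threading over tiles, here is a minimal sketch in plain C++ with an OpenMP pragma. It does not call IPP: `processTile` is a hypothetical stand-in (a simple 8-bit inversion) for whatever single-threaded per-tile call is actually used, and `processImageTiled`, the 8-bit single-channel layout, and the static schedule are assumptions made for the example, not recommendations from the IPP documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-threaded per-tile kernel.
// src/dst point at the tile's top-left pixel; step is the row stride in bytes.
static void processTile(const std::uint8_t* src, std::uint8_t* dst,
                        std::size_t step, int tileW, int tileH) {
    for (int y = 0; y < tileH; ++y) {
        const std::uint8_t* s = src + static_cast<std::size_t>(y) * step;
        std::uint8_t* d = dst + static_cast<std::size_t>(y) * step;
        for (int x = 0; x < tileW; ++x) {
            d[x] = static_cast<std::uint8_t>(255 - s[x]);  // invert the pixel
        }
    }
}

// Splits an 8-bit single-channel image into tile x tile blocks and lets
// OpenMP distribute the blocks across threads. Tiles never overlap, so
// threads write to disjoint regions of dst and need no synchronization.
void processImageTiled(const std::uint8_t* src, std::uint8_t* dst,
                       int width, int height, std::size_t step,
                       int tile = 256) {
    assert(tile > 0);
    const int tilesX = (width + tile - 1) / tile;   // round up to cover the right edge
    const int tilesY = (height + tile - 1) / tile;  // round up to cover the bottom edge
    const int numTiles = tilesX * tilesY;

    // One flat loop over all tiles; the pragma is ignored if OpenMP is disabled.
    #pragma omp parallel for schedule(static)
    for (int t = 0; t < numTiles; ++t) {
        const int x0 = (t % tilesX) * tile;
        const int y0 = (t / tilesX) * tile;
        const int w = std::min(tile, width - x0);   // clip partial tiles at the border
        const int h = std::min(tile, height - y0);
        const std::size_t offset = static_cast<std::size_t>(y0) * step + x0;
        processTile(src + offset, dst + offset, step, w, h);
    }
}
```

Flattening the tile grid into a single loop gives the scheduler `tilesX * tilesY` independent work items rather than only `tilesY` rows of tiles. For filters that need neighboring pixels, each tile would also have to handle its borders, which this sketch leaves out.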

