Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Question about rendering block size

Kaan_Gök
Beginner

I have a simple question about the blockSize parameter of various rendering domain functions, for example the "IntersectEyeSO" function.
The last parameter is IppiSize blockSize, and the documentation says that it's the total number of rays.

Does it make a difference (performance-wise) whether I pass this block as 32x32, 1024x1, or 512x2?
I'm asking because primary rendering is fine with this (I use 32x32 blocks for image generation), but I also have lots of secondary rays, which are used for global illumination and Monte Carlo sampling. The ability to pass arbitrary sizes would be useful. (It works that way, but my concern is a possible slowdown because of the Nx1 block size.)
Thanks in advance

Gennady_F_Intel
Moderator

Kaan,
To avoid performance degradation, the block size should be a multiple of 4 in both dimensions. We have optimized the "Intersector" functions for block sizes of that kind. So Nx1 sizes such as 1024x1, and also 512x2, are bad sizes for performance.
--Gennady

Kaan_Gök
Beginner
Thank you for the answer, Mr Fedorov.
A multiple of 4 is fine for me; I can easily modify my code to round to an optimal size.
For example, say I need 753 samples. Instead of using a 753x1 block, I can use a 192x4 block (768 samples in total), which is slightly more than needed, but OK. Or, depending on the situation, I can use a 188x4 block, which gives 752 samples.
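
The rounding described above can be sketched as follows. This is only an illustration of the arithmetic, not IPP code: `BlockSize` is a hypothetical stand-in for `IppiSize` (so the sketch compiles without the IPP headers), and the helper names `padded_block` and `truncated_block` are made up for this example. It assumes a fixed block height of 4 and rounds the width to a multiple of 4, so both dimensions follow the advice given in this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for IppiSize, so the sketch compiles without IPP. */
typedef struct {
    int width;
    int height;
} BlockSize;

/* Round n up to the next multiple of 4 (n must be non-negative). */
int round_up_to_4(int n)
{
    assert(n >= 0);
    return (n + 3) / 4 * 4;
}

/* Round n down to the previous multiple of 4 (n must be non-negative). */
int round_down_to_4(int n)
{
    assert(n >= 0);
    return n / 4 * 4;
}

/* Smallest block of height 4, with a width that is a multiple of 4,
 * that holds at least ray_count rays. The extra rays are padding. */
BlockSize padded_block(int ray_count)
{
    assert(ray_count > 0);
    int columns = (ray_count + 3) / 4;          /* columns needed at height 4 */
    BlockSize b = { round_up_to_4(columns), 4 };
    return b;
}

/* Largest block of height 4, with a width that is a multiple of 4,
 * that holds at most ray_count rays. Some rays are dropped. */
BlockSize truncated_block(int ray_count)
{
    assert(ray_count >= 16);                    /* at least one 4x4 block */
    BlockSize b = { round_down_to_4(ray_count / 4), 4 };
    return b;
}
```

For 753 rays, `padded_block(753)` gives 192x4 (768 rays, 15 of them padding) and `truncated_block(753)` gives 188x4 (752 rays, one dropped), matching the two options above.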