Intel® Embree Ray Tracing Kernels
Discussion forum on the open source ray tracing kernels for fast photo-realistic rendering on Intel® CPU(s)

Applications scaled with rtcIntersect4/8/16 and Importance of Scheduler

I have a few questions about performance and task scheduling in my application. I have been looking into the usage and parallelism of rtcIntersect4/8/16 and rtcOccluded4/8/16, and have read through a few related comments.

1. The default examples in `Embree` (e.g., pathtracer, motionblur) do not seem to use rtcIntersect4/8/16 or rtcOccluded4/8/16. Is there a possible performance improvement if the rays are coherent across the packet? Please provide some examples that take advantage of these functions.

2. Are there other provisions for processing multiple pixels together to scale performance while rendering?

3. The default task scheduler is TBB, and most of the time is spent on task stitching (synchronization). Are there better ways to take advantage of SIMD?

4. Most of the application is restricted to Vec2 and Vec3 formats (within the ray tracer core), which stick to 128-bit vectors. A few other rendering applications use 256-bit formats.



Regarding 1.) Which rendering mode (ray1, ray4, ray8, ray16) is faster depends on many parameters, so please experiment with the different ray packet sizes. For simple "primary hit" use cases, ray packets are expected to perform better (usually the wider the better). However, not all workloads can use ray packets without code modifications. For example, a "mega-kernel"-style path tracer (such as the Embree path tracer tutorial) usually uses a single-ray model, and using ray packets requires restructuring it into a "wave-front"-style path tracer.
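
To make the "wave-front" restructuring a bit more concrete, here is a minimal sketch of one wavefront step that gathers the still-active paths into 4-wide packets with a per-lane valid mask. It deliberately does not call Embree: `PathRay`, `Packet4`, `intersectPlane4` and `traceWavefront` are hypothetical names invented for this illustration, and the plane intersection is only a stand-in for where a packet query such as rtcIntersect4 would go. The -1/0 mask values follow the convention used for packet valid masks, but please check the API documentation for your Embree version before relying on any detail here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-path ray state; not an Embree type.
struct PathRay {
    float org[3];
    float dir[3];
    float tfar;    // closest hit distance so far; infinity means "no hit"
    bool  active;  // false once the path has terminated
};

// Hypothetical 4-wide packet in structure-of-arrays layout:
// each component is stored contiguously for all four lanes.
struct Packet4 {
    float org_x[4], org_y[4], org_z[4];
    float dir_x[4], dir_y[4], dir_z[4];
    float tfar[4];
    int   valid[4];  // -1 = lane active, 0 = lane inactive
};

// Stand-in for a packet intersection query (assumption: a real renderer
// would call the ray tracing kernel here). Intersects the plane z = 1.
void intersectPlane4(Packet4& p) {
    for (std::size_t lane = 0; lane < 4; ++lane) {
        if (p.valid[lane] == 0) continue;        // skip masked-out lanes
        if (p.dir_z[lane] <= 0.0f) continue;     // ray never reaches the plane
        const float t = (1.0f - p.org_z[lane]) / p.dir_z[lane];
        if (t > 0.0f && t < p.tfar[lane]) p.tfar[lane] = t;
    }
}

// One wavefront step: gather active paths, trace them in packets of four,
// and scatter the hit distances back. Returns the number of packets traced.
std::size_t traceWavefront(std::vector<PathRay>& rays) {
    // Compact the indices of active paths so packets are densely filled.
    std::vector<std::size_t> activeIdx;
    for (std::size_t i = 0; i < rays.size(); ++i)
        if (rays[i].active) activeIdx.push_back(i);

    std::size_t packets = 0;
    for (std::size_t base = 0; base < activeIdx.size(); base += 4) {
        Packet4 p{};  // zero-initialized, so unused lanes start out invalid
        for (std::size_t lane = 0; lane < 4; ++lane) {
            const std::size_t k = base + lane;
            if (k >= activeIdx.size()) continue;  // tail packet: lane stays invalid
            const PathRay& r = rays[activeIdx[k]];
            p.org_x[lane] = r.org[0]; p.org_y[lane] = r.org[1]; p.org_z[lane] = r.org[2];
            p.dir_x[lane] = r.dir[0]; p.dir_y[lane] = r.dir[1]; p.dir_z[lane] = r.dir[2];
            p.tfar[lane]  = r.tfar;
            p.valid[lane] = -1;
        }

        intersectPlane4(p);

        // Scatter results back to the per-path state.
        for (std::size_t lane = 0; lane < 4; ++lane) {
            if (p.valid[lane] == 0) continue;
            PathRay& r = rays[activeIdx[base + lane]];
            assert(p.tfar[lane] <= r.tfar);  // a hit can only shorten the ray
            r.tfar = p.tfar[lane];
        }
        ++packets;
    }
    return packets;
}
```

A renderer built this way would call `traceWavefront` once per bounce, shade the hits, mark terminated paths as inactive, and repeat. Whether the gather/scatter overhead pays off compared to single-ray traversal depends on ray coherence, which is why measuring the different packet sizes is the safest approach.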


Regarding 2.) That's a very generic rendering design question and not easy to answer from the Embree perspective.


Regarding 3.) This should not be the case for real-life workloads. Also, TBB has nothing to do with SIMD. Embree's internal implementation uses SIMD instructions for efficient BVH traversal, for example, but this is orthogonal to the tasking system used by the application.


Regarding 4.) Again, this is very generic. Embree tries to be flexible enough to handle most use cases. For example, one could also SIMDfy within a single ray, as in a spectral rendering application that uses 4-wide vectors for the spectral resolution (per path/ray). So SIMDfying over rays is not always the best or only option.
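
As a rough illustration of SIMD within a single path rather than across rays, the sketch below carries four spectral samples per path and updates them together at each bounce. `Spectrum4` and `pathThroughput` are hypothetical names made up for this example and are not part of Embree. The fixed-length loops are written so that a compiler may map them onto 128-bit vector instructions, but that is not guaranteed and depends on compiler and flags.

```cpp
#include <cstddef>

// Hypothetical 4-sample spectrum carried by one path; not an Embree type.
// Four floats fit exactly into one 128-bit SIMD register.
struct Spectrum4 {
    float s[4];
};

// Component-wise product: all four wavelengths are updated together.
inline Spectrum4 operator*(const Spectrum4& a, const Spectrum4& b) {
    Spectrum4 r;
    for (std::size_t i = 0; i < 4; ++i) r.s[i] = a.s[i] * b.s[i];
    return r;
}

// Accumulate the throughput of a single path: at every bounce the
// throughput is multiplied by the surface reflectance at that hit.
Spectrum4 pathThroughput(const Spectrum4* reflectances, std::size_t bounces) {
    Spectrum4 throughput{{1.0f, 1.0f, 1.0f, 1.0f}};
    for (std::size_t b = 0; b < bounces; ++b)
        throughput = throughput * reflectances[b];
    return throughput;
}
```

Here the ray queries themselves stay single-ray, and the vector width is spent on the shading math for each path instead.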



Embree Team
