Intel® Embree Ray Tracing Kernels
Discussion forum on the open source ray tracing kernels for fast photo-realistic rendering on Intel® CPU(s)

Scene flags having no effect on memory consumption

Michael_C_9
Beginner

The documentation mentions that the acceleration structure scene flags (e.g. RTC_SCENE_COMPACT, RTC_SCENE_HIGH_QUALITY) may be ignored by the implementation. I need to reduce the memory consumed by the bounding volume hierarchy, but these flags have no effect on it. What exactly determines whether or not they are used?

BenthinC_Intel
Employee

Could you give us some additional info here? RTC_SCENE_COMPACT should definitely reduce memory consumption.

  • Could you pass "verbose=2" to the Embree initialization flags and test whether RTC_SCENE_COMPACT reduces the memory consumption (in the cmd line output look for "used = ... MB").
  • Could there be a difference between the virtual address space that is allocated and the address space that Embree actually uses?
  • How much memory does your app take in relation to the internal data structures generated by Embree?
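To compare runs with different scene flags, the relevant numbers can be pulled out of the verbose output programmatically. The following is only a minimal sketch: `parseField` is a hypothetical helper, not part of Embree, and it assumes the "key = value" layout seen in Embree 2.x verbose logs (for example "used = 0.09MB").

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not an Embree function): finds the first "<key> = "
// in one line of verbose build output and parses the number that follows.
// Returns false if the key is missing or is not followed by a number.
bool parseField(const std::string& line, const std::string& key, double& out) {
    assert(!key.empty());
    const std::string needle = key + " = ";
    const std::size_t pos = line.find(needle);
    if (pos == std::string::npos) return false;

    const char* start = line.c_str() + pos + needle.size();
    char* end = nullptr;
    const double value = std::strtod(start, &end);  // stops at the unit suffix
    if (end == start) return false;                 // no digits after "key = "

    out = value;
    return true;
}
```

Capturing the output once per flag setting and comparing the `allocated` and `used` fields of the allocation summary line makes it easy to see whether a flag changed anything.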

A few ways to further reduce memory consumption are to share the vertex arrays with the application and to use quads instead of triangles (if the geometry is mostly based on quads anyway).

Hope this helps.

Michael_C_9
Beginner

I'm using user-defined geometries. Below are the results of a test with 1600 objects. The output looks identical regardless of the scene flags, which shouldn't be the case, right?

RTC_SCENE_HIGH_QUALITY

building BVH4<object> using avx::BVH4BuilderSAH ... [DONE]  1.92308ms, 0.831997 Mprim/s, 0.0481893 GB/s
  primitives = 1600, vertices = 0
  sah = 7.6265 (6.1196 + 1.5068), depth = 6
  used = 0.1 MB, perPrimitive = 57.7 B
  alignedNodes = 621 (89.4% filled) (0.1 MB) (86.1% of total)
  leaves = 1600 (0.0 MB) (13.9% of total)(100.0% used)
  vertices = 0 (0.0 MB) (0.0% of total) (75.0% used)
  allocated = 0.12MB, reserved = 0.12MB, used = 0.09MB (78.10%), wasted = 0.00MB (3.40%), free = 0.00MB (3.40%)
  used blocks = [12288, 16320, 16320] [98304, 102336, 102336] [END]
  free blocks = [END]
created scene intersector
  accels[0]
    intersector1  = avx2::BVH4VirtualIntersector1
    intersector4  = avx2::BVH4VirtualIntersector4Chunk
    intersector8  = avx2::BVH4VirtualIntersector8Chunk
selected scene intersector
  intersector1  = avx2::BVH4VirtualIntersector1
  intersector8  = avx2::BVH4VirtualIntersector8Chunk


RTC_SCENE_COMPACT
 
building BVH4<object> using avx::BVH4BuilderSAH ... [DONE]  2.78401ms, 0.57471 Mprim/s, 0.0332872 GB/s
  primitives = 1600, vertices = 0
  sah = 7.6265 (6.1196 + 1.5068), depth = 6
  used = 0.1 MB, perPrimitive = 57.7 B
  alignedNodes = 621 (89.4% filled) (0.1 MB) (86.1% of total)
  leaves = 1600 (0.0 MB) (13.9% of total)(100.0% used)
  vertices = 0 (0.0 MB) (0.0% of total) (75.0% used)
  allocated = 0.12MB, reserved = 0.12MB, used = 0.09MB (78.10%), wasted = 0.00MB (3.40%), free = 0.00MB (3.40%)
  used blocks = [12288, 16320, 16320] [98304, 102336, 102336] [END]
  free blocks = [END]
created scene intersector
  accels[0]
    intersector1  = avx2::BVH4VirtualIntersector1
    intersector4  = avx2::BVH4VirtualIntersector4Chunk
    intersector8  = avx2::BVH4VirtualIntersector8Chunk
selected scene intersector
  intersector1  = avx2::BVH4VirtualIntersector1
  intersector8  = avx2::BVH4VirtualIntersector8Chunk
 
Scene is created using

    RTCSceneFlags sflags = RTC_SCENE_STATIC | RTC_SCENE_HIGH_QUALITY;
    RTCAlgorithmFlags aflags = RTC_INTERSECT1 | RTC_INTERSECT8;
    scene = rtcDeviceNewScene(device, sflags, aflags);

Other information

Embree Ray Tracing Kernels 2.9.0 (Mar 10 2016)
  Compiler  : Intel Compiler 16.0.1
  Build     : Release 
  Platform  : Mac OS X (64bit)
  CPU       : Haswell (GenuineIntel)
   Threads  : 4
   ISA      : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 POPCNT AVX F16C RDRAND AVX2 FMA3 LZCNT BMI1 BMI2 
   Targets  : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX AVXI AVX2 
   MXCSR    : FTZ=1, DAZ=1
  Config
    Threads : default
    ISA     : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 POPCNT AVX F16C RDRAND AVX2 FMA3 LZCNT BMI1 BMI2 
    Targets : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX AVXI AVX2  (supported)
              SSSE3 SSE4.2 AVX AVX2  (compile time enabled)
    Features: intersection_filter bufferstride 
    Tasking : TBB4.3 TBB_header_interface_8001 TBB_lib_interface_9000 

general:
  build threads = 0
  verbosity     = 2
triangles:
  accel         = default
  builder       = default
  traverser     = default
  replications  = 2
motion blur triangles:
  accel         = default
  builder       = default
  traverser     = default
quads:
  accel         = default
  builder       = default
  traverser     = default
motion blur quads:
  accel         = default
  builder       = default
  traverser     = default
line segments:
  accel         = default
  builder       = default
  traverser     = default
motion blur line segments:
  accel         = default
  builder       = default
  traverser     = default
hair:
  accel         = default
  builder       = default
  traverser     = default
  replications  = 3
motion blur hair:
  accel         = default
subdivision surfaces:
  accel         = default
object_accel:
  min_leaf_size = 1
  max_leaf_size = 1
object_accel_mb:
  min_leaf_size = 1
  max_leaf_size = 1

Michael_C_9
Beginner

So it seems the flags only apply to the bounding volume hierarchies generated for meshes? Is there a way to control how the scene bounding volume hierarchy is built, and its quality? What is the default quality setting in that case?

BenthinC_Intel
Employee

Correct, the flags only affect BVHs for meshes. For BVHs over user geometries we use the standard binning-based BVH builder, since further BVH quality optimizations are very hard to do (we don't know what's in the user geometry). I guess the number of bytes per user geometry is rather small, as otherwise the BVH over all user geometries wouldn't be the bottleneck in terms of memory consumption, right?
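As a rough cross-check of how small the BVH over user geometries is, the figures from the verbose output earlier in the thread (621 nodes, 1600 primitives) can be turned into a per-primitive estimate. This is a sketch under stated assumptions: the 128-byte node size and 8-byte leaf reference are my assumptions about a 4-wide BVH layout, not values given in the thread, and `estimateFootprint` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Assumed sizes (not stated in the thread): each of the 4 child slots in a
// node stores 6 float bounds (24 B) plus an 8-byte child reference, giving
// 4 * (24 + 8) = 128 B per node. Each user-geometry leaf is assumed to be
// a single 8-byte reference.
const std::size_t kAssumedNodeBytes = 4 * (24 + 8);
const std::size_t kAssumedLeafBytes = 8;

struct BvhFootprint {
    std::size_t nodeBytes;     // memory taken by inner nodes
    std::size_t leafBytes;     // memory taken by leaf references
    std::size_t totalBytes;    // nodes + leaves
    double bytesPerPrimitive;  // totalBytes / numPrimitives
};

// Hypothetical helper: estimates BVH memory from node and primitive counts.
BvhFootprint estimateFootprint(std::size_t numNodes, std::size_t numPrimitives) {
    assert(numPrimitives > 0);
    BvhFootprint f;
    f.nodeBytes = numNodes * kAssumedNodeBytes;
    f.leafBytes = numPrimitives * kAssumedLeafBytes;
    f.totalBytes = f.nodeBytes + f.leafBytes;
    f.bytesPerPrimitive = static_cast<double>(f.totalBytes) /
                          static_cast<double>(numPrimitives);
    return f;
}
```

With these assumed sizes, `estimateFootprint(621, 1600)` gives 79488 B for nodes and 12800 B for leaves, about 57.7 B per primitive with nodes taking about 86% of the total, which is consistent with the `perPrimitive` and `alignedNodes` figures in the log. At that cost per object, a large memory footprint is more likely to come from elsewhere in the application than from the BVH itself.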

Michael_C_9
Beginner

The large memory consumption turned out to be caused by an unnecessarily large number of calls to rtcNewUserGeometry. The number of calls scaled with the number of nodes in the BVH, which is why I thought the problem was related to the BVH.

Thanks for the help!
