<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Carsten, in Intel® Embree Ray Tracing Kernels</title>
    <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132358#M666</link>
    <description>&lt;P&gt;Hi Carsten,&lt;/P&gt;

&lt;P&gt;Thank you for the quick response. I commented out the asserts in the create(), setChildren(), and setBounds() functions, so my code no longer breaks when creating the leaf nodes. I also modified sah() to account for 4 children instead of 2; however, it looks like not all inner nodes are being created with 4 children, which causes sah() to throw an exception and halt the program. Are 4 children not supposed to be created at every inner node? Is this a result of my BuildSettings? Also, how do you determine an appropriate amount of extraSpace and sahBlockSize?&lt;/P&gt;

&lt;P&gt;Thank you for your help, here is my bvh_builder_device.cpp file:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;// ======================================================================== //
// Copyright 2009-2018 Intel Corporation                                    //
//                                                                          //
// Licensed under the Apache License, Version 2.0 (the "License");          //
// you may not use this file except in compliance with the License.         //
// You may obtain a copy of the License at                                  //
//                                                                          //
//     http://www.apache.org/licenses/LICENSE-2.0                           //
//                                                                          //
// Unless required by applicable law or agreed to in writing, software      //
// distributed under the License is distributed on an "AS IS" BASIS,        //
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. //
// See the License for the specific language governing permissions and      //
// limitations under the License.                                           //
// ======================================================================== //

#include "../common/tutorial/tutorial_device.h"

namespace embree
{	
  RTCScene g_scene  = nullptr;

  /* This function is called by the builder to signal progress and to
   * report memory consumption. */
  bool memoryMonitor(void* userPtr, ssize_t bytes, bool post) {
    return true;
  }

  bool buildProgress (void* userPtr, double f) {
    return true;
  }

  void splitPrimitive (const RTCBuildPrimitive* prim, unsigned int dim, float pos, RTCBounds* lprim, RTCBounds* rprim, void* userPtr)
  {
    assert(dim &amp;lt; 3);
    assert(prim-&amp;gt;geomID == 0);
    *(BBox3fa*) lprim = *(BBox3fa*) prim;
    *(BBox3fa*) rprim = *(BBox3fa*) prim;
    (&amp;amp;lprim-&amp;gt;upper_x)[dim] = pos;
    (&amp;amp;rprim-&amp;gt;lower_x)[dim] = pos;
  }

  struct Node
  {
    virtual float sah() = 0;
	unsigned int nodeType;
  };

  struct InnerNode : public Node
  {
    BBox3fa bounds[4];
    Node* children[4];

    InnerNode() {
      bounds[0] = empty; 
	  bounds[1] = empty;
	  bounds[3] = empty; 
	  bounds[4] = empty;
	  children[0] = nullptr; 
	  children[1] = nullptr; 
	  children[2] = nullptr; 
	  children[3] = nullptr;
	  nodeType = 1;
    }

    float sah() {
      return 1.0f + (area(bounds[0])*children[0]-&amp;gt;sah() + area(bounds[1])*children[1]-&amp;gt;sah() + area(bounds[2])*children[2]-&amp;gt;sah() + area(bounds[3])*children[3]-&amp;gt;sah())/area(merge(bounds[0],bounds[1],bounds[2],bounds[3]));
    }

    static void* create (RTCThreadLocalAllocator alloc, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      void* ptr = rtcThreadLocalAlloc(alloc,sizeof(InnerNode),16);
      return (void*) new (ptr) InnerNode;
    }

    static void  setChildren (void* nodePtr, void** childPtr, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      for (size_t i=0; i&amp;lt;numChildren; i++)
        ((InnerNode*)nodePtr)-&amp;gt;children[i] = (Node*) childPtr[i];
    }

    static void  setBounds (void* nodePtr, const RTCBounds** bounds, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      for (size_t i=0; i&amp;lt;numChildren; i++)
        ((InnerNode*)nodePtr)-&amp;gt;bounds[i] = *(const BBox3fa*) bounds[i];
    }
  };
  //leaf node seems to be everything on the bottom/lowest nodes/triangles
  struct LeafNode : public Node
  {
    unsigned id;
    BBox3fa bounds;

	LeafNode(unsigned id_new, const BBox3fa&amp;amp; bounds_new) {
		id = id_new;
		bounds = bounds_new;
		nodeType = 2;
	}

    float sah() {
      return 1.0f;
    }

    static void* create (RTCThreadLocalAllocator alloc, const RTCBuildPrimitive* prims, size_t numPrims, void* userPtr)
    {
      assert(numPrims == 1);
      void* ptr = rtcThreadLocalAlloc(alloc,sizeof(LeafNode),16);
      return (void*) new (ptr) LeafNode(prims-&amp;gt;primID,*(BBox3fa*)prims);
    }
  };
  

  void build(RTCBuildQuality quality, avector&amp;lt;RTCBuildPrimitive&amp;gt;&amp;amp; prims_i, char* cfg, size_t extraSpace = 0)
  {
    rtcSetDeviceMemoryMonitorFunction(g_device,memoryMonitor,nullptr);

    RTCBVH bvh = rtcNewBVH(g_device);

    avector&amp;lt;RTCBuildPrimitive&amp;gt; prims;
    prims.reserve(prims_i.size()+extraSpace);
    prims.resize(prims_i.size());

    /* settings for BVH build */
    RTCBuildArguments arguments = rtcDefaultBuildArguments();
    arguments.byteSize = sizeof(arguments);
    arguments.buildFlags = RTC_BUILD_FLAG_DYNAMIC;
    arguments.buildQuality = quality;
    arguments.maxBranchingFactor = 4;
    arguments.maxDepth = 1024;
    arguments.sahBlockSize = 2;
    arguments.minLeafSize = 1;
    arguments.maxLeafSize = 1;
    arguments.traversalCost = 1.0f;
    arguments.intersectionCost = 1.0f;
    arguments.bvh = bvh;
    arguments.primitives = prims.data();
    arguments.primitiveCount = prims.size();
    arguments.primitiveArrayCapacity = prims.capacity();
    arguments.createNode = InnerNode::create;
    arguments.setNodeChildren = InnerNode::setChildren;
    arguments.setNodeBounds = InnerNode::setBounds;
    arguments.createLeaf = LeafNode::create;
    arguments.splitPrimitive = splitPrimitive;
    arguments.buildProgress = buildProgress;
    arguments.userPtr = nullptr;
    

	/* we recreate the prims array here, as the builders modify this array */
	for (size_t j = 0; j &amp;lt; prims.size(); j++) {
		prims[j] = prims_i[j];
		
	}

    std::cout &amp;lt;&amp;lt; "Building BVH over " &amp;lt;&amp;lt; prims.size() &amp;lt;&amp;lt; " primitives, " &amp;lt;&amp;lt; std::flush;
    double t0 = getSeconds();
    Node* root = (Node*) rtcBuildBVH(&amp;amp;arguments);
    double t1 = getSeconds();
    const float sah = root ? root-&amp;gt;sah() : 0.0f;
    std::cout &amp;lt;&amp;lt; 1000.0f*(t1-t0) &amp;lt;&amp;lt; "ms, " &amp;lt;&amp;lt; 1E-6*double(prims.size())/(t1-t0) &amp;lt;&amp;lt; " Mprims/s, sah = " &amp;lt;&amp;lt; sah &amp;lt;&amp;lt; " [DONE]" &amp;lt;&amp;lt; std::endl;


    rtcReleaseBVH(bvh);
  }

  /* called by the C++ code for initialization */
  extern "C" void device_init (char* cfg)
  {
    /* set start render mode */
    renderTile = renderTileStandard;

    /* create random bounding boxes */
	const size_t N = 200;
    const size_t extraSpace = 1000;
    
    avector&amp;lt;RTCBuildPrimitive&amp;gt; prims;
    prims.resize(N);

	/*	Create primitives	*/
    for (size_t i=0; i&amp;lt;N; i++)
    {
      const float x = float(drand48());
      const float y = float(drand48());
      const float z = float(drand48());
      const Vec3fa p = 1000.0f*Vec3fa(x,y,z);
      const BBox3fa b = BBox3fa(p,p+Vec3fa(1.0f));

      RTCBuildPrimitive prim;
      prim.lower_x = b.lower.x;
      prim.lower_y = b.lower.y;
      prim.lower_z = b.lower.z;
      prim.geomID = 0;
      prim.upper_x = b.upper.x;
      prim.upper_y = b.upper.y;
      prim.upper_z = b.upper.z;
      prim.primID = (unsigned) i;
      prims[i] = prim;
    }

	/*	only want to test high quality build
    std::cout &amp;lt;&amp;lt; "Low quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_LOW,prims,cfg);

    std::cout &amp;lt;&amp;lt; "Normal quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_MEDIUM,prims,cfg);
	*/
    std::cout &amp;lt;&amp;lt; "High quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_HIGH,prims,cfg,extraSpace);
  }

  /* task that renders a single screen tile */
  void renderTileStandard(int taskIndex, int threadIndex, int* pixels,
                          const unsigned int width,
                          const unsigned int height,
                          const float time,
                          const ISPCCamera&amp;amp; camera,
                          const int numTilesX,
                          const int numTilesY)
  {
  }

  /* task that renders a single screen tile */
  void renderTileTask(int taskIndex, int threadIndex, int* pixels,
                      const unsigned int width,
                      const unsigned int height,
                      const float time,
                      const ISPCCamera&amp;amp; camera,
                      const int numTilesX,
                      const int numTilesY)
  {
  }

  /* called by the C++ code to render */
  extern "C" void device_render (int* pixels,
                                 const int width,
                                 const int height,
                                 const float time,
                                 const ISPCCamera&amp;amp; camera)
  {
  }

  /* called by the C++ code for cleanup */
  extern "C" void device_cleanup () {
  }
}
&lt;/PRE&gt;


&lt;P&gt;-Evan&lt;/P&gt;</description>
    <pubDate>Wed, 30 May 2018 15:27:12 GMT</pubDate>
    <dc:creator>Waxman__Evan</dc:creator>
    <dc:date>2018-05-30T15:27:12Z</dc:date>
    <item>
      <title>Building BVH4 or BVH8</title>
      <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132356#M664</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am trying to create a high-quality BVH4 structure using the BVH builder tutorial. I am aware that the tutorial creates a BVH2 structure, so I tried modifying a few arguments to accomplish this. I changed maxBranchingFactor within the RTCBuildArguments to a value of 4. Within the InnerNode struct, I also increased the number of bounds and children to 4. When calling setChildren and setBounds, I made sure that it would loop through all 4 children and bounds instead of 2 as it was before. However, it seems to break when creating the leaf nodes. Could anyone point me in the right direction on how to create a BVH4/8 structure using the BVH builder tutorial?&lt;/P&gt;

&lt;P&gt;Thank you,&lt;/P&gt;

&lt;P&gt;Evan&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 00:36:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132356#M664</guid>
      <dc:creator>Waxman__Evan</dc:creator>
      <dc:date>2018-05-30T00:36:17Z</dc:date>
    </item>
    <item>
      <title>Hi Evan,</title>
      <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132357#M665</link>
      <description>&lt;P&gt;Hi Evan,&lt;/P&gt;

&lt;P&gt;Could you post or PM your code so that I can have a look? Do you get a segfault when creating the leaf nodes? For a high-quality BVH you also need to reserve some extra space (BuildSettings.extraSpace) to store spatial splits. Other than that, one needs to modify a couple of BuildSettings entries like sahBlockSize, maxBranchingFactor, minLeafSize, maxLeafSize, quality, etc. to get a good high-quality n-wide BVH, but that should be it.&lt;/P&gt;

&lt;P&gt;Carsten.&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 15:03:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132357#M665</guid>
      <dc:creator>BenthinC_Intel</dc:creator>
      <dc:date>2018-05-30T15:03:37Z</dc:date>
    </item>
    <item>
      <title>Hi Carsten,</title>
      <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132358#M666</link>
      <description>&lt;P&gt;Hi Carsten,&lt;/P&gt;

&lt;P&gt;Thank you for the quick response. I commented out the asserts in the create(), setChildren(), and setBounds() functions, so my code no longer breaks when creating the leaf nodes. I also modified sah() to account for 4 children instead of 2; however, it looks like not all inner nodes are being created with 4 children, which causes sah() to throw an exception and halt the program. Are 4 children not supposed to be created at every inner node? Is this a result of my BuildSettings? Also, how do you determine an appropriate amount of extraSpace and sahBlockSize?&lt;/P&gt;

&lt;P&gt;Thank you for your help, here is my bvh_builder_device.cpp file:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;// ======================================================================== //
// Copyright 2009-2018 Intel Corporation                                    //
//                                                                          //
// Licensed under the Apache License, Version 2.0 (the "License");          //
// you may not use this file except in compliance with the License.         //
// You may obtain a copy of the License at                                  //
//                                                                          //
//     http://www.apache.org/licenses/LICENSE-2.0                           //
//                                                                          //
// Unless required by applicable law or agreed to in writing, software      //
// distributed under the License is distributed on an "AS IS" BASIS,        //
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. //
// See the License for the specific language governing permissions and      //
// limitations under the License.                                           //
// ======================================================================== //

#include "../common/tutorial/tutorial_device.h"

namespace embree
{	
  RTCScene g_scene  = nullptr;

  /* This function is called by the builder to signal progress and to
   * report memory consumption. */
  bool memoryMonitor(void* userPtr, ssize_t bytes, bool post) {
    return true;
  }

  bool buildProgress (void* userPtr, double f) {
    return true;
  }

  void splitPrimitive (const RTCBuildPrimitive* prim, unsigned int dim, float pos, RTCBounds* lprim, RTCBounds* rprim, void* userPtr)
  {
    assert(dim &amp;lt; 3);
    assert(prim-&amp;gt;geomID == 0);
    *(BBox3fa*) lprim = *(BBox3fa*) prim;
    *(BBox3fa*) rprim = *(BBox3fa*) prim;
    (&amp;amp;lprim-&amp;gt;upper_x)[dim] = pos;
    (&amp;amp;rprim-&amp;gt;lower_x)[dim] = pos;
  }

  struct Node
  {
    virtual float sah() = 0;
	unsigned int nodeType;
  };

  struct InnerNode : public Node
  {
    BBox3fa bounds[4];
    Node* children[4];

    InnerNode() {
      bounds[0] = empty; 
	  bounds[1] = empty;
	  bounds[3] = empty; 
	  bounds[4] = empty;
	  children[0] = nullptr; 
	  children[1] = nullptr; 
	  children[2] = nullptr; 
	  children[3] = nullptr;
	  nodeType = 1;
    }

    float sah() {
      return 1.0f + (area(bounds[0])*children[0]-&amp;gt;sah() + area(bounds[1])*children[1]-&amp;gt;sah() + area(bounds[2])*children[2]-&amp;gt;sah() + area(bounds[3])*children[3]-&amp;gt;sah())/area(merge(bounds[0],bounds[1],bounds[2],bounds[3]));
    }

    static void* create (RTCThreadLocalAllocator alloc, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      void* ptr = rtcThreadLocalAlloc(alloc,sizeof(InnerNode),16);
      return (void*) new (ptr) InnerNode;
    }

    static void  setChildren (void* nodePtr, void** childPtr, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      for (size_t i=0; i&amp;lt;numChildren; i++)
        ((InnerNode*)nodePtr)-&amp;gt;children[i] = (Node*) childPtr[i];
    }

    static void  setBounds (void* nodePtr, const RTCBounds** bounds, unsigned int numChildren, void* userPtr)
    {
      //assert(numChildren == 2);
      for (size_t i=0; i&amp;lt;numChildren; i++)
        ((InnerNode*)nodePtr)-&amp;gt;bounds[i] = *(const BBox3fa*) bounds[i];
    }
  };
  //leaf node seems to be everything on the bottom/lowest nodes/triangles
  struct LeafNode : public Node
  {
    unsigned id;
    BBox3fa bounds;

	LeafNode(unsigned id_new, const BBox3fa&amp;amp; bounds_new) {
		id = id_new;
		bounds = bounds_new;
		nodeType = 2;
	}

    float sah() {
      return 1.0f;
    }

    static void* create (RTCThreadLocalAllocator alloc, const RTCBuildPrimitive* prims, size_t numPrims, void* userPtr)
    {
      assert(numPrims == 1);
      void* ptr = rtcThreadLocalAlloc(alloc,sizeof(LeafNode),16);
      return (void*) new (ptr) LeafNode(prims-&amp;gt;primID,*(BBox3fa*)prims);
    }
  };
  

  void build(RTCBuildQuality quality, avector&amp;lt;RTCBuildPrimitive&amp;gt;&amp;amp; prims_i, char* cfg, size_t extraSpace = 0)
  {
    rtcSetDeviceMemoryMonitorFunction(g_device,memoryMonitor,nullptr);

    RTCBVH bvh = rtcNewBVH(g_device);

    avector&amp;lt;RTCBuildPrimitive&amp;gt; prims;
    prims.reserve(prims_i.size()+extraSpace);
    prims.resize(prims_i.size());

    /* settings for BVH build */
    RTCBuildArguments arguments = rtcDefaultBuildArguments();
    arguments.byteSize = sizeof(arguments);
    arguments.buildFlags = RTC_BUILD_FLAG_DYNAMIC;
    arguments.buildQuality = quality;
    arguments.maxBranchingFactor = 4;
    arguments.maxDepth = 1024;
    arguments.sahBlockSize = 2;
    arguments.minLeafSize = 1;
    arguments.maxLeafSize = 1;
    arguments.traversalCost = 1.0f;
    arguments.intersectionCost = 1.0f;
    arguments.bvh = bvh;
    arguments.primitives = prims.data();
    arguments.primitiveCount = prims.size();
    arguments.primitiveArrayCapacity = prims.capacity();
    arguments.createNode = InnerNode::create;
    arguments.setNodeChildren = InnerNode::setChildren;
    arguments.setNodeBounds = InnerNode::setBounds;
    arguments.createLeaf = LeafNode::create;
    arguments.splitPrimitive = splitPrimitive;
    arguments.buildProgress = buildProgress;
    arguments.userPtr = nullptr;
    

	/* we recreate the prims array here, as the builders modify this array */
	for (size_t j = 0; j &amp;lt; prims.size(); j++) {
		prims[j] = prims_i[j];
		
	}

    std::cout &amp;lt;&amp;lt; "Building BVH over " &amp;lt;&amp;lt; prims.size() &amp;lt;&amp;lt; " primitives, " &amp;lt;&amp;lt; std::flush;
    double t0 = getSeconds();
    Node* root = (Node*) rtcBuildBVH(&amp;amp;arguments);
    double t1 = getSeconds();
    const float sah = root ? root-&amp;gt;sah() : 0.0f;
    std::cout &amp;lt;&amp;lt; 1000.0f*(t1-t0) &amp;lt;&amp;lt; "ms, " &amp;lt;&amp;lt; 1E-6*double(prims.size())/(t1-t0) &amp;lt;&amp;lt; " Mprims/s, sah = " &amp;lt;&amp;lt; sah &amp;lt;&amp;lt; " [DONE]" &amp;lt;&amp;lt; std::endl;


    rtcReleaseBVH(bvh);
  }

  /* called by the C++ code for initialization */
  extern "C" void device_init (char* cfg)
  {
    /* set start render mode */
    renderTile = renderTileStandard;

    /* create random bounding boxes */
	const size_t N = 200;
    const size_t extraSpace = 1000;
    
    avector&amp;lt;RTCBuildPrimitive&amp;gt; prims;
    prims.resize(N);

	/*	Create primitives	*/
    for (size_t i=0; i&amp;lt;N; i++)
    {
      const float x = float(drand48());
      const float y = float(drand48());
      const float z = float(drand48());
      const Vec3fa p = 1000.0f*Vec3fa(x,y,z);
      const BBox3fa b = BBox3fa(p,p+Vec3fa(1.0f));

      RTCBuildPrimitive prim;
      prim.lower_x = b.lower.x;
      prim.lower_y = b.lower.y;
      prim.lower_z = b.lower.z;
      prim.geomID = 0;
      prim.upper_x = b.upper.x;
      prim.upper_y = b.upper.y;
      prim.upper_z = b.upper.z;
      prim.primID = (unsigned) i;
      prims[i] = prim;
    }

	/*	only want to test high quality build
    std::cout &amp;lt;&amp;lt; "Low quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_LOW,prims,cfg);

    std::cout &amp;lt;&amp;lt; "Normal quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_MEDIUM,prims,cfg);
	*/
    std::cout &amp;lt;&amp;lt; "High quality BVH build:" &amp;lt;&amp;lt; std::endl;
    build(RTC_BUILD_QUALITY_HIGH,prims,cfg,extraSpace);
  }

  /* task that renders a single screen tile */
  void renderTileStandard(int taskIndex, int threadIndex, int* pixels,
                          const unsigned int width,
                          const unsigned int height,
                          const float time,
                          const ISPCCamera&amp;amp; camera,
                          const int numTilesX,
                          const int numTilesY)
  {
  }

  /* task that renders a single screen tile */
  void renderTileTask(int taskIndex, int threadIndex, int* pixels,
                      const unsigned int width,
                      const unsigned int height,
                      const float time,
                      const ISPCCamera&amp;amp; camera,
                      const int numTilesX,
                      const int numTilesY)
  {
  }

  /* called by the C++ code to render */
  extern "C" void device_render (int* pixels,
                                 const int width,
                                 const int height,
                                 const float time,
                                 const ISPCCamera&amp;amp; camera)
  {
  }

  /* called by the C++ code for cleanup */
  extern "C" void device_cleanup () {
  }
}
&lt;/PRE&gt;


&lt;P&gt;-Evan&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 15:27:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132358#M666</guid>
      <dc:creator>Waxman__Evan</dc:creator>
      <dc:date>2018-05-30T15:27:12Z</dc:date>
    </item>
    <item>
      <title>Hi Evan,</title>
      <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132359#M667</link>
      <description>&lt;P&gt;Hi Evan,&lt;/P&gt;

&lt;P&gt;Thanks for sharing the code. First, there's a small out-of-bounds access here:&lt;/P&gt;

&lt;P&gt;&lt;CODE class="plain"&gt;InnerNode() {&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE class="plain"&gt;...&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE class="plain"&gt;bounds[4] = empty;&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE class="plain"&gt;}&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;You are correct that inner nodes do not always have four valid children. Therefore, the sah() function should look something like this:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;float sah() {
  float sum_area = 0.0f;
  BBox3fa merged_bounds(empty);
  for (size_t i=0;i&amp;lt;4;i++)
  {
    if (children[i] == nullptr) break;
    sum_area += area(bounds[i])*children[i]-&amp;gt;sah();
    merged_bounds.extend(bounds[i]);
  }
  return 1.0f + sum_area / area(merged_bounds);
}&lt;/PRE&gt;
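
&lt;P&gt;As a self-contained illustration of the same idea, here is a minimal sketch that compiles without Embree. The Box, Leaf and Inner types and the add() helper are hypothetical stand-ins for BBox3fa and the tutorial's node classes, not Embree API. Instead of stopping at the first nullptr child, this variant stores the number of valid children, which is an assumption about how one might use the numChildren argument that the builder passes to the callbacks.&lt;/P&gt;

<![CDATA[
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Embree's BBox3fa: an axis-aligned box.
struct Box {
  float lo[3];
  float hi[3];
};

// An "empty" box whose limits are replaced by any real box merged into it.
inline Box emptyBox() {
  return Box{{1e30f, 1e30f, 1e30f}, {-1e30f, -1e30f, -1e30f}};
}

// Grow box a so that it also encloses box b.
inline void extend(Box& a, const Box& b) {
  for (int k = 0; k < 3; ++k) {
    a.lo[k] = std::min(a.lo[k], b.lo[k]);
    a.hi[k] = std::max(a.hi[k], b.hi[k]);
  }
}

// Surface area of a box.
inline float area(const Box& b) {
  const float dx = b.hi[0] - b.lo[0];
  const float dy = b.hi[1] - b.lo[1];
  const float dz = b.hi[2] - b.lo[2];
  return 2.0f * (dx * dy + dy * dz + dz * dx);
}

struct Node {
  virtual ~Node() = default;
  virtual float sah() const = 0;
};

struct Leaf : Node {
  float sah() const override { return 1.0f; }
};

// Inner node with up to four children; fewer may actually be valid.
struct Inner : Node {
  static constexpr std::size_t kMaxChildren = 4;
  Box bounds[kMaxChildren];
  Node* children[kMaxChildren] = {};
  std::size_t numChildren = 0;

  // Append one child together with its bounding box.
  void add(Node* child, const Box& b) {
    assert(numChildren < kMaxChildren);
    bounds[numChildren] = b;
    children[numChildren] = child;
    ++numChildren;
  }

  // SAH cost summed over the valid children only.
  float sah() const override {
    if (numChildren == 0) return 1.0f;
    float sumArea = 0.0f;
    Box merged = emptyBox();
    for (std::size_t i = 0; i < numChildren; ++i) {
      sumArea += area(bounds[i]) * children[i]->sah();
      extend(merged, bounds[i]);
    }
    return 1.0f + sumArea / area(merged);
  }
};
```
]]>

&lt;P&gt;For two unit cubes placed side by side, each child has area 6 and the merged box has area 10, so the node cost comes out as 1 + 12/10 = 2.2.&lt;/P&gt;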

&lt;P&gt;An appropriate amount of extra space is in the range of 40-100% of the number of input primitives. Obviously, there is a performance vs. memory trade-off here.&lt;/P&gt;

&lt;P&gt;sahBlockSize should be 4 for a 4-wide BVH and 8 for an 8-wide BVH.&lt;/P&gt;
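
&lt;P&gt;To make these two rules of thumb concrete, here is a small sketch that derives the settings from the primitive count. The WideBvhSettings struct and the suggestSettings() helper are hypothetical names used for illustration and are not part of Embree; only the fields they mirror (maxBranchingFactor, sahBlockSize, primitiveArrayCapacity) come from the RTCBuildArguments used in the posted code.&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container for the values one would copy into RTCBuildArguments.
struct WideBvhSettings {
  unsigned maxBranchingFactor;
  unsigned sahBlockSize;
  std::size_t primitiveArrayCapacity;
};

// width:        4 or 8, the only BVH widths discussed in this thread.
// extraPercent: extra space as a percentage of the input primitive count
//               (the suggested range is 40 to 100).
inline WideBvhSettings suggestSettings(std::size_t numPrims,
                                       unsigned width,
                                       unsigned extraPercent) {
  assert(width == 4 || width == 8);
  // Integer arithmetic keeps the result exact for typical primitive counts.
  const std::size_t extraSpace = numPrims * extraPercent / 100;
  // sahBlockSize is set equal to the branching factor.
  return WideBvhSettings{width, width, numPrims + extraSpace};
}
```
]]>

&lt;P&gt;With the 200 primitives from the posted code and 50% extra space, this gives a capacity of 300 and a sahBlockSize of 4. The capacity is the value one would pass to prims.reserve() before the build.&lt;/P&gt;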

&lt;P&gt;Hope this helps.&lt;/P&gt;

&lt;P&gt;C.&lt;/P&gt;</description>
      <pubDate>Thu, 31 May 2018 09:36:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132359#M667</guid>
      <dc:creator>BenthinC_Intel</dc:creator>
      <dc:date>2018-05-31T09:36:42Z</dc:date>
    </item>
    <item>
      <title>Oops, didn't catch that out</title>
      <link>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132360#M668</link>
      <description>&lt;P&gt;Oops, didn't catch that out-of-bounds error.&lt;BR /&gt;
	&lt;BR /&gt;
	This really helps, thank you!&lt;/P&gt;

&lt;P&gt;-Evan&lt;/P&gt;</description>
      <pubDate>Thu, 31 May 2018 15:21:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Embree-Ray-Tracing-Kernels/Building-BVH4-or-BVH8/m-p/1132360#M668</guid>
      <dc:creator>Waxman__Evan</dc:creator>
      <dc:date>2018-05-31T15:21:41Z</dc:date>
    </item>
  </channel>
</rss>

