Thanks for this nice library!
I want to use Embree to create my own BVH2 (for GPU traversal, so I don't want the BVH4/8).
I got the tutorial bvh builder running, but have some questions about it:
a) How is the memory of the nodes managed/who owns it? They are allocated with the RTCThreadLocalAllocator. Does the Embree BVH own it and destroy it when the BVH is deleted via Embree API?
b) Internally I saw that the primitives are kept in range<size_t>, but in RTCCreateLeafFunc, one only gets
const RTCBuildPrimitive* prims, size_t numPrims
Is it safe to compute the start offset from the prims pointer (relative to the original primitive array pointer that was passed to the build call), since I want to store the range in my LeafNode? And is it safe to assume that the range will not change afterwards (i.e. the primitives in the range will not be modified or re-sorted), or do I need to copy the primitive IDs at that point?
c) I guess the input primitives are modified when calling rtcBuildBVH, right? Can I free the memory after building the BVH, or does the BVH still reference the primitives somehow (apart from the ranges that I would store in my leaves)?
d) Would it be a better alternative to create a BVH4 with normal Embree Mesh API and then convert it to a BVH2 by adding additional inner nodes, or is that a bad alternative?
a) Yes, the memory allocated with rtcThreadLocalAlloc is freed when the BVH object is deleted.
b) Yes this will work with our implementation.
c) You can free the RTCBuildPrimitive array once the BVH has been built.
d) Embree internally uses the exact same builder infrastructure, so with identical settings it is possible to build the same BVH4 through the BVH build API as well. However, it is best to build the BVH2 directly. The BVH4 builder essentially performs an implicit BVH2-to-BVH4 conversion on the fly; if you undo that, you will most likely end up with the original binary split tree of the geometry anyway.
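For point b), a minimal sketch of deriving the leaf range from the pointer offset might look like the following. It is not taken from the Embree tutorial: `BuildPrim` is a stand-in that mirrors the fields of RTCBuildPrimitive so the snippet compiles without the Embree headers, and `LeafNode`, `BuildContext` and `makeLeaf` are hypothetical names. In a real RTCCreateLeafFunc the node would be allocated with rtcThreadLocalAlloc and returned as a `void*`, and the base pointer would be passed in through the callback's user pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in with the same fields as RTCBuildPrimitive (assumption: used here
// only so the sketch is self-contained).
struct BuildPrim {
    float lower_x, lower_y, lower_z;
    unsigned int geomID;
    float upper_x, upper_y, upper_z;
    unsigned int primID;
};

// Hypothetical leaf layout: a range into the build primitive array.
struct LeafNode {
    std::size_t begin;  // index of the first primitive of this leaf
    std::size_t count;  // number of primitives in this leaf
};

// Hypothetical user data carrying the base pointer of the primitive array
// that was handed to the builder.
struct BuildContext {
    const BuildPrim* base;
};

// Same inputs as the leaf callback (minus the allocator); returns the range
// by value instead of allocating a node.
LeafNode makeLeaf(const BuildPrim* prims, std::size_t numPrims, void* userPtr) {
    const BuildContext* ctx = static_cast<const BuildContext*>(userPtr);
    // Pointer difference gives the element offset inside the original array.
    const std::ptrdiff_t offset = prims - ctx->base;
    assert(offset >= 0);
    return LeafNode{static_cast<std::size_t>(offset), numPrims};
}
```

Storing only `begin` and `count` relies on the answer to b) above, i.e. that the primitives inside a leaf's range are not moved after the leaf callback has run.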
Great, thanks for the fast reply! I got the BVH2 generation running now and I am really impressed by the speedup compared to the single-threaded NVIDIA example BVH code I had before. I can build a BVH for 400K triangles in 60 ms in normal mode and 400 ms in high-quality mode, compared to 14 seconds (high quality) with the original BVH code.
I have one more question: is there an easy way to find out how much of the extra space was used? I need an array of geom/primitive IDs, so I copy those from the input primitives after building, but in high-quality mode I don't know how much of the extra space contains valid IDs. I could scan the tree for the highest index or fill the end with invalid IDs beforehand, but it would be easier if the value were available through the API.
The high-quality builder will use the entire RTCBuildPrimitive array provided initially, and there will be holes in the ID range with unused items. Unfortunately, this means that if you want to compact your ID ranges, you have to traverse the tree and reassign all IDs.
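A rough sketch of such a compaction pass is shown below, assuming leaves store a (begin, count) range into the build primitive array. The `Node` layout, the `PrimRef` struct and the `compactIds` function are hypothetical and not part of Embree; the idea is simply to walk the tree, copy each leaf's IDs into a dense array in traversal order, and rewrite the leaf's range to point into that array.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical ID pair copied out of the build primitives.
struct PrimRef {
    unsigned int geomID;
    unsigned int primID;
};

// Hypothetical BVH2 node stored in a flat array.
struct Node {
    bool isLeaf;
    int left, right;           // child indices (inner nodes only)
    std::size_t begin, count;  // primitive range (leaf nodes only)
};

// Walks the tree from `root`, gathers the IDs referenced by the leaves into a
// dense array and updates each leaf's `begin` to the new position.
std::vector<PrimRef> compactIds(std::vector<Node>& nodes, int root,
                                const std::vector<PrimRef>& buildPrims) {
    std::vector<PrimRef> compact;
    std::vector<int> stack{root};
    while (!stack.empty()) {
        const int index = stack.back();
        stack.pop_back();
        Node& node = nodes[index];
        if (node.isLeaf) {
            const std::size_t newBegin = compact.size();
            for (std::size_t k = 0; k < node.count; ++k)
                compact.push_back(buildPrims[node.begin + k]);
            node.begin = newBegin;  // range now refers to the dense array
        } else {
            // Push right first so the left subtree is visited first.
            stack.push_back(node.right);
            stack.push_back(node.left);
        }
    }
    return compact;
}
```

After this pass the returned array contains only entries that some leaf actually references, so its size also answers how much of the array is in use. As a side effect, the IDs end up ordered by traversal, which may or may not matter for the GPU layout.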