I'm working on collision detection between two triangulated meshes (~5k vertices and ~10k faces each) with Embree, using the following two methods.
First, using ray tracing: from about half of the vertices of one mesh, I issue an rtcIntersect query in the direction of the other mesh. In this case, the built-in acceleration structures for RTC_GEOMETRY_TYPE_TRIANGLE are used.
Second, using the rtcCollide function. For the RTC_GEOMETRY_TYPE_USER geometry I provide a bounds callback for BVH construction and a callback for the intersection test between primitives (triangles), similar to the triangle bounds function and the triangle_triangle_intersection function from the "Collide" tutorial.
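For reference, the per-triangle bounds computation that such a bounds callback performs can be sketched as below. This is only an illustration and not the tutorial's code: `Vec3`, `Triangle`, `Mesh`, `Bounds` and `triangleBounds` are hypothetical names, and `Bounds` is a plain stand-in for the lower/upper corners that Embree's `RTCBounds` stores. In a real callback the result would be written to the bounds output argument instead of being returned.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical mesh types used only for this sketch.
struct Vec3 { float x, y, z; };
struct Triangle { unsigned v0, v1, v2; };
struct Mesh {
  std::vector<Vec3> positions;
  std::vector<Triangle> triangles;
};

// Stand-in for an axis-aligned bounding box (lower and upper corner).
struct Bounds { float lower[3]; float upper[3]; };

// Computes the axis-aligned bounding box of triangle `primID`.
Bounds triangleBounds(const Mesh& mesh, unsigned primID) {
  assert(primID < mesh.triangles.size());
  const Triangle& t = mesh.triangles[primID];
  const Vec3& p0 = mesh.positions[t.v0];
  const Vec3& p1 = mesh.positions[t.v1];
  const Vec3& p2 = mesh.positions[t.v2];

  Bounds out;
  // Per-axis minimum of the three vertices gives the lower corner.
  out.lower[0] = std::min({p0.x, p1.x, p2.x});
  out.lower[1] = std::min({p0.y, p1.y, p2.y});
  out.lower[2] = std::min({p0.z, p1.z, p2.z});
  // Per-axis maximum gives the upper corner.
  out.upper[0] = std::max({p0.x, p1.x, p2.x});
  out.upper[1] = std::max({p0.y, p1.y, p2.y});
  out.upper[2] = std::max({p0.z, p1.z, p2.z});
  return out;
}
```

Tight boxes matter here, because every pair of overlapping boxes reported by the broad phase ends up in the collide callback.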
But the broad phase doesn't give a performance improvement. The second method is ~10 times slower than the first (one rtcCollide call currently takes 0.119 seconds on average, while ~5k rtcIntersect calls take ~0.01 seconds). I have a couple of questions.
1) Is it possible to customize the BVH for user geometries?
2) If yes, could that improve the performance of rtcCollide?
The two approaches do not compute collisions with the same quality: tracing rays can miss intersections, while rtcCollide is guaranteed to find all of them.
The slow performance of your rtcCollide implementation is likely caused by the callback. How do you accumulate all intersections in parallel? Do you filter out self-intersections early? Running a performance analyzer (such as VTune) can help you locate the issue.
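To make the two questions above concrete, here is a minimal sketch of one possible callback layout, not the tutorial's implementation. It assumes the callback may be invoked from several threads at once. `Collision` is a stand-in with the same four fields as Embree's `RTCCollision`, while `CollisionSink`, `isSelfPair`, `narrowPhase` and `collideCallback` are hypothetical names. Each batch is filtered into a local buffer first, so the shared result vector is locked at most once per callback invocation rather than once per pair.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Stand-in with the same fields as Embree's RTCCollision.
struct Collision { unsigned geomID0, primID0, geomID1, primID1; };

// Hypothetical shared state passed to the callback through userPtr.
struct CollisionSink {
  std::mutex mutex;
  std::vector<Collision> hits;
  // Optional exact triangle-triangle test; assumed to be supplied by the caller.
  bool (*narrowPhase)(const Collision&) = nullptr;
};

// A primitive paired with itself is never a real collision.
inline bool isSelfPair(const Collision& c) {
  return c.geomID0 == c.geomID1 && c.primID0 == c.primID1;
}

// Shaped like a collide callback: user pointer, candidate pairs, pair count.
void collideCallback(void* userPtr, Collision* collisions, unsigned numCollisions) {
  auto* sink = static_cast<CollisionSink*>(userPtr);

  // Filter into a local buffer without holding the lock.
  std::vector<Collision> local;
  local.reserve(numCollisions);
  for (unsigned i = 0; i < numCollisions; ++i) {
    const Collision& c = collisions[i];
    if (isSelfPair(c)) continue;  // cheap rejection before the exact test
    if (sink->narrowPhase && !sink->narrowPhase(c)) continue;
    local.push_back(c);
  }
  if (local.empty()) return;

  // One short critical section per batch.
  std::lock_guard<std::mutex> lock(sink->mutex);
  sink->hits.insert(sink->hits.end(), local.begin(), local.end());
}
```

Whether this layout helps in a given case is something only a profiler run can confirm. Per-thread result buffers that are merged after the query would avoid the lock entirely.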
Have a look at the rtcCollide tutorial for example code.