Why not have a synergy between the Intel 3D cameras and the stick?
Offer direct input from the Intel® RealSense™ Depth Camera D435 or D415
and have it jack straight into the compute stick.
Have the compute stick deliver the image, a list of boxes, and the distances
to the objects back to the application as an optional layer.
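To make the request concrete, here is a minimal sketch of what that optional "boxes plus distances" layer could look like on the application side. It assumes the depth map is already aligned to the color image and in metres; the function names and the synthetic data are illustrative, not any real RealSense or NCS API.

```python
# Hedged sketch: given detection boxes from the stick and an aligned depth
# map, estimate per-object distance as the median depth inside each box.

def box_distance(depth_map, box):
    """Median depth (metres) inside box = (x0, y0, x1, y1), skipping
    zero readings (RealSense-style cameras report 0 where depth is invalid)."""
    x0, y0, x1, y1 = box
    samples = [depth_map[y][x]
               for y in range(y0, y1)
               for x in range(x0, x1)
               if depth_map[y][x] > 0]
    if not samples:
        return None
    samples.sort()
    return samples[len(samples) // 2]

def annotate(depth_map, boxes):
    """Return (box, distance) pairs -- the proposed optional layer."""
    return [(box, box_distance(depth_map, box)) for box in boxes]

if __name__ == "__main__":
    # 6x6 synthetic depth map: an object at 1.2 m against a 3 m background.
    depth = [[3.0] * 6 for _ in range(6)]
    for y in range(1, 4):
        for x in range(2, 5):
            depth[y][x] = 1.2
    print(annotate(depth, [(2, 1, 5, 4)]))  # box covering the object
```

Using the median rather than the mean keeps one bad or zero pixel from skewing the reported distance.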
On another subject, not quite so far-fetched: can you please add official
support for Tiny YOLO? Almost every demo I see uses it,
so why not give it direct, highly optimized support?
Just to clarify, is there any support at all for the 3D camera, or for multiple cameras used for stereoscopic 3D image recognition?
Since you're asking it to plug directly in, and there is nowhere on the stick to connect it, I'm not sure if you've already found a way to use the stereo image and you're just looking for a better way, or if you have no solution for 3D imaging.
I would be interested in tinkering with stereo imaging, especially if I can get cameras with good depth of field so that lots of objects are in focus at the same time. The idea of constructing a network that takes multiple simultaneous images from different perspectives and training it to recognize objects is really cool. Supporting an existing out-of-the-box solution that Intel has created in their 3D camera is a great idea as well. Your idea of plugging it in directly might work too, although they do have an embedded version of the device for plugging into your own custom circuit board.
So, have you developed an application for 3D image recognition? If so, what hardware are you using? I haven't seen any examples of 3D on here yet. I was thinking of throwing together some sort of robot to wander around the house looking at the dogs and the people. That would be cool.
Where can you get stock stereo images to train your network, though?
So, have you developed an application for 3D image recognition? <<
No, I'm working on using depth as a filter for the ROIs to subdivide the image: using the stereo depth map to create an ROI, then sending that to the stick for recognition. It's still a work in progress (how much depth to include and how to decide that, camera orientation relative to the objects,
coding the scaling and cropping, etc.). So instead of the logical guessing that YOLO does, I want to see if I can mimic what your eyes do, connecting items based on depth as well as color.
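The depth-as-ROI idea above can be sketched roughly like this: bucket an aligned depth map into coarse depth bands, flood-fill each band into connected regions, and emit each region's bounding box as an ROI to crop and send to the stick. This is only an illustration of the approach; the band width and minimum region size are arbitrary assumptions, and a real version would work on camera frames rather than a toy grid.

```python
# Hedged sketch: depth-band segmentation into candidate ROIs.

def depth_rois(depth_map, band=0.5, min_pixels=4):
    """Group pixels into connected regions of similar depth and return
    their bounding boxes as (x0, y0, x1, y1) crops for recognition."""
    h, w = len(depth_map), len(depth_map[0])
    seen = [[False] * w for _ in range(h)]
    rois = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or depth_map[sy][sx] <= 0:
                continue  # skip visited or invalid (zero) pixels
            b = int(depth_map[sy][sx] / band)  # depth band of the seed
            stack, pixels = [(sx, sy)], []
            seen[sy][sx] = True
            while stack:  # flood fill within this depth band
                x, y = stack.pop()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h and not seen[ny][nx]
                            and depth_map[ny][nx] > 0
                            and int(depth_map[ny][nx] / band) == b):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            if len(pixels) >= min_pixels:  # ignore tiny speckle regions
                xs = [p[0] for p in pixels]
                ys = [p[1] for p in pixels]
                rois.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return rois

if __name__ == "__main__":
    depth = [[3.0] * 8 for _ in range(6)]  # 3 m background
    for y in range(1, 4):
        for x in range(1, 4):
            depth[y][x] = 1.0              # near object
    for y in range(2, 5):
        for x in range(5, 7):
            depth[y][x] = 1.9              # mid-range object
    print(depth_rois(depth))
```

Each returned box would then be cropped, scaled to the network's input size, and sent to the stick, so the recognizer only ever sees one depth-coherent region at a time instead of guessing over the whole frame.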