
measuring length of object using depth and rgb frames

AVorn
Beginner
9,393 Views

Hello, I am wondering what the best way is to measure the length or width of objects using aligned RGB and depth (16-bit grayscale) frames which I captured with an SR300 (I have the device, so I know its parameters) and saved to disk.

I assume that I first need to convert the pixels into (x, y, z) coordinates and calculate the distance of the object from the camera using the depth frame, but I am not quite sure how to do it.

I know that the SDK offers such methods, but I have to implement them myself, because I want to do the measuring on frames (TIFF format) which I have already saved to disk.

Thank you very much

0 Kudos
10 Replies
jb455
Valued Contributor II
7,518 Views

This thread may help with the first part of your problem, getting the frames into the SDK and aligned (which will probably be easier than implementing it all yourself): https://github.com/IntelRealSense/librealsense/issues/1274

Once you have the depth and colour aligned you can obtain the point cloud (XYZ points), then just calculate the distance between two points using Pythagoras.
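A minimal sketch of that last step, assuming you already have the two deprojected (x, y, z) points in metres (the coordinates below are made up for illustration):

```python
import numpy as np

# Two deprojected points in metres, e.g. the two ends of the object to measure.
# These coordinates are placeholders.
p1 = np.array([0.10, -0.02, 0.85])
p2 = np.array([-0.05, 0.01, 0.83])

# Euclidean (Pythagorean) distance between the points, in metres.
length = np.linalg.norm(p1 - p2)
print(f"Object length: {length:.3f} m")
```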

AVorn
Beginner
7,518 Views

Thank you very much for your quick reply. My RGB and depth frames are already aligned (I used pyrealsense2 as in that thread), so I don't need to implement that part.

For example, pixel [24, 56] with RGB value [100, 150, 125] is mapped to the depth frame's pixel [24, 56] with depth value 1663.

Now that I have those aligned frames, I am wondering how to use them to obtain the point cloud in Python.

jb455
Valued Contributor II
7,518 Views

Ah right, if you have aligned depth and the intrinsics of the colour camera you can calculate the point cloud using rs2_deproject_pixel_to_point (https://github.com/IntelRealSense/librealsense/blob/master/include/librealsense2/rsutil.h#L46), the process of which is explained here: https://threeconstants.wordpress.com/tag/pinhole-camera-model/
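If you prefer to do it yourself on the saved TIFFs, here is a minimal sketch of the same no-distortion pinhole deprojection; the intrinsic values and depth scale below are placeholders, so substitute the ones reported by your SR300 (e.g. via profile.get_intrinsics() and depth_sensor.get_depth_scale() in pyrealsense2):

```python
import numpy as np

# Placeholder intrinsics for the colour stream the depth was aligned to;
# replace these with the values from your own device.
FX, FY = 617.0, 617.0      # focal lengths in pixels
PPX, PPY = 319.5, 239.5    # principal point in pixels
DEPTH_SCALE = 0.000125     # metres per depth unit; query get_depth_scale() to be sure

def deproject(u, v, depth_raw):
    """Map pixel (u, v) with a raw 16-bit depth value to a 3D point in metres."""
    z = depth_raw * DEPTH_SCALE
    x = (u - PPX) * z / FX
    y = (v - PPY) * z / FY
    return np.array([x, y, z])

# Example: the pixel you mentioned earlier.
print(deproject(24, 56, 1663))
```

Applying that to every pixel of the depth image gives you the point cloud, and the distance between any two deprojected points is then the Pythagoras step above.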

AVorn
Beginner
7,518 Views
MartyG
Honored Contributor III
7,518 Views
PMalt3
Beginner
7,518 Views

Hello, MartyG! It looks like the provided link is no longer working. Can you update it?

Thanks!

MartyG
Honored Contributor III
7,518 Views

Hi PMalt3,

 

This tutorial is available here:

https://github.com/dorodnic/binder_test/blob/master/pointcloud.ipynb

PMalt3
Beginner
7,518 Views
matanster4
Beginner
6,873 Views

As some of the links in this thread are obsolete, I'd like to check my understanding of this topic while refreshing the answer to this key question.


Please correct me if I'm wrong. I would assume that to get the distance between two points, based on the distance of each point from the sensor (given by the depth sensor) and the projected X and Y coordinates given in the RGB data, we need the focal length of the RGB camera to enable the basic algebra and trigonometry yielding the physical x and y values for each point.


Is it the case that we can avoid much of this explicit math by calling the API endpoint for deprojecting a pixel to 3D space, or alternatively by generating a point cloud of the scene?
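For context, this is the kind of point-cloud call I have in mind, based on the pyrealsense2 examples; it is only a sketch, and the default stream configuration is an assumption on my part:

```python
import numpy as np
import pyrealsense2 as rs

# Start a pipeline with the default stream configuration (assumption).
pipeline = rs.pipeline()
pipeline.start()
try:
    frames = pipeline.wait_for_frames()
    depth = frames.get_depth_frame()
    color = frames.get_color_frame()

    # Compute the point cloud for the whole scene.
    pc = rs.pointcloud()
    pc.map_to(color)                 # texture-map the cloud to the colour image
    points = pc.calculate(depth)

    # Vertices arrive as a flat buffer of float32 (x, y, z) triples in metres.
    xyz = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)
    print(xyz.shape)
finally:
    pipeline.stop()
```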

 

Underlying that, I would also assume that both the RGB image and the depth map are adjusted by the API to mimic a single lens, even though the device obviously uses multiple optical components (lenses) to produce them. So the focal length must be a virtual focal length and not directly the focal length of the physical optics.

 

I would be thankful to learn from you whether I have got this right, or whether it is not entirely the case, before placing an order for the device.

 

Thanks!!!

matanster4
Beginner
6,870 Views

I think the most directly related documentation is here, where there is a reference to example code that aligns the depth and RGB images. My take from there is that there is an API for getting the 3D coordinates of the entire scene as a "point cloud", after explicitly using the alignment part of the API, and that deprojection of just a single 2D coordinate is also available somewhere in the API.
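To check that I am reading it right, here is roughly the flow I have in mind, only as a sketch; the pixel coordinates and the default pipeline configuration are placeholders:

```python
import pyrealsense2 as rs

# Align depth to the colour stream and deproject a single pixel.
pipeline = rs.pipeline()
pipeline.start()                            # default configuration (assumption)
align = rs.align(rs.stream.color)
try:
    frames = align.process(pipeline.wait_for_frames())
    depth = frames.get_depth_frame()
    intrin = depth.profile.as_video_stream_profile().get_intrinsics()

    u, v = 320, 240                         # arbitrary example pixel
    dist = depth.get_distance(u, v)         # depth in metres at that pixel
    point = rs.rs2_deproject_pixel_to_point(intrin, [u, v], dist)
    print(point)                            # [x, y, z] in metres
finally:
    pipeline.stop()
```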

I'm not sure why stereo disparity is mentioned for the D400 series there, as I'd expect that aspect to be handled internally by the SDK. 
