Software Archive
Read-only legacy content

How do I interpret the depth map in MATLAB?

Bharath_L_
Beginner

I was able to view the image from the depth stream in MATLAB (using the webcam from the Hardware Support Package). However, it does not look the same way as it does in the Camera Explorer.

What I see from MATLAB -

depth.png

I have also attached a depth.mat that contains the image in the variable "D".

 

The image is returned as a three-dimensional array of uint8. I assumed that the depth stream carries a larger number that is split byte-wise across the planes, so I tried bit-shifting each plane and adding it to the next, taking care with the data types. I then displayed the result with imagesc, but did not get a proper depth image.

 

How do I properly interpret this image? Or, is there an alternate way to capture images in MATLAB?

8 Replies
samontab
Valued Contributor II

Hi,

There are 3 different depth formats:

https://software.intel.com/sites/landingpage/realsense/camera-sdk/v1.1/documentation/html/index.html?pixelformat_pxcimage.html

a) PIXEL_FORMAT_DEPTH
b) PIXEL_FORMAT_DEPTH_RAW
c) PIXEL_FORMAT_DEPTH_F32

The first two are 16-bit unsigned integers, and the third is composed of 32-bit floating-point numbers.

a) and c) are measured in mm; b) is measured in micrometers and is device specific. You can get the depth unit of your camera with QueryDepthUnit:

https://software.intel.com/sites/landingpage/realsense/camera-sdk/v1.1/documentation/html/querydepthunit_device_pxccapture.html

In your case, I would guess it is using format b). You would have to combine those 3 channels into a single 16-bit uint channel.
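If the support package really does split each 16-bit depth sample across the 8-bit colour planes, the reconstruction would look something like this. A minimal sketch in Python (the channel order is an assumption; it is not documented which plane holds the high byte):

```python
# Sketch: rebuild 16-bit depth values from two 8-bit planes.
# ASSUMPTION: one plane holds the low byte and another the high byte;
# which plane is which would need to be determined experimentally.

def combine_planes(low_plane, high_plane):
    """Combine per-pixel low/high bytes into 16-bit depth values."""
    return [[(hi << 8) | lo for lo, hi in zip(row_lo, row_hi)]
            for row_lo, row_hi in zip(low_plane, high_plane)]

# Tiny 2x2 example frame, values in raw depth units.
low = [[0x34, 0xFF], [0x00, 0x01]]
high = [[0x12, 0x00], [0x02, 0x00]]
depth = combine_planes(low, high)
# depth[0][0] == 0x1234 (4660 raw units)
```

The key point is the shift by 8: the high byte must land cleanly above the low byte, with no overlapping bits.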

 

Bharath_L_
Beginner

Hi,

I have a 3D variable D

Now,

D(:,:,1)-D(:,:,2)

gives me a blank image. So I tried -

imagesc(bitshift(uint16(D(:,:,1)),7)+bitshift(uint16(D(:,:,3)),0))

and got

1.png

Then I tried -

imagesc(bitshift(uint16(D(:,:,3)),7)+bitshift(uint16(D(:,:,1)),0))

and got 

2.png

I flipped the shift amounts (0x01 -> 0x80) in permutation and did the same - the images were similar with minor texture differences, nothing like a depth map. Is there more to recreating the bit-shifted uint16 image? Or has some data from the RAW stream been lost on its way into MATLAB?

samontab
Valued Contributor II

It looks like you are reading a 16-bit image as an 8-bit image; that's why you see the "waves".

Also, remember that the monitor can only display 8-bit images, so after you read it as a 16-bit image you need to scale it to an 8-bit image to display it correctly.
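For illustration, here is a sketch in Python of scaling a 16-bit frame down to the displayable 0..255 range (dividing by the frame's own maximum is one common choice; imagesc in MATLAB effectively does this autoscaling for you):

```python
# Sketch: scale a 16-bit depth array into 0..255 for display.
# Scaling by the frame's own peak uses the full display range;
# dividing by 65535 instead would leave near depths almost black.

def scale_to_8bit(depth16):
    peak = max(max(row) for row in depth16)
    if peak == 0:  # completely empty frame: nothing to scale
        return [[0 for _ in row] for row in depth16]
    return [[(v * 255) // peak for v in row] for row in depth16]

frame = [[0, 1000], [2000, 4000]]
shown = scale_to_8bit(frame)
# shown == [[0, 63], [127, 255]]
```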

 

Bharath_L_
Beginner

Thank you for the tip... 

Here is what I did to fix it 

>> D1=uint16(D);

>> Da=(bitshift(D1(:,:,1),7)+bitshift(D1(:,:,3),0));
>> Db=double(Da)/65535;
>> imshow(Db)
>> imagesc(Db)
>> max(Db(:))

ans =

    0.4375

>> min(Db(:))

ans =

     0

>> 

>> Dc=(bitshift(D1(:,:,3),7)+bitshift(D1(:,:,1),0));
>> Dd=double(Dc)/65535;
>> imshow(Dd)
>> imagesc(Dd)
>> max(Dd(:))

ans =

    0.4981

>> min(Dd(:))

ans =

     0

Db and Dd are in the range 0:1, and the images still look like waves. I am beginning to think that I am not receiving all the data from the webcam hardware support package :-( Or have I done something wrong?
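One detail worth double-checking in the commands above: recombining two bytes needs a shift of 8, not 7. With a shift of 7 the high byte overlaps bit 7 of the low byte, so distinct depth values collapse together. A small Python illustration:

```python
# Sketch: shifting the high byte by 7 instead of 8 makes the two
# bytes overlap in bit 7, corrupting the reconstructed value.
lo, hi = 0xB4, 0x12

wrong = (hi << 7) + lo  # high byte lands on top of bit 7 of lo
right = (hi << 8) + lo  # clean, non-overlapping 16-bit value

# right == 0x12B4 (4788), wrong == 0x09B4 (2484)
```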

Thank you for all the tips so far... :-)

Matthias_H_Intel
Employee

@ "The image is returned as a 3 dimensional array of uint8"

Possibly that's the issue. How is the data transformed to uint8? Does it take only the least significant byte of the original uint16? Once it's uint8, a cast back to uint16 surely doesn't recover the lost resolution.

@ "D(:,:,1)-D(:,:,2) gives a blank image"

How did you test that it's "blank"?

 

@ "imagesc(bitshift(uint16(D(:,:,1)),7)+bitshift(uint16(D(:,:,3)),0))"

doesn't make sense to me, as those arrays are on completely different scales (at least in the original data; maybe your transform to the MATLAB array already brought them to the same scale?)

 

Bharath_L_
Beginner

matthias-hahn (Intel) wrote:

@ "The image is returned as a 3 dimensional array of uint8"

Possibly that's the issue. How is the data transformed to uint8? Does it take only the least significant byte of the original uint16? Once it's uint8, a cast back to uint16 surely doesn't recover the lost resolution.

I do not know; the MATLAB hardware support package for the webcam did it. I assumed that a single stream is split into 3 channels of 8 bits each - that is the basis of all my attempts.

matthias-hahn (Intel) wrote:

@ "D(:,:,1)-D(:,:,2) gives a blank image"

How did you test that it's "blank"?

R=D(:,:,1)-D(:,:,2)

max(abs(R(:)))

The result is 0. This makes me think that the 16 bits are, after all, expressed in two of the channels, with one of them repeated in the third.

matthias-hahn (Intel) wrote:

@ "imagesc(bitshift(uint16(D(:,:,1)),7)+bitshift(uint16(D(:,:,3)),0))"

doesn't make sense to me, as those arrays are on completely different scales (at least in the original data; maybe your transform to the MATLAB array already brought them to the same scale?)

What should I expect from a single frame from the webcam? I thought that I might have to read two consecutive frames to build the depth map, but that is a weird idea too. Everything I have done is based on guessing how MATLAB processed the depth map.

Thank you for the response :-)

Matthias_H_Intel
Employee

samontab wrote:

Hi,

There are 3 different depth formats:

https://software.intel.com/sites/landingpage/realsense/camera-sdk/v1.1/d...

a) PIXEL_FORMAT_DEPTH
b) PIXEL_FORMAT_DEPTH_RAW
c) PIXEL_FORMAT_DEPTH_F32

The first two are 16-bit unsigned integers, and the third is composed of 32-bit floating-point numbers.

a) and c) are measured in mm; b) is measured in micrometers and is device specific. You can get the depth unit of your camera with QueryDepthUnit:

https://software.intel.com/sites/landingpage/realsense/camera-sdk/v1.1/d...

In your case, I would guess it is using format b). You would have to combine those 3 channels into a single 16-bit uint channel.

@"you would have to transform those 3 channels into one single 16bit uint channel" - why? wouldn't 1 channel be sufficient?

 

samontab
Valued Contributor II

matthias-hahn (Intel) wrote:

why? wouldn't 1 channel be sufficient?

I am thinking that maybe the original 16 bits are being encoded into a 24-bit RGB image.

If that's the case, you could reconstruct the original image by picking the correct 16 bits out of each 24-bit pixel of the RGB image.
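If that is what is happening, the unpacking could be sketched like this in Python (which two channels carry the data is an assumption; here R is taken as the low byte and G as the high byte, with B unused):

```python
# Sketch: recover a 16-bit depth value from a 24-bit RGB pixel.
# ASSUMED layout: R = low byte, G = high byte, B = padding/repeat.
# The real layout would need to be confirmed experimentally.

def unpack_rgb_pixel(r, g, b):
    """Pick the two assumed data-carrying channels; ignore b."""
    return (g << 8) | r

value = unpack_rgb_pixel(0xBC, 0x0A, 0x00)
# value == 0x0ABC, regardless of what the blue channel holds
```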
