How to get better quality RGB-D pairs?


Hello :D
I want to run some algorithms on RGB-D data captured by HoloLens 2, so I need "aligned depth", as in this question -> https://github.com/microsoft/HoloLens2ForCV/issues/50 . In short, I need RGB-depth pairs that have the same resolution and the same field of view or, as the question puts it, "one-to-one in pixels".
I ran this script, as the answer suggested:


https://github.com/microsoft/HoloLens2ForCV/blob/main/Samples/StreamRecorder/StreamRecorderConverter...

But the aligned RGB-D pair is kind of unsatisfactory. For example, here is an RGB image I obtained.

Kernel_Function_1-1613753976063.png


To match the field of view of the depth image (the depth camera has a narrower field of view than the RGB camera, right?), the RGB image ends up with too many black areas. Since the RGB images have already been reduced to the same resolution as the depth maps (760x428 -> 320x288), it's sad that so much of the frame is still wasted. And, maybe because of the projection calculation in the virtual pinhole camera model, there are strange black curves in the image. I can't run my algorithms on the processed RGB images, so I'm upset T T

Is there any way to get better quality RGB-D pairs? Thanks for any suggestions you may have!
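
For readers following along, the pinhole projection mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked HoloLens2ForCV script: it assumes ideal pinhole intrinsics without lens distortion, depth stored as z-distance along the optical axis, and a known rigid transform between the two cameras. The function name `align_depth_to_rgb` and all of its parameters are hypothetical. Real HoloLens 2 recordings may need the per-pixel ray data and extrinsics shipped with the recording instead of a plain intrinsic matrix.

```python
import numpy as np

def align_depth_to_rgb(depth, K_depth, K_rgb, T_rgb_from_depth, rgb_size):
    """Warp a z-depth map into the RGB camera's pixel grid (ideal pinhole model).

    depth            : (h, w) array, 0 marks invalid pixels
    K_depth, K_rgb   : 3x3 pinhole intrinsics (hypothetical, no lens distortion)
    T_rgb_from_depth : 4x4 rigid transform from the depth frame to the RGB frame
    rgb_size         : (width, height) of the RGB image
    """
    h, w = depth.shape
    W, H = rgb_size
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64).ravel()
    keep = z > 0
    u, v, z = u.ravel()[keep], v.ravel()[keep], z[keep]

    # Back-project valid depth pixels to 3D points in the depth camera frame.
    x = (u - K_depth[0, 2]) / K_depth[0, 0] * z
    y = (v - K_depth[1, 2]) / K_depth[1, 1] * z
    pts = np.stack([x, y, z, np.ones_like(z)])

    # Move the points into the RGB camera frame; drop those behind the camera.
    pts = (T_rgb_from_depth @ pts)[:3]
    pts = pts[:, pts[2] > 1e-6]

    # Project with the RGB intrinsics and round to the nearest pixel.
    u_rgb = np.round(K_rgb[0, 0] * pts[0] / pts[2] + K_rgb[0, 2]).astype(int)
    v_rgb = np.round(K_rgb[1, 1] * pts[1] / pts[2] + K_rgb[1, 2]).astype(int)
    inside = (u_rgb >= 0) & (u_rgb < W) & (v_rgb >= 0) & (v_rgb < H)

    # Z-buffer: when several points hit one pixel, keep the nearest.
    aligned = np.full((H, W), np.inf)
    np.minimum.at(aligned, (v_rgb[inside], u_rgb[inside]), pts[2][inside])
    aligned[np.isinf(aligned)] = 0.0  # pixels nothing projected onto stay empty
    return aligned
```

Warping the depth map into the RGB frame, rather than the RGB image into the depth frame, keeps the color image untouched. Any RGB pixel that no depth point lands on stays at 0, so gaps show up in the depth channel instead of as black regions in the color image.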

2 Replies

@Kernel_Function Hey there! Looks like you're doing some interesting stuff!

Alright, so I haven't worked with Research Mode stuff yet, so I have limited experience here. But I have worked a bit with the Kinect for Azure device and the native SDK there! There are a few things that jump out to me just reading your post.

 

The big one that caught my eye was the color/depth data overlap. On a Kinect for Azure, you'd see a lot more overlap, since the color and depth cameras are close together, facing the same direction, and have similar FOV. However, the HoloLens has a different alignment for color and depth cameras! You can see it a bit in this picture here: the depth camera is actually angled downwards. There are some HL1 images that show the same thing from the side view, so this alignment is similar in both devices.

hololens2-front-view.png


This makes sense on HoloLens, since the color camera relates to what the user is seeing (straight ahead), but the depth camera needs to work with your hands and the environment (mostly downwards). So that's why you're not seeing a whole lot of overlap there: the sensors just aren't aligned well for this use case!
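
To get a rough feel for how much a downward tilt costs, here is a small sketch that estimates the share of depth pixels whose viewing rays still fall inside the color image. Everything in it is an assumption for illustration: the function name `overlap_fraction`, the ideal pinhole intrinsics, and the simplification that both cameras share one optical center. None of the numbers you feed it here are actual HoloLens calibration values.

```python
import numpy as np

def overlap_fraction(pitch_deg, K_depth, depth_size, K_rgb, rgb_size):
    """Fraction of depth pixels whose viewing rays land inside the RGB image
    when the depth camera is pitched down by pitch_deg relative to the RGB camera.
    Both cameras are assumed to share one optical center (translation ignored)."""
    w, h = depth_size
    W, H = rgb_size
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # One viewing ray per depth pixel, on the plane z = 1 of the depth frame.
    x = (u.ravel() - K_depth[0, 2]) / K_depth[0, 0]
    y = (v.ravel() - K_depth[1, 2]) / K_depth[1, 1]
    rays = np.stack([x, y, np.ones_like(x)])

    # Rotation about the x axis. With image y pointing down, this sends the depth
    # camera's forward axis (0, 0, 1) to (0, sin a, cos a): forward and downward.
    a = np.deg2rad(pitch_deg)
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(a), np.sin(a)],
                  [0.0, -np.sin(a), np.cos(a)]])
    rays = R @ rays
    rays = rays[:, rays[2] > 1e-6]  # drop rays pointing behind the RGB camera

    # Project into the RGB image and count the rays that stay in frame.
    u_rgb = np.round(K_rgb[0, 0] * rays[0] / rays[2] + K_rgb[0, 2])
    v_rgb = np.round(K_rgb[1, 1] * rays[1] / rays[2] + K_rgb[1, 2])
    inside = (u_rgb >= 0) & (u_rgb < W) & (v_rgb >= 0) & (v_rgb < H)
    return inside.sum() / (w * h)
```

Calling it with the same made-up intrinsics for both cameras and a few pitch angles shows the overlap shrinking as the tilt grows, which is the effect described above. With real calibration data, the translation between the sensors and the lens distortion would change the numbers.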

I also don't remember seeing those rings before on Kinect for Azure! Shadowing yeah, but not rings. The Kinect SDK did a pretty good job overlaying color and depth, so I just don't know if this is because of the way the code you're using works, or if it's because the camera angle is different and therefore tough to compensate for!

Anyhow, that kinda exhausts my knowledge on the subject. Computer Vision gets complicated, and I've only dabbled. If you could go into higher level details about what you're trying to achieve, it's possible I might know some tools to help there!

Hi @koujaku ! Extremely grateful for you answering my question!

Yeah, I have also noticed that the layout of the RGB and depth cameras on HoloLens causes the problem. Your answer makes me realize why so many algorithms use RGB-D images from Kinect and none of them use data from HoloLens, even though the depth sensors in Azure Kinect and HoloLens are the same. The information you have provided is very important, and maybe I need to switch devices to obtain the data, as the current problem seems kinda difficult to solve. Thanks a lot!