Supersampling of Data from Structured-light Scanner with Deep Learning
Abstract: This paper focuses on increasing the resolution of depth maps obtained from 3D cameras that use structured-light technology. Two deep learning models, FDSR and DKN, are modified to work with high-resolution data, and data pre-processing techniques are implemented to stabilize training. The models are trained on our custom dataset of 1200 3D scans. The resulting high-resolution depth maps are evaluated both qualitatively and with quantitative metrics. Depth map upsampling offers two main benefits. First, it can reduce the processing time of a pipeline: a high-resolution depth map is downsampled, the processing steps are performed at the lower resolution, and the resulting depth map is upsampled. Second, it can increase the resolution of a point cloud captured at lower resolution by a cheaper device. The experiments demonstrate that the FDSR model has the faster processing time, making it a suitable choice for applications where speed is crucial. The DKN model, on the other hand, produces more precise results, making it more suitable for applications that prioritize accuracy.
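The downsample, process, upsample pipeline mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: nearest-neighbour upsampling stands in for a learned model such as FDSR or DKN, the low-resolution processing step is a placeholder, and the function names, the synthetic tilted-plane depth map, and the RMSE metric are all assumptions chosen for illustration.

```python
import numpy as np


def downsample(depth: np.ndarray, factor: int) -> np.ndarray:
    """Block-average a depth map; height and width must be divisible by factor."""
    h, w = depth.shape
    return depth.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def process_low_res(depth: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for the pipeline steps run at low resolution."""
    return depth.copy()


def upsample_nearest(depth: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling; an assumed stand-in for a learned model."""
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)


def rmse(a: np.ndarray, b: np.ndarray) -> float:
    """Root-mean-square error between two depth maps of equal shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic high-resolution depth map: a tilted plane (assumed test data).
ys, xs = np.mgrid[0:64, 0:64]
high_res = 500.0 + 0.5 * xs + 0.25 * ys

factor = 4
low_res = downsample(high_res, factor)            # 64x64 -> 16x16
restored = upsample_nearest(process_low_res(low_res), factor)
error = rmse(high_res, restored)                  # reconstruction error
print(f"low-res shape: {low_res.shape}, RMSE after round trip: {error:.3f}")
```

In a sketch like this, a learned upsampler would replace `upsample_nearest`, and the same error measure could be used to compare it against the simple baseline.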