Recovering missing information when projecting 3D points or unprojecting image pixels
Date
2024
Authors
Chen, Wayne
Abstract
Computer vision aims to bridge the divide between 2D and 3D spaces. With the significant advancements in computational resources and deep learning techniques, neural networks have become the
cornerstone for solving computer vision tasks. As the training of a neural network is data-driven,
both input and ground truth data play pivotal roles in the network training process. However, 2D
data is usually dense but involves a projection operation that loses 3D information; 3D data is often
sparse due to sensor limitations.
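The information loss described above can be made concrete with the standard pinhole camera model: projecting a 3D point divides out its depth, so unprojecting a pixel back to 3D requires supplying the depth that projection discarded. A minimal sketch (the intrinsic values here are illustrative, not from the thesis):

```python
import numpy as np

# Hypothetical pinhole intrinsics: fx, fy = focal lengths; cx, cy = principal point.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(points_xyz):
    """Project camera-frame 3D points to pixels; the division by Z loses depth."""
    uv_h = (K @ points_xyz.T).T          # homogeneous pixel coordinates
    return uv_h[:, :2] / uv_h[:, 2:3]    # perspective divide -> (u, v)

def unproject(uv, depth):
    """Lift pixels back to 3D; only possible if the lost depth is supplied."""
    uv_h = np.hstack([uv, np.ones((uv.shape[0], 1))])
    return (np.linalg.inv(K) @ uv_h.T).T * depth[:, None]

pts = np.array([[0.1, -0.2, 2.0], [0.5, 0.3, 4.0]])
uv = project(pts)                         # 2D observations: dense but depth-free
recovered = unproject(uv, depth=pts[:, 2])
assert np.allclose(recovered, pts)        # exact only because true depth was given
```

Recovering that per-pixel depth (or, conversely, densifying sparse 3D measurements) is precisely the missing-information problem the three tasks below address.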
Addressing this challenge, our research focuses on the recovery of missing information when projecting 3D points or unprojecting image pixels, exploring this problem across three different tasks: novel view synthesis, uncertainty-aware monocular depth estimation, and latent space analyses for the DeepSDF model.
Novel view synthesis from sparse coloured point clouds aims to generate dense RGB images given a sparse XYZRGB input.
Uncertainty-aware Monocular Depth Estimation (MDE) targets the generation of dense depth estimation given a dense RGB input and sparse depth ground truth. We propose a novel network with an encoder-decoder structure and a novel loss function that enables joint training of depth and uncertainty estimation. This model competes closely with state-of-the-art solutions on depth estimation evaluation metrics and outperforms them on uncertainty estimation.
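The abstract does not specify the joint loss, so the sketch below is a hypothetical illustration of one common way to train depth and uncertainty together: a heteroscedastic Laplacian negative log-likelihood, where the network predicts a per-pixel log-scale alongside depth, and supervision applies only at the sparse ground-truth pixels. The thesis's actual loss may differ.

```python
import numpy as np

def joint_depth_uncertainty_loss(pred_depth, pred_log_b, gt_depth, valid_mask):
    """Laplacian NLL per pixel: |d - d_hat| / b + log b, with b = exp(pred_log_b).
    Averaged only over pixels where sparse ground-truth depth exists."""
    b = np.exp(pred_log_b)
    nll = np.abs(gt_depth - pred_depth) / b + pred_log_b
    return nll[valid_mask].mean()

# On a high-error pixel, predicting a larger scale b lowers the loss, so the
# network is rewarded for flagging its own unreliable depth estimates.
confident = joint_depth_uncertainty_loss(
    np.array([0.0]), np.array([0.0]), np.array([2.0]), np.array([True]))
uncertain = joint_depth_uncertainty_loss(
    np.array([0.0]), np.array([np.log(2.0)]), np.array([2.0]), np.array([True]))
assert uncertain < confident
```

The `log b` term penalises blanket over-estimation of uncertainty, so the predicted scale remains calibrated rather than collapsing to a trivially large value.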
The latent space analysis for the DeepSDF model explores the connections among latent representations of different 3D models. Our experiments reveal that these latent codes are not independent; latent codes generated from linear interpolation between each pair of latent codes represent the transformation from one model to another.
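The interpolation itself is a convex combination of two latent codes; decoding each intermediate code (e.g. with a DeepSDF-style decoder, not shown here) traces the morph from one shape to the other. A minimal sketch:

```python
import numpy as np

def interpolate_latents(z_a, z_b, steps=8):
    """Return `steps` codes evenly spaced on the segment from z_a to z_b:
    z(t) = (1 - t) * z_a + t * z_b for t in [0, 1]."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - t) * z_a + t * z_b for t in ts])

# Endpoints reproduce the original codes; the midpoint is their average.
z_a, z_b = np.zeros(4), np.ones(4)
codes = interpolate_latents(z_a, z_b, steps=5)
assert np.allclose(codes[0], z_a) and np.allclose(codes[-1], z_b)
assert np.allclose(codes[2], 0.5)
```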
Our findings confirm the existence and impact of sparsity within input data. However, our proposed
methods demonstrate not only how to overcome these challenges but also how to evaluate their
impact on the accuracy of the generated results. This work contributes significantly to enhancing the
accuracy and reliability of models tackling data sparsity in the field of computer vision.
Type
Thesis (MPhil)