Recovering missing information when projecting 3D points or unprojecting image pixels

dc.contributor.author: Chen, Wayne
dc.date.accessioned: 2024-03-05T01:00:54Z
dc.date.available: 2024-03-05T01:00:54Z
dc.date.issued: 2024
dc.description.abstract: Computer vision aims to bridge the divide between 2D and 3D spaces. With significant advancements in computational resources and deep learning techniques, neural networks have become the cornerstone for solving computer vision tasks. As the training of a neural network is data-driven, both input and ground-truth data play pivotal roles in the training process. However, 2D data is usually dense but involves a projection operation that loses 3D information, while 3D data is often sparse due to sensor limitations.

Addressing this challenge, our research focuses on the recovery of missing information when projecting 3D points or unprojecting image pixels, exploring this problem across three tasks: novel view synthesis, uncertainty-aware monocular depth estimation, and latent space analyses for the deepSDF model.

Novel view synthesis from sparse coloured point clouds aims to generate dense RGB images from a sparse XYZRGB input.

Uncertainty-aware Monocular Depth Estimation (MDE) targets the generation of dense depth estimates given a dense RGB input and sparse depth ground truth. We propose a novel network with an encoder-decoder structure and a novel loss function that enables joint training of depth and uncertainty estimation. This model competes closely with state-of-the-art solutions on depth estimation evaluation metrics and outperforms them on uncertainty estimation.

The latent space analysis for the deepSDF model explores the connections among latent representations of different 3D models. Our experiments reveal that these latent codes are not independent; latent codes generated by linear interpolation between each pair of latent codes represent the transformation from one model to another.

Our findings confirm the existence and impact of sparsity within input data. However, our proposed methods demonstrate not only how to overcome these challenges but also how to evaluate their impact on the accuracy of the generated results. This work contributes to enhancing the accuracy and reliability of models tackling data sparsity in the field of computer vision.
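The core problem the abstract describes can be illustrated with a standard pinhole camera model (a minimal sketch, not taken from the thesis; the intrinsics fx, fy, cx, cy are arbitrary assumed values): projection divides out the depth, so a pixel alone cannot be unprojected back to 3D without recovering that missing information.

```python
# Illustrative pinhole-camera sketch: projection discards depth Z,
# so unprojection needs it supplied from elsewhere (e.g. an MDE network).
# fx, fy, cx, cy are assumed example intrinsics, not values from the thesis.

def project(point3d, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3D point (X, Y, Z) to a pixel (u, v).
    Z is divided out, so it cannot be recovered from (u, v) alone."""
    X, Y, Z = point3d
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return (u, v)

def unproject(pixel, depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Invert the projection: a pixel (u, v) plus a known depth gives the
    3D point back. Without the depth, the pixel only constrains a ray."""
    u, v = pixel
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return (X, Y, depth)

# Round trip: projecting, then unprojecting with the true depth,
# recovers the original point.
p = (0.2, -0.1, 2.0)
uv = project(p)
restored = unproject(uv, 2.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(restored, p))
```

The round trip only closes because the true depth is handed back to `unproject`; estimating that depth (densely, with calibrated uncertainty) is exactly the gap the thesis's MDE work targets.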
dc.identifier.uri: http://hdl.handle.net/1885/315712
dc.language.iso: en_AU
dc.title: Recovering missing information when projecting 3D points or unprojecting image pixels
dc.type: Thesis (MPhil)
local.contributor.authoremail: u5152653@anu.edu.au
local.contributor.supervisor: Zhang, Jing
local.contributor.supervisorcontact: u1031665@anu.edu.au
local.identifier.doi: 10.25911/4B7N-7X21
local.mintdoi: mint
local.thesisANUonly.author: dcd9c621-5c3d-462a-b3c5-bf249d9297db
local.thesisANUonly.key: 5b2a98c7-353b-7e06-ff1b-617d309e5528
local.thesisANUonly.title: 000000029187_TC_1

Downloads

Original bundle

Name: Chen_Thesis_Recovering missing information when projecting 3D points or unprojecting image pixels_2024.pdf
Size: 10.3 MB
Format: Adobe Portable Document Format
Description: Thesis Material