3D box proposals from a single monocular image of an indoor scene

dc.contributor.author: Zhuo, Wei
dc.contributor.author: Salzmann, Mathieu
dc.contributor.author: He, Xuming
dc.contributor.author: Liu, Miaomiao
dc.coverage.spatial: New Orleans, United States
dc.date.accessioned: 2021-01-22T00:29:42Z
dc.date.created: February 2-7 2018
dc.date.issued: 2018
dc.date.updated: 2020-11-02T04:22:13Z
dc.description.abstract: Modern object detection methods typically rely on bounding box proposals as input. While initially popularized in the 2D case, this idea has received increasing attention for 3D bounding boxes. Nevertheless, existing 3D box proposal techniques all assume having access to depth as input, which is unfortunately not always available in practice. In this paper, we therefore introduce an approach to generating 3D box proposals from a single monocular RGB image. To this end, we develop an integrated, fully differentiable framework that inherently predicts a depth map, extracts a 3D volumetric scene representation and generates 3D object proposals. At the core of our approach lies a novel residual, differentiable truncated signed distance function module, which, accounting for the relatively low accuracy of the predicted depth map, extracts a 3D volumetric representation of the scene. Our experiments on the standard NYUv2 dataset demonstrate that our framework lets us generate high-quality 3D box proposals and that it outperforms the two-stage technique consisting of successively performing state-of-the-art depth prediction and depth-based 3D proposal generation. [en_AU]
dc.description.sponsorship: The first author is supported by the Chinese Scholarship Council and CSIRO-Data61. The authors would like to thank CSIRO, for providing the GPU cluster used for all experiments in this paper. This project was also partially supported by the Program of Shanghai Subject Chief Scientist (A type) (No.15XD1502900). [en_AU]
dc.format.mimetype: application/pdf [en_AU]
dc.identifier.isbn: 978-157735800-8 [en_AU]
dc.identifier.uri: http://hdl.handle.net/1885/219993
dc.language.iso: en_AU [en_AU]
dc.publisher: AAAI Press [en_AU]
dc.relation.ispartofseries: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 [en_AU]
dc.rights: © Copyright 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org) [en_AU]
dc.source: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 [en_AU]
dc.source.uri: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/16994/16364 [en_AU]
dc.title: 3D box proposals from a single monocular image of an indoor scene [en_AU]
dc.type: Conference paper [en_AU]
dcterms.accessRights: Open Access via publisher website [en_AU]
local.bibliographicCitation.lastpage: 7647 [en_AU]
local.bibliographicCitation.startpage: 7639 [en_AU]
local.contributor.affiliation: Zhuo, Wei, College of Engineering and Computer Science, ANU [en_AU]
local.contributor.affiliation: Salzmann, Mathieu, EPFL [en_AU]
local.contributor.affiliation: He, Xuming, ShanghaiTech University [en_AU]
local.contributor.affiliation: Liu, Miaomiao, College of Engineering and Computer Science, ANU [en_AU]
local.contributor.authoremail: u5358193@anu.edu.au [en_AU]
local.contributor.authoruid: Zhuo, Wei, u5358193 [en_AU]
local.contributor.authoruid: Liu, Miaomiao, u5266426 [en_AU]
local.description.embargo: 2099-12-31
local.description.notes: Imported from ARIES [en_AU]
local.description.refereed: Yes
local.identifier.absfor: 080104 - Computer Vision [en_AU]
local.identifier.absseo: 890205 - Information Processing Services (incl. Data Entry and Capture) [en_AU]
local.identifier.absseo: 970108 - Expanding Knowledge in the Information and Computing Sciences [en_AU]
local.identifier.ariespublication: u3102795xPUB1609 [en_AU]
local.identifier.scopusID: 2-s2.0-85060465078
local.identifier.uidSubmittedBy: u3102795 [en_AU]
local.publisher.url: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/16994/16364 [en_AU]
local.type.status: Published Version [en_AU]
