LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss

dc.contributor.author | Kloepfer, Dominik A. | en
dc.contributor.author | Henriques, João | en
dc.contributor.author | Campbell, Dylan | en
dc.date.accessioned | 2026-03-01T17:41:12Z
dc.date.available | 2026-03-01T17:41:12Z
dc.date.issued | 2024 | en
dc.description.abstract | Image feature extractors are rendered substantially more useful if different views of the same 3D location yield similar features while still being distinct from other locations. A feature extractor that achieves this goal even under significant viewpoint changes must recognise not just semantic categories in a scene, but also understand how different objects relate to each other in three dimensions. Existing work addresses this task by posing it as a patch retrieval problem, training the extracted features to facilitate retrieval of all image patches that project from the same 3D location. However, this approach uses a loss formulation that requires substantial memory and computation resources, limiting its applicability for large-scale training. We present a method for memory-efficient learning of location-consistent features that reformulates and approximates the smooth average precision objective. This novel loss function enables improvements in memory efficiency by three orders of magnitude, mitigating a key bottleneck of previous methods and allowing much larger models to be trained with the same computational resources. We showcase the improved location consistency of our trained feature extractor directly on a multi-view consistency task, as well as the downstream task of scene-stable panoptic segmentation, significantly outperforming previous state-of-the-art. | en
dc.description.sponsorship | The authors acknowledge the generous support of the Royal Academy of Engineering (RF\201819\18\163), and EPSRC (VisualAI, EP/T028572/1). | en
dc.description.status | Peer-reviewed | en
dc.format.extent | 13 | en
dc.identifier.issn | 1049-5258 | en
dc.identifier.other | ORCID:/0000-0002-4717-6850/work/206893850 | en
dc.identifier.scopus | 105000522572 | en
dc.identifier.uri | https://hdl.handle.net/1885/733806845
dc.language.iso | en | en
dc.relation.ispartofseries | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 | en
dc.rights | Publisher Copyright: © 2024 Neural information processing systems foundation. All rights reserved. | en
dc.source | Advances in Neural Information Processing Systems | en
dc.title | LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss | en
dc.type | Conference paper | en
dspace.entity.type | Publication | en
local.contributor.affiliation | Kloepfer, Dominik A.; University of Oxford | en
local.contributor.affiliation | Henriques, João; University of Oxford | en
local.contributor.affiliation | Campbell, Dylan; School of Computing, ANU College of Systems and Society, The Australian National University | en
local.identifier.citationvolume | 37 | en
local.identifier.pure | e9280ee6-e091-43aa-8681-de714f549571 | en
local.identifier.url | https://www.scopus.com/pages/publications/105000522572 | en
local.type.status | Published | en
