A Multi-modal Graphical Model for Scene Analysis
Date
2015
Authors
Taghavi Namin, Sarah
Najafi, Mohammad
Salzmann, Mathieu
Petersson, Lars
Publisher
IEEE
Abstract
In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, forcing corresponding 2D and 3D regions to receive identical labels. This degrades performance in the presence of misalignments, 3D-2D projection errors, and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in one modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate the ability of our model to handle multiple correspondences between objects in the 3D and 2D domains, we introduce a new multi-modal dataset composed of panoramic images and LIDAR data, featuring a rich set of many-to-one correspondences.
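The abstract does not state the model's exact formulation; as a minimal sketch only, a pairwise energy over both modalities with soft cross-modal links could take the form (all symbols below are illustrative assumptions, not the paper's notation):

E(\mathbf{y}) = \sum_{i \in \mathcal{V}_{\mathrm{2D}} \cup \mathcal{V}_{\mathrm{3D}}} \phi_i(y_i) \;+\; \sum_{(i,j) \in \mathcal{E}_{\mathrm{intra}}} \psi_{ij}(y_i, y_j) \;+\; \sum_{(i,k) \in \mathcal{E}_{\mathrm{cross}}} \lambda_{ik}\, \psi_{ik}(y_i, y_k)

Here \phi_i would be unary potentials from modality-specific classifiers, \psi_{ij} smoothness terms within each modality, and the cross-modal terms \psi_{ik} with finite weights \lambda_{ik} would encourage, rather than force, agreement between corresponding 2D and 3D regions; taking \lambda_{ik} \to \infty would recover the hard-correspondence setting the abstract argues against.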
Source
Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015
Type
Conference paper
Restricted until
2037-12-31