
On the Role of Context at Different Scales in Scene Parsing

Najafi, Mohammad

Description

Scene parsing can be formulated as a labeling problem where each visual data element, e.g., each pixel of an image or each 3D point in a point cloud, is assigned a semantic class label. One can approach this problem by training a classifier and predicting a class label for the data elements purely based on their local properties. This approach, however, does not take into account any kind of contextual information between different elements in the image or...

dc.contributor.author: Najafi, Mohammad
dc.date.accessioned: 2017-05-01T23:49:19Z
dc.date.available: 2017-05-01T23:49:19Z
dc.identifier.other: b43751672
dc.identifier.uri: http://hdl.handle.net/1885/116302
dc.description.abstract: Scene parsing can be formulated as a labeling problem where each visual data element, e.g., each pixel of an image or each 3D point in a point cloud, is assigned a semantic class label. One can approach this problem by training a classifier and predicting a class label for the data elements purely based on their local properties. This approach, however, does not take into account any kind of contextual information between different elements in the image or point cloud. For example, in an application where we are interested in labeling roadside objects, the fact that most utility poles are connected to power wires can be very helpful in distinguishing them from other similar-looking classes. The recurrence of certain class combinations can also be considered a good contextual hint, since such combinations are very likely to co-occur again. These forms of high-level contextual information are often formulated using pairwise and higher-order Conditional Random Fields (CRFs). A CRF is a probabilistic graphical model that encodes the contextual relationships between the data elements in a scene. In this thesis, we study the potential of contextual information at different scales (ranges) in scene parsing problems. First, we propose a model that utilizes the local context of the scene via a pairwise CRF. Our model acquires contextual interactions between different classes by assessing their misclassification rates using only the local properties of the data. In other words, no extra training is required to obtain the class interaction information. Next, we expand the context field of view from a local range to a longer range, and make use of higher-order models to encode more complex contextual cues. More specifically, we introduce a new model that employs geometric higher-order terms in a CRF for semantic labeling of 3D point cloud data.
Despite the potential of the above models for capturing the contextual cues in the scene, there are higher-level context cues that cannot be encoded via pairwise and higher-order CRFs. For instance, a vehicle is very unlikely to appear in a sea scene, whereas buildings are frequently observed in a street scene. Such information can be described as scene context and is modeled using global image descriptors. In particular, through an image retrieval procedure, we find images whose content is similar to that of the query image, and use them for scene parsing. Another problem with the above methods is that they rely on a computationally expensive training process for classification using the local properties of data elements, which needs to be repeated every time the training data is modified. We address this issue by proposing a fast and efficient approach that exempts us from the cumbersome training task by transferring the ground-truth information directly from the training data to the test data.
dc.language.iso: en
dc.subject: Scene Parsing
dc.subject: Semantic Labelling
dc.subject: Semantic Labeling
dc.subject: Context
dc.subject: Graphical Models
dc.subject: Conditional Random Field
dc.subject: panoramic images
dc.subject: Multi-spectral images
dc.subject: 3D Point Cloud Data
dc.subject: Non-Parametric Scene Parsing
dc.title: On the Role of Context at Different Scales in Scene Parsing
dc.type: Thesis (PhD)
local.contributor.supervisor: Petersson, Lars
local.contributor.supervisorcontact: Lars.Petersson@data61.csiro.au
dcterms.valid: 2017
local.description.notes: The author deposited 2/05/17
local.type.degree: Doctor of Philosophy (PhD)
dc.date.issued: 2017
local.contributor.affiliation: College of Engineering and Computer Science, The Australian National University
local.identifier.doi: 10.25911/5d74e220c5104
local.mintdoi: mint
Collections: Open Access Theses

Download

File: Najafi Thesis 2017.pdf
Size: 14.92 MB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.

Updated: 19 May 2020 / Responsible Officer: University Librarian / Page Contact: Library Systems & Web Coordinator