Optimization of Markov Random Fields in Computer Vision
Date
2017
Authors
Ajanthan, Thalaiyasingam
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
A large variety of computer vision tasks can be formulated using
Markov Random Fields (MRF). Except in certain special cases,
optimizing an MRF is intractable, due to a large number of
variables and complex dependencies between them. In this thesis,
we present new algorithms to perform inference in MRFs, that are
either more efficient (in terms of running time and/or memory
usage) or more effective (in terms of solution quality), than the
state-of-the-art methods.
First, we introduce a memory efficient max-flow algorithm for
multi-label submodular MRFs. In fact,
such MRFs have been shown to be optimally solvable using max-flow
based on an encoding of the labels proposed by Ishikawa, in which
each variable $X_i$ is represented by $\ell$ nodes (where $\ell$
is the number of labels) arranged in a column. However, this
method in general requires $2\,\ell^2$ edges for each pair of
neighbouring variables. This makes it inapplicable to realistic
problems with many variables and labels, due to excessive memory
requirement. By contrast, our max-flow algorithm stores $2\,\ell$
values per variable pair, requiring much less storage.
Consequently, our algorithm makes it possible to optimally solve
multi-label submodular problems involving large numbers of
variables and labels on a standard computer.
Next, we present a move-making style algorithm for multi-label
MRFs with robust non-convex priors. In particular, our algorithm
iteratively approximates the original MRF energy with an
appropriately weighted surrogate energy that is easier to
minimize. Furthermore, it guarantees that the original energy
decreases at each iteration. To this end, we consider the
scenario where the weighted surrogate energy is multi-label
submodular (i.e., it can be optimally minimized by max-flow), and
show that our algorithm then lets us handle of a large variety of
non-convex priors.
Finally, we consider the fully connected Conditional Random Field
(dense CRF) with Gaussian pairwise potentials that has proven
popular and effective for multi-class semantic segmentation.
While the energy of a dense CRF can be minimized accurately using
a Linear Programming (LP) relaxation, the state-of-the-art
algorithm is too slow to be useful in practice. To alleviate this
deficiency, we introduce an efficient LP minimization algorithm
for dense CRFs. To this end, we develop a proximal minimization
framework, where the dual of each proximal problem is optimized
via block-coordinate descent. We show that each block of
variables can be optimized in a time linear in the number of
pixels and labels. Consequently, our algorithm enables efficient
and effective optimization of dense CRFs with Gaussian pairwise
potentials.
We evaluated all our algorithms on standard energy minimization
datasets consisting of computer vision problems, such as stereo,
inpainting and semantic segmentation. The experiments at the end
of each chapter provide compelling evidence that all our
approaches are either more efficient or more effective than all
existing baselines.
Description
Keywords
Graphical Models, MRF, CRF, MEMF, Flow-encoding, IRGC, Graph-cut, Max-flow, Exit-flow, Multi-label Submodularity, Optimal Message-passing, LP-relaxation, Dense-CRF, PROX-LP
Citation
Collections
Source
Type
Thesis (PhD)
Book Title
Entity type
Access Statement
License Rights
Restricted until
Downloads
File
Description