
Stereo matching using higher-order graph cuts




Xie, Yiran




Stereo matching is one of the fundamental tasks in early vision. Whereas the human brain recognizes objects and estimates depth with ease, it is difficult to design algorithms that perform well on a computer because of illumination variation, occlusion, and textureless regions. Like most early vision problems, stereo matching can be formulated as an energy minimization problem in which the optimal depth is the one with the lowest energy. Graph cuts are an efficient and effective minimization tool that avoids the problem of local minima. Conventional energy functions are defined on Markov Random Fields (MRFs) with a 4-connected grid structure derived from the image, but such a structure cannot express complex relationships among groups of pixels. This thesis explores several aspects of the stereo matching problem through higher-order structures and higher-order graph cuts. The first problem I address is the evaluation of five state-of-the-art segmentation approaches. Their different contributions to segment-based stereo matching are quantitatively measured and analyzed. This work aims to help researchers choose the segmentation approach most suitable for their stereo matching application. The second part of the thesis proposes a novel approach to dense stereo matching. The method features sub-segmentation and adopts a higher-order potential that enforces label consistency inside segments as a soft constraint. It also combines several established techniques. Experiments show that this approach obtains state-of-the-art results while remaining efficient. In the last part of the thesis, a novel two-layer MRF framework is presented in which stereo matching and surface boundary estimation are combined. Both properties are inferred simultaneously and globally so that each can benefit the other. This work has direct application in phosphene-vision-based human indoor navigation. Experiments show that the proposed framework achieves significantly better performance than other popular methods at all resolutions.
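As a rough illustration of the energy-minimization view described in the abstract, the following Python sketch evaluates the energy of a candidate disparity map. It is not the thesis's actual formulation: the function name `stereo_energy`, the Potts pairwise penalty, the truncated-linear form of the segment term, and the parameters `lam`, `gamma`, and `trunc` are all assumptions chosen for illustration. It only scores a given labeling; the graph-cut minimization itself is not shown.

```python
import numpy as np


def stereo_energy(disparity, data_cost, segments, lam=1.0, gamma=2.0, trunc=0.3):
    """Score a disparity labeling (illustrative only; all weights are assumed).

    disparity: (H, W) integer array of disparity labels.
    data_cost: (H, W, L) array, matching cost of each label at each pixel.
    segments:  (H, W) integer array of segment ids from any segmentation.
    """
    h, w = disparity.shape
    rows, cols = np.indices((h, w))

    # Unary (data) term: cost of the chosen disparity at every pixel.
    unary = data_cost[rows, cols, disparity].sum()

    # Pairwise term on the 4-connected grid: a Potts penalty of 1 for each
    # pair of horizontal or vertical neighbours with different labels.
    pairwise = (disparity[:, 1:] != disparity[:, :-1]).sum()
    pairwise += (disparity[1:, :] != disparity[:-1, :]).sum()

    # Higher-order term: a soft label-consistency penalty per segment.
    # The cost grows linearly with the number of pixels that disagree with
    # the segment's dominant label and is truncated at gamma.
    higher = 0.0
    for seg_id in np.unique(segments):
        labels = disparity[segments == seg_id]
        disagree = labels.size - np.bincount(labels).max()
        higher += gamma * min(1.0, disagree / (trunc * labels.size))

    return float(unary + lam * pairwise + higher)
```

Because the segment term is truncated rather than infinite, a segment may contain a few deviating pixels at a bounded cost, which is what makes the constraint soft rather than hard.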






Thesis (MPhil)


Access Statement

Open Access
