Structured support vector machines learning and application in computer vision

Jia, Ke

Date

2012

Authors

Jia, Ke

Abstract

Image labeling has been a long-standing challenge in computer vision. In recent years, Markov/Conditional Random Fields (MRFs/CRFs) have gained popularity for "structured" learning: properly defined pairwise potential functions represent the spatial correlations among neighboring pixels. In this thesis, we propose an alternative discriminative approach to MRFs/CRFs that extends the max-margin principle to incorporate these spatial correlations. In particular, by explicitly enforcing the submodularity condition, graph cuts can be conveniently integrated as the inference engine to obtain the optimal label assignment efficiently. Our approach allows learning a model with thousands of parameters, which is further facilitated by parallel computation in the learning phase. In addition to node and edge feature functions for enforcing local label consistency, our algorithm is shown to be capable of readily incorporating higher-order scene context. Since image labeling is a prediction problem over a discrete label space, i.e. classification, the later part of the thesis moves on to a more general task, structured Support Vector Regression (SVR). Besides the unary features adopted in traditional SVR algorithms, the objective function in our framework considers both label information and pairwise features, which helps achieve better cross-smoothing over neighboring pixels. With the bundle method, we effectively reduce the number of constraints and alleviate the adverse effect of outliers, leading to an efficient and robust learning algorithm. Moreover, we derive the dual form of the structured SVR algorithm so that it handles non-linear cases via the kernel method. An alternative approach based on matching kernels is also introduced to simplify the kernelized algorithm.
Although classic Support Vector Machines (SVMs) are normally used for learning feature weights in label prediction tasks, their ability to mine hidden information from data is not limited to this area. The fourth part of the thesis explores specialized structured SVMs, including two frameworks: one for stereo based on image matching and one for image segmentation based on curve evolution. The basic approaches in these two fields share a background of unsupervised learning on structured data, which leaves some parameters to be tuned manually. The specialized SVM algorithms modify the discriminative function of max-margin learning to fit the particular application, and learn the best parameter values from the ground truth. Furthermore, because the approach can find proper parameters automatically, some of the feature representations have also been improved to introduce self-adaptive parameters for flexibility. Other strategies related to max-margin learning, such as slack rescaling, are also discussed in this part. We show some real-world applications of structured Support Vector Machines (SSVMs), covering classification, regression and specialized SSVMs. The approaches perform competitively compared to state-of-the-art image processing methods. Finally, we discuss some future directions in the field of structured max-margin learning, such as efficient MRF inference engines, joint feature spaces and other unsupervised approaches that could potentially be learnt.
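
As a small illustration of the submodularity condition mentioned in the abstract, the sketch below checks the standard condition for a binary pairwise energy table, theta(0,0) + theta(1,1) <= theta(0,1) + theta(1,0), under which graph cuts can find the exact minimum. The function name and the two example tables are hypothetical and are not taken from the thesis; how the thesis enforces the condition during learning is not shown here.

```python
from typing import Sequence


def is_submodular(theta: Sequence[Sequence[float]]) -> bool:
    """Check the standard submodularity condition for a 2x2 pairwise
    energy table over binary labels:
    theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]."""
    return theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]


# Hypothetical Potts-style smoothness term: no cost when neighboring
# labels agree, a positive cost when they differ.
potts = [[0.0, 1.0], [1.0, 0.0]]

# Hypothetical term that rewards disagreement; it violates the condition,
# so standard graph cuts are not guaranteed to minimise it exactly.
repulsive = [[1.0, 0.0], [0.0, 1.0]]

print(is_submodular(potts))
print(is_submodular(repulsive))
```

A learner that keeps every pairwise term inside this condition can keep using graph cuts for exact inference in each training iteration.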