
Hash kernels and structured learning

Shi, Qinfeng


dc.contributor.author: Shi, Qinfeng
dc.date.accessioned: 2018-11-22T00:04:00Z
dc.date.available: 2018-11-22T00:04:00Z
dc.date.copyright: 2011
dc.identifier.other: b2569971
dc.identifier.uri: http://hdl.handle.net/1885/149746
dc.description.abstract: With vast amounts of data being generated, how to process massive data sets remains a challenge for machine learning algorithms. We propose hash kernels to facilitate efficient kernel methods that can deal with massive multi-class problems. We show a principled way to compute the kernel matrix for data streams and sparse feature spaces, and further generalise it via sampling to graphs. We then exploit the connection between hash kernels and compressed sensing, and apply hashing to face recognition, which yields a significant speed-up over the state of the art with competitive accuracy. We also give a recovery rate for the sparse representation and a bound on the recognition rate. Since hash kernels can deal with structure in the input, such as graphs and face images, the second part of the thesis moves on to an even more challenging task: dealing with structure in the output. Recent advances in machine learning exploit dependencies among outputs, which makes dealing with complex, structured data possible. We study the most popular structured learning algorithms and group them into two categories: probabilistic approaches and max-margin approaches. We show the connections between different algorithms, reformulate them in the empirical risk minimisation framework, and compare their advantages and disadvantages, which helps in choosing a suitable algorithm for the characteristics of the application. This thesis makes both practical and theoretical contributions. We present the following real-world applications of structured learning: a) We propose a novel approach to automatic paragraph segmentation, namely training semi-Markov models discriminatively using a max-margin method. This allows us to model the sequential nature of the problem and to incorporate features of a whole paragraph, such as paragraph coherence, which cannot be used in previous models. b) We jointly segment and recognise actions in video sequences with a discriminative semi-Markov model framework, which incorporates features that capture the characteristics of boundary frames, action segments and neighbouring action segments. A Viterbi-like algorithm is devised to solve the induced optimisation problem efficiently. c) We propose a novel hybrid loss combining Conditional Random Fields (CRFs) and Support Vector Machines (SVMs), and apply it to applications such as text chunking, named entity recognition and joint image categorisation. We make the following theoretical contributions: a) We study recent advances in PAC-Bayes bounds and apply them to structured learning. b) We propose a more refined notion of Fisher consistency, namely Conditional Fisher Consistency for Classification (CFCC), which conditions on knowledge of the true distribution of class labels. c) We show that the hybrid loss has the advantages of both CRFs and SVMs: it is consistent and has a tight PAC-Bayes bound which shrinks as the margin increases. d) We introduce probabilistic margins, which take the label distribution into account, and show that many existing algorithms can be viewed as special cases of this margin concept, which may help in understanding existing algorithms as well as designing new ones. Finally, we discuss future directions such as tightening PAC-Bayes bounds, adaptive hybrid losses, and graphical model inference via compressed sensing.
dc.format.extent: xxiv, 137 leaves.
dc.language.iso: en_AU
dc.rights: Author retains copyright
dc.subject.lcc: Q325.75.S55 2011
dc.subject.lcsh: Supervised learning (Machine learning)
dc.subject.lcsh: Kernel functions
dc.title: Hash kernels and structured learning
dc.type: Thesis (PhD)
local.description.notes: Thesis (Ph.D.)--Australian National University
dc.date.issued: 2011
local.type.status: Accepted Version
local.contributor.affiliation: Australian National University. Research School of Information Sciences and Engineering
local.identifier.doi: 10.25911/5d626d728e2fa
dc.date.updated: 2018-11-20T02:29:21Z
dcterms.accessRights: Open Access
local.mintdoi: mint
Collections: Open Access Theses
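
The abstract above centres on hash kernels for large, sparse feature spaces. Purely as a rough illustration of the underlying feature-hashing idea (a minimal sketch under assumptions, not the construction or analysis in the thesis; the function names, bin count and signing scheme below are illustrative choices), the following Python snippet hashes bag-of-words features into a fixed-size vector and computes an approximate inner-product kernel in that hashed space.

import numpy as np

def hashed_features(tokens, num_bins=2**16, seed=0):
    # Map a bag of tokens into a fixed-size vector by hashing each token
    # to a bucket and to a +/- sign; the random signing keeps the hashed
    # inner product an unbiased estimate of the original one in expectation.
    # Note: Python's built-in hash is salted per process; a real system
    # would use a stable hash such as MurmurHash.
    x = np.zeros(num_bins)
    for tok in tokens:
        bucket = hash((seed, tok)) % num_bins
        sign = 1.0 if hash((seed, tok, "sign")) % 2 == 0 else -1.0
        x[bucket] += sign
    return x

def hash_kernel(tokens_a, tokens_b, num_bins=2**16):
    # Approximate bag-of-words inner-product kernel in the hashed space.
    return float(np.dot(hashed_features(tokens_a, num_bins),
                        hashed_features(tokens_b, num_bins)))

doc1 = "hash kernels scale to massive sparse feature spaces".split()
doc2 = "hash kernels handle sparse feature spaces and data streams".split()
print(hash_kernel(doc1, doc2))  # equals the shared-token count unless a rare collision occurs

The point of the sketch is that the feature dimension is fixed in advance regardless of how many distinct tokens arrive, which is what makes the approach attractive for data streams and very large sparse feature spaces; a production version would accumulate into a sparse vector rather than a dense array.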

Download

File: b25699714_Shi_Qinfeng.pdf
Size: 32.3 MB
Format: Adobe PDF
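
The abstract also mentions a hybrid loss that combines CRFs and SVMs and is argued to inherit the consistency of the former and the margin behaviour of the latter. As a hedged illustration only (the precise definition and weighting used in the thesis are in the PDF above; the joint feature map phi, the label loss Delta and the mixing weight alpha are assumptions of this sketch), one natural way to blend the two standard structured losses is a convex combination:

% CRF log loss, structured hinge loss, and a convex-combination hybrid
% (alpha in [0,1] is an assumed mixing weight, not taken from the thesis).
\begin{align}
  \ell_{\log}(x, y; w)           &= \log \sum_{y'} \exp\langle w, \phi(x, y')\rangle \;-\; \langle w, \phi(x, y)\rangle, \\
  \ell_{\mathrm{hinge}}(x, y; w) &= \max_{y'}\bigl[\Delta(y, y') + \langle w, \phi(x, y')\rangle\bigr] \;-\; \langle w, \phi(x, y)\rangle, \\
  \ell_{\mathrm{hybrid}}(x, y; w) &= \alpha\,\ell_{\log}(x, y; w) \;+\; (1 - \alpha)\,\ell_{\mathrm{hinge}}(x, y; w).
\end{align}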


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
