
Structured learning for information retrieval

Petterson, James

Description

Information retrieval is the area of study concerned with searching, recovering and interpreting information from large amounts of data. In this thesis we show that many problems in information retrieval are instances of structured learning, where the goal is to learn predictors of complex output structures consisting of many interdependent variables. We then attack these problems using principled machine learning methods that are specifically suited to such scenarios.

dc.contributor.author: Petterson, James
dc.date.accessioned: 2018-11-22T00:11:38Z
dc.date.available: 2018-11-22T00:11:38Z
dc.date.copyright: 2011
dc.identifier.other: b3088049
dc.identifier.uri: http://hdl.handle.net/1885/151782
dc.description.abstract: Information retrieval is the area of study concerned with searching, recovering and interpreting information from large amounts of data. In this thesis we show that many problems in information retrieval are instances of structured learning, where the goal is to learn predictors of complex output structures consisting of many interdependent variables. We then attack these problems using principled machine learning methods that are specifically suited to such scenarios. In the process, we develop new models, new model extensions and new algorithms that, when integrated with existing methodology, comprise a new set of tools for solving a variety of information retrieval problems. Firstly, we cover the multi-label classification problem, where we seek to predict a set of labels associated with a given object; the output in this case is structured, as the output variables are interdependent. Secondly, we focus on document ranking, where, given a query and a set of documents associated with it, we want to rank the documents according to their relevance to the query; here, again, we have a structured output: a ranking of documents. Thirdly, we address topic models, where we are given a set of documents and attempt to find a compact representation of them by learning latent topics and associating a topic distribution with each document; the output is again structured, consisting of word and topic distributions. For all of the above problems, we obtain state-of-the-art solutions, as attested by empirical performance on publicly available real-world datasets.
dc.format.extent: xx, 117 leaves.
dc.language.iso: en_AU
dc.rights: Author retains copyright
dc.subject.lcc: Q325.5.P48 2011
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Information storage and retrieval systems
dc.subject.lcsh: Data structures (Computer science)
dc.title: Structured learning for information retrieval
dc.type: Thesis (PhD)
local.description.notes: Thesis (Ph.D.)--Australian National University
dc.date.issued: 2011
local.type.status: Accepted Version
local.identifier.doi: 10.25911/5d5149b770f81
dc.date.updated: 2018-11-21T13:51:06Z
dcterms.accessRights: Open Access
local.mintdoi: mint
Collections: Open Access Theses

Download

File: b3088049x_Petterson_J.pdf
Size: 76.75 MB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
