Structured learning for information retrieval

Petterson, James

doi:10.25911/5d5149b770f81

Description

Information retrieval is the area of study concerned with the process of searching, recovering and interpreting information from large amounts of data. In this thesis we show that many of the problems in information retrieval are instances of structured learning, where the goal is to learn predictors of complex output structures consisting of many interdependent variables. We then attack these problems using principled machine learning methods that are specifically suited for such scenarios. In the process of doing so, we develop new models, new model extensions and new algorithms that, when integrated with existing methodology, comprise a new set of tools for solving a variety of information retrieval problems. Firstly, we cover the multi-label classification problem, where we seek to predict a set of labels associated with a given object; the output in this case is structured, as the output variables are interdependent. Secondly, we focus on document ranking, where, given a query and a set of documents associated with it, we want to rank them according to their relevance with respect to the query; here, again, we have a structured output: a ranking of documents. Thirdly, we address topic models, where we are given a set of documents and attempt to find a compact representation of them by learning latent topics and associating a topic distribution with each document; the output is again structured, consisting of word and topic distributions. For all the above problems, we obtain state-of-the-art solutions, as attested by empirical performance on publicly available real-world datasets.

dc.contributor.author	Petterson, James
dc.date.accessioned	2018-11-22T00:11:38Z
dc.date.available	2018-11-22T00:11:38Z
dc.date.copyright	2011
dc.identifier.other	b3088049
dc.identifier.uri	http://hdl.handle.net/1885/151782
dc.format.extent	xx,117 leaves.
dc.language.iso	en_AU
dc.rights	Author retains copyright
dc.subject.lcc	Q325.5.P48 2011
dc.subject.lcsh	Machine learning
dc.subject.lcsh	Information storage and retrieval systems
dc.subject.lcsh	Data structures (Computer science)
dc.title	Structured learning for information retrieval
dc.type	Thesis (PhD)
local.description.notes	Thesis (Ph.D.)--Australian National University
dc.date.issued	2011
local.type.status	Accepted Version
local.identifier.doi	10.25911/5d5149b770f81
dc.date.updated	2018-11-21T13:51:06Z
dcterms.accessRights	Open Access
local.mintdoi	mint
Collections	Open Access Theses

Download

File	Description	Size	Format	Image
b3088049x_Petterson_J.pdf		76.75 MB	Adobe PDF
