Nonparametric Bayesian Topic Modelling with Auxiliary Data

dc.contributor.author: Lim, Kar Wai
dc.date.accessioned: 2016-08-09T02:14:49Z
dc.date.available: 2016-08-09T02:14:49Z
dc.date.issued: 2016
dc.description.abstract [en_AU]: The intent of this dissertation in computer science is to study topic models for text analytics. The first objective is to incorporate auxiliary information present in text corpora to improve topic modelling for natural language processing (NLP) applications. The second objective is to extend existing topic models to employ state-of-the-art nonparametric Bayesian techniques for better modelling of text data. In particular, this dissertation focusses on:
- incorporating hashtags, mentions, emoticons, and target-opinion dependency present in tweets, together with an external sentiment lexicon, to perform opinion mining or sentiment analysis on products and services;
- leveraging abstracts, titles, authors, keywords, categorical labels, and the citation network to perform bibliographic analysis on research publications, using a supervised or semi-supervised topic model; and
- employing the hierarchical Pitman-Yor process (HPYP) and the Gaussian process (GP) to jointly model text, hashtags, authors, and the follower network in tweets for corpora exploration and summarisation.
In addition, we provide a framework for implementing arbitrary HPYP topic models to ease the development of our proposed topic models, made possible by modularising the Pitman-Yor processes. Through extensive experiments and qualitative assessment, we find that topic models fit the data better as we utilise more auxiliary information and employ Bayesian nonparametric methods.
dc.identifier.other: b39905962
dc.identifier.uri: http://hdl.handle.net/1885/107151
dc.language.iso [en_AU]: en
dc.subject [en_AU]: Bayesian nonparametric
dc.subject [en_AU]: topic modelling
dc.subject [en_AU]: hierarchical Pitman-Yor process
dc.title [en_AU]: Nonparametric Bayesian Topic Modelling with Auxiliary Data
dc.type [en_AU]: Thesis (PhD)
dcterms.valid [en_AU]: 2016
local.contributor.affiliation [en_AU]: Research School of Computer Science, College of Engineering and Computer Science, The Australian National University
local.contributor.authoremail [en_AU]: karwai.lim@anu.edu.au
local.contributor.supervisor: Buntine, Wray
local.contributor.supervisorcontact [en_AU]: wray.buntine@monash.edu
local.identifier.doi: 10.25911/5d778a7858d02
local.mintdoi: mint
local.type.degree [en_AU]: Doctor of Philosophy (PhD)

Downloads

Original bundle

Name: Lim Thesis 2016.pdf
Size: 2.45 MB
Format: Adobe Portable Document Format
