
Short text authorship attribution via sequence kernels, Markov chains and author unmasking: An investigation

Sanderson, Conrad; Guenter, Simon

Description

We present an investigation of recently proposed character and word sequence kernels for the task of authorship attribution based on relatively short texts. Performance is compared with two corresponding probabilistic approaches based on Markov chains. Several configurations of the sequence kernels are studied on a relatively large dataset (50 authors), where each author covered several topics. Utilising Moffat smoothing, the two probabilistic approaches obtain similar performance, which in...

Collections: ANU Research Publications
Date published: 2006
Type: Conference paper
URI: http://hdl.handle.net/1885/27795
Source: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Download

File: 01_Sanderson_Short_text_authorship_2006.pdf
Size: 551.98 kB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
