Sanderson, Conrad; Guenter, Simon
We present an investigation of recently proposed character and word sequence kernels for the task of authorship attribution based on relatively short texts. Performance is compared with two corresponding probabilistic approaches based on Markov chains. Several configurations of the sequence kernels are studied on a relatively large dataset (50 authors), where each author covered several topics. Utilising Moffat smoothing, the two probabilistic approaches obtain similar performance, which in...[Show more]
Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.