A comparative study of likelihood ratio based forensic text comparison procedures: Multivariate kernel density with lexical features vs. word N-grams vs. character N-grams

Date

2015

Authors

Ishihara, Shunichi

Publisher

IEEE

Abstract

This comparative study empirically investigates the performance of three procedures for calculating likelihood ratios (LRs) for authorship attribution: 1) a procedure based on multivariate kernel density (MVKD) with lexical features; 2) a procedure based on word N-grams; and 3) a procedure based on character N-grams. The best-performing LRs of the three procedures are then fused into single combined LRs using logistic-regression fusion, in order to investigate how much the fusion improves or degrades performance. The test data are chatlog messages that were presented as evidence to prosecute paedophiles. Each message group is modelled with either 500 or 1000 word tokens, in order to examine the effect of sample size on system performance. Performance is assessed in terms of validity (= accuracy) and reliability (= precision), using the log-likelihood-ratio cost (Cllr) and 95% credible intervals (CI), respectively. While describing how the outcomes of the three procedures differ in character, this study demonstrates that the MVKD procedure performed best of the three in terms of Cllr. It also demonstrates that logistic-regression fusion is useful for combining the LRs obtained from the three procedures, resulting in a good improvement in performance.
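
For readers unfamiliar with the validity metric named in the abstract, the following is a minimal sketch of the log-likelihood-ratio cost in its commonly used form, Cllr = 1/2 [ mean over same-author LRs of log2(1 + 1/LR) + mean over different-author LRs of log2(1 + LR) ]. The abstract does not give the formula or any implementation, so the function name `cllr`, its parameter names, and the sample values are illustrative assumptions and not the paper's code or data.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost (hypothetical helper, not the paper's code).

    same_author_lrs: LRs from comparisons where both texts share an author.
    different_author_lrs: LRs from comparisons of texts by different authors.
    Lower is better; an uninformative system (every LR equal to 1) scores 1.
    """
    # Same-author comparisons are penalised when the LR is small.
    same_penalty = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    same_penalty /= len(same_author_lrs)
    # Different-author comparisons are penalised when the LR is large.
    diff_penalty = sum(math.log2(1 + lr) for lr in different_author_lrs)
    diff_penalty /= len(different_author_lrs)
    return 0.5 * (same_penalty + diff_penalty)


# Made-up LR values, for illustration only.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])
well_behaved = cllr([100.0, 50.0], [0.01, 0.02])
misleading = cllr([0.01], [100.0])
```

With these made-up inputs, a system whose LRs all equal 1 scores exactly 1, LRs that point strongly in the correct direction score close to 0, and LRs that point strongly in the wrong direction score above 1.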

Source

Proceedings - 5th Cybercrime and Trustworthy Computing Conference, CTC 2014

Type

Conference paper

Restricted until

2037-12-31