A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model

Ishihara, Shunichi

Description

We compared the performance of two procedures for calculating the likelihood ratio (LR) on the same set of text data. The first procedure was a multivariate kernel density (MVKD) procedure, which has been successfully applied to various types of forensic evidence, including glass fragments, handwriting, fingerprints, voice, and texts. The second procedure was a Gaussian mixture model – universal background model (GMM-UBM), which has been commonly used in forensic voice comparison (FVC) with...

dc.contributor.author: Ishihara, Shunichi
dc.contributor.editor: Sarvnaz Karimi
dc.contributor.editor: Karin Verspoor
dc.coverage.spatial: Brisbane Australia
dc.date.accessioned: 2015-12-08T22:19:29Z
dc.date.created: December 4-6 2013
dc.identifier.isbn: 18347037
dc.identifier.uri: http://hdl.handle.net/1885/31581
dc.description.abstract: We compared the performance of two procedures for calculating the likelihood ratio (LR) on the same set of text data. The first procedure was a multivariate kernel density (MVKD) procedure, which has been successfully applied to various types of forensic evidence, including glass fragments, handwriting, fingerprints, voice, and texts. The second procedure was a Gaussian mixture model – universal background model (GMM-UBM), which has been commonly used in forensic voice comparison (FVC) with so-called automatic features. Previous studies have applied the MVKD system to electronically generated texts to estimate LRs, but no previous study seems to have applied the GMM-UBM system to such texts. It has been reported that the GMM-UBM system outperforms the MVKD system in FVC. The data used for this study were chatlog messages collected from 115 authors, which were divided into test, background and development databases. Three different sample sizes of 500, 1500 and 2500 words were used to investigate how sensitive the performance is to sample size. Results show that, regardless of sample size, the GMM-UBM system performed better than the MVKD system with respect to both validity (= accuracy), measured by the log-likelihood-ratio cost (Cllr), and reliability (= precision), measured by the 95% credible interval (CI).
dc.publisher: Queensland University of Technology
dc.relation.ispartofseries: Australasian Language Technology Association Workshop 2013
dc.rights: Author/s retain copyright
dc.source: Proceedings of the Workshop
dc.source.uri: http://www.alta.asn.au/events/alta2013
dc.title: A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model
dc.type: Conference paper
local.description.notes: Imported from ARIES
local.description.refereed: Yes
dc.date.issued: 2013
local.identifier.absfor: 209999 - Language, Communication and Culture not elsewhere classified
local.identifier.ariespublication: u4455832xPUB84
local.type.status: Published Version
local.contributor.affiliation: Ishihara, Shunichi, College of Asia and the Pacific, ANU
local.bibliographicCitation.startpage: 71
local.bibliographicCitation.lastpage: 79
dc.date.updated: 2020-12-20T07:32:10Z
dcterms.accessRights: Open Access
Collections: ANU Research Publications
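
The abstract names the log-likelihood-ratio cost (Cllr) as its validity metric but does not state the formula. As a rough illustration only, the sketch below assumes the standard definition of Cllr used in the forensic comparison literature; the paper's exact implementation is not given in this record, and the function name `cllr` and its argument names are hypothetical.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost for a set of validation LRs.

    Assumes the standard Cllr definition (not taken from this record).
    same_author_lrs: LRs from comparisons of samples by the same author.
    different_author_lrs: LRs from comparisons of samples by different authors.
    """
    if not same_author_lrs or not different_author_lrs:
        raise ValueError("both LR lists must be non-empty")
    # Same-author comparisons are penalised when the LR is small.
    ss_cost = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    ss_cost /= len(same_author_lrs)
    # Different-author comparisons are penalised when the LR is large.
    ds_cost = sum(math.log2(1 + lr) for lr in different_author_lrs)
    ds_cost /= len(different_author_lrs)
    return 0.5 * (ss_cost + ds_cost)
```

Under this definition, a system that always returns LR = 1 (no information) scores exactly 1, well-calibrated and discriminating systems score below 1, and lower values indicate better validity.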

Download

File: 01_Ishihara_A_Comparative_Study_of_Two_2013.pdf
Size: 1.28 MB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.