A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model
dc.contributor.author | Ishihara, Shunichi | |
---|---|---|
dc.contributor.editor | Sarvnaz Karimi | |
dc.contributor.editor | Karin Verspoor | |
dc.coverage.spatial | Brisbane Australia | |
dc.date.accessioned | 2015-12-08T22:19:29Z | |
dc.date.created | December 4-6 2013 | |
dc.identifier.isbn | 18347037 | |
dc.identifier.uri | http://hdl.handle.net/1885/31581 | |
dc.description.abstract | We compared the performance of two procedures for calculating the likelihood ratio (LR) on the same set of text data. The first procedure was a multivariate kernel density (MVKD) procedure, which has been successfully applied to various types of forensic evidence, including glass fragments, handwriting, fingerprints, voice, and texts. The second procedure was a Gaussian mixture model-universal background model (GMM-UBM), which has been commonly used in forensic voice comparison (FVC) with so-called automatic features. Previous studies have applied the MVKD system to electronically generated texts to estimate LRs, but so far no previous studies seem to have applied the GMM-UBM system to such texts. It has been reported that the GMM-UBM system outperforms the MVKD system in FVC. The data used for this study were chatlog messages collected from 115 authors, which were divided into test, background and development databases. Three different sample sizes of 500, 1500 and 2500 words were used to investigate how sensitive the performance is to sample size. Results show that, regardless of sample size, the GMM-UBM system performed better than the MVKD system with respect to both validity (= accuracy), measured by the log-likelihood-ratio cost (Cllr), and reliability (= precision), measured by the 95% credible interval (CI). | |
dc.publisher | Queensland University of Technology | |
dc.relation.ispartofseries | Australasian Language Technology Association Workshop 2013 | |
dc.rights | Author/s retain copyright | |
dc.source | Proceedings of the Workshop | |
dc.source.uri | http://www.alta.asn.au/events/alta2013 | |
dc.title | A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model | |
dc.type | Conference paper | |
local.description.notes | Imported from ARIES | |
local.description.refereed | Yes | |
dc.date.issued | 2013 | |
local.identifier.absfor | 209999 - Language, Communication and Culture not elsewhere classified | |
local.identifier.ariespublication | u4455832xPUB84 | |
local.type.status | Published Version | |
local.contributor.affiliation | Ishihara, Shunichi, College of Asia and the Pacific, ANU | |
local.bibliographicCitation.startpage | 71 | |
local.bibliographicCitation.lastpage | 79 | |
dc.date.updated | 2020-12-20T07:32:10Z | |
dcterms.accessRights | Open Access | |
Collections | ANU Research Publications |
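
The abstract names the log-likelihood-ratio cost (Cllr) as its validity metric but this record does not spell it out. As a minimal sketch, assuming the standard definition of Cllr (the mean of log2(1 + 1/LR) over same-author comparisons and of log2(1 + LR) over different-author comparisons, averaged together), it could be computed as below. The function name `cllr` and all LR values are hypothetical illustrations, not figures or code from the paper.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost under the standard definition (assumed here).

    same_author_lrs: LRs from comparisons known to share an author.
    different_author_lrs: LRs from comparisons known to have different authors.
    Lower is better; a system that always outputs LR = 1 scores exactly 1.
    """
    # Penalty for same-author comparisons grows as the LR falls below 1.
    same_cost = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    same_cost /= len(same_author_lrs)
    # Penalty for different-author comparisons grows as the LR rises above 1.
    diff_cost = sum(math.log2(1 + lr) for lr in different_author_lrs)
    diff_cost /= len(different_author_lrs)
    return 0.5 * (same_cost + diff_cost)


# Made-up LRs for illustration only.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])
informative = cllr([100.0, 50.0], [0.01, 0.02])
print(uninformative, informative)
```

Under this definition, values well below 1 indicate that the LRs carry useful information, while values near or above 1 indicate an uninformative or misleading system.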
Download
File | Description | Size | Format | Image |
---|---|---|---|---|
01_Ishihara_A_Comparative_Study_of_Two_2013.pdf | | 1.28 MB | Adobe PDF | |