The influence of background data size on the performance of a score-based likelihood ratio system: A case of forensic text comparison

dc.contributor.author: Ishihara, Shunichi
dc.coverage.spatial: Virtual
dc.date.accessioned: 2024-04-23T23:11:43Z
dc.date.available: 2024-04-23T23:11:43Z
dc.date.created: 2020
dc.date.issued: 2020
dc.date.updated: 2022-12-25T07:16:58Z
dc.description.abstract: This study investigates the robustness and stability of a likelihood ratio–based (LR-based) forensic text comparison (FTC) system against the size of background population data. The focus is on a score-based approach for estimating authorship LRs. Each document is represented with a bag-of-words model, and the cosine distance is used as the score-generating function. A set of population data that differed in the number of scores was synthesised 20 times using the Monte Carlo simulation technique. The FTC system's performance with different population sizes was evaluated by a gradient metric of the log-LR cost (Cllr). The experimental results revealed two outcomes: 1) the score-based approach is rather robust against a small population size, in that, with the scores obtained from 40 to 60 authors in the database, the stability and the performance of the system become fairly comparable to those of the system with the maximum number of authors (720); and 2) poor performance in terms of Cllr, which occurred because of limited background population data, is largely due to poor calibration. The results also indicated that the score-based approach is more robust against data scarcity than the feature-based approach; however, this finding requires further study. (en_AU)
dc.format.mimetype: application/pdf (en_AU)
dc.identifier.uri: http://hdl.handle.net/1885/317053
dc.language.iso: en_AU (en_AU)
dc.provenance: Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. (en_AU)
dc.publisher: Australasian Language Technology Association (en_AU)
dc.relation.ispartofseries: The Australasian Language Technology Association Workshop 2020 (en_AU)
dc.rights: ACL materials are Copyright © 1963–2024 ACL; other materials are copyrighted by their respective copyright holders. (en_AU)
dc.rights.license: Creative Commons Attribution 4.0 International License (en_AU)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ (en_AU)
dc.source: The influence of background data size on the performance of a score-based likelihood ratio system: A case of forensic text comparison (en_AU)
dc.title: The influence of background data size on the performance of a score-based likelihood ratio system: A case of forensic text comparison (en_AU)
dc.type: Conference paper (en_AU)
dcterms.accessRights: Open Access (en_AU)
local.bibliographicCitation.lastpage: 11 (en_AU)
local.bibliographicCitation.startpage: 1 (en_AU)
local.contributor.affiliation: Ishihara, Shunichi, College of Asia and the Pacific, ANU (en_AU)
local.contributor.authoremail: u9504440@anu.edu.au (en_AU)
local.contributor.authoruid: Ishihara, Shunichi, u9504440 (en_AU)
local.description.notes: Imported from ARIES (en_AU)
local.description.refereed: Yes
local.identifier.absfor: 460404 - Digital forensics (en_AU)
local.identifier.absfor: 470403 - Computational linguistics (en_AU)
local.identifier.absfor: 460208 - Natural language processing (en_AU)
local.identifier.absseo: 220301 - Digital humanities (en_AU)
local.identifier.absseo: 220402 - Applied computing (en_AU)
local.identifier.absseo: 130202 - Languages and linguistics (en_AU)
local.identifier.ariespublication: u3391657xPUB216 (en_AU)
local.identifier.uidSubmittedBy: u3391657 (en_AU)
local.publisher.url: https://aclanthology.org/ (en_AU)
local.type.status: Published Version (en_AU)
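
Illustrative note: the abstract mentions a bag-of-words representation, the cosine distance as the score-generating function, and the log-LR cost (Cllr) as the evaluation metric. The following is a minimal Python sketch of those three pieces, not the paper's implementation. The whitespace tokenisation, the function names, and the example sentences are assumptions made for illustration; the Cllr function uses the commonly cited definition of the metric and expects likelihood ratios that have already been estimated from the scores (the score-to-LR step is not shown here).

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lower-cased, whitespace-separated tokens (assumed tokenisation)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def cllr(same_author_lrs, diff_author_lrs):
    """Log-LR cost for positive LRs from same- and different-author comparisons."""
    same_term = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs) / len(same_author_lrs)
    diff_term = sum(math.log2(1 + lr) for lr in diff_author_lrs) / len(diff_author_lrs)
    return 0.5 * (same_term + diff_term)


# Hypothetical usage: one score from two short texts, and Cllr for toy LR values.
score = cosine_distance(bag_of_words("the cat sat on the mat"),
                        bag_of_words("the dog sat on the log"))
cost = cllr([8.0, 20.0], [0.1, 0.05])
```

Under this definition, a system that always outputs LR = 1 (no information) has a Cllr of exactly 1, so values below 1 indicate that the LRs carry useful information.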

Downloads

Original bundle

Name: 2020.alta-1.3.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format