Correlating cepstra with formant frequencies: Implications for phonetically-informed forensic voice comparison

dc.contributor.author: Hughes, Vincent
dc.contributor.author: Clermont, Frantz
dc.contributor.author: Harrison, Philip
dc.coverage.spatial: Shanghai, China (online)
dc.date.accessioned: 2024-01-05T02:55:43Z
dc.date.created: 25-29 October 2020
dc.date.issued: 2020
dc.date.updated: 2022-09-18T08:16:19Z
dc.description.abstract: A significant question for forensic voice comparison, and for speaker recognition more generally, is the extent to which different input features capture complementary speaker-specific information. Understanding complementarity allows us to make predictions about how combining methods using different features may produce better overall performance. In forensic contexts, it is also important to be able to explain to courts what information the underlying features are actually capturing. This paper addresses these issues by examining the extent to which MFCCs and LPCCs can predict F0, F1, F2, and F3 values using data extracted from the midpoint of the vocalic portion of the hesitation marker um for 89 speakers of standard southern British English. By-speaker correlations were calculated using multiple linear regression and performance was assessed using mean rho (ρ) values. Results show that the first two formants were more accurately predicted than F3 or F0. LPCCs consistently produced stronger correlations with the linguistic features than MFCCs, while increasing cepstral order up to 16 also increased the strength of the correlations. There was, however, considerable variability across speakers in terms of the accuracy of the predictions. We discuss the implications of these findings for forensic voice comparison. [en_AU]
dc.format.mimetype: application/pdf [en_AU]
dc.identifier.isbn: 9781713820697 [en_AU]
dc.identifier.uri: http://hdl.handle.net/1885/311183
dc.language.iso: en_AU [en_AU]
dc.publisher: International Speech Communication Association [en_AU]
dc.relation.ispartofseries: 21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020) [en_AU]
dc.rights: © 2020 ISCA [en_AU]
dc.source: 21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020): Cognitive Intelligence for Speech Processing [en_AU]
dc.title: Correlating cepstra with formant frequencies: Implications for phonetically-informed forensic voice comparison [en_AU]
dc.type: Conference paper [en_AU]
dcterms.accessRights: Free Access via publisher website [en_AU]
local.bibliographicCitation.lastpage: 1862 [en_AU]
local.bibliographicCitation.startpage: 1858 [en_AU]
local.contributor.affiliation: Hughes, Vincent, University of York [en_AU]
local.contributor.affiliation: Clermont, Frantz, College of Asia and the Pacific, ANU [en_AU]
local.contributor.affiliation: Harrison, Philip, University of York [en_AU]
local.contributor.authoremail: u3674215@anu.edu.au [en_AU]
local.contributor.authoruid: Clermont, Frantz, u3674215 [en_AU]
local.description.embargo: 2099-12-31
local.description.notes: Imported from ARIES [en_AU]
local.description.refereed: Yes
local.identifier.absfor: 470401 - Applied linguistics and educational linguistics [en_AU]
local.identifier.ariespublication: a383154xPUB16957 [en_AU]
local.identifier.doi: 10.21437/Interspeech.2020-2216 [en_AU]
local.identifier.scopusID: 2-s2.0-85098154787
local.identifier.uidSubmittedBy: a383154 [en_AU]
local.publisher.url: https://www.isca-speech.org/archive/interspeech_2020/hughes20_interspeech.html [en_AU]
local.type.status: Published Version [en_AU]

Downloads

Original bundle

Name: Correlating cepstra with formant frequencies.pdf
Size: 901.01 KB
Format: Adobe Portable Document Format