
R-Norm: Improving Inter-Speaker Variability Modelling at the Score Level via Regression Score Normalisation

Vandyke, David; Wagner, Michael; Goecke, Roland

Description

This paper presents a new method of score post-processing which utilises previously hidden relationships among client models and test probes that are found within the scores produced by an automatic speaker recognition system. We suggest the name r-Norm (for Regression Normalisation) for the method, which can be viewed both as a score normalisation process and as a novel and improved technique for modelling inter-speaker variability. The key component of the method lies in learning a regression...
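For orientation, the two baseline normalisations that the abstract compares r-Norm against, z-Norm and t-Norm, can be sketched as below. This is a minimal illustration of their standard textbook definitions, not code from the paper and not r-Norm itself. The function names, the `scores[m][p]` layout (model `m`, probe `p`), and the toy numbers are all assumptions made for the example.

```python
from statistics import mean, pstdev


def z_norm(scores, impostor_scores):
    """Zero normalisation: shift and scale each model's row.

    scores[m][p]       -- score of test probe p against client model m
    impostor_scores[m] -- development impostor scores against model m
    (both layouts are assumptions for this sketch)
    """
    normalised = []
    for row, impostors in zip(scores, impostor_scores):
        mu, sigma = mean(impostors), pstdev(impostors)
        normalised.append([(s - mu) / sigma for s in row])
    return normalised


def t_norm(scores, cohort_scores):
    """Test normalisation: shift and scale each probe's column.

    cohort_scores[p] -- scores of probe p against a cohort of impostor models
    """
    stats = [(mean(c), pstdev(c)) for c in cohort_scores]
    return [
        [(s - mu) / sigma for s, (mu, sigma) in zip(row, stats)]
        for row in scores
    ]


# Toy data: 2 client models x 2 test probes (made-up numbers).
scores = [[2.0, 4.0], [1.0, 3.0]]
impostor_scores = [[0.0, 2.0], [1.0, 3.0]]  # one list per model
cohort_scores = [[0.0, 2.0], [2.0, 6.0]]    # one list per probe

z_scores = z_norm(scores, impostor_scores)
t_scores = t_norm(scores, cohort_scores)
```

z-Norm uses statistics tied to each client model, while t-Norm uses statistics tied to each test probe. By contrast, the abstract describes r-Norm as learning a regression between development scores and an 'ideal' score matrix.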

dc.contributor.author: Vandyke, David
dc.contributor.author: Wagner, Michael
dc.contributor.author: Goecke, Roland
dc.coverage.spatial: Lyon France
dc.date.accessioned: 2015-12-10T23:22:56Z
dc.date.created: August 25-29 2013
dc.identifier.uri: http://hdl.handle.net/1885/66730
dc.description.abstract: This paper presents a new method of score post-processing which utilises previously hidden relationships among client models and test probes that are found within the scores produced by an automatic speaker recognition system. We suggest the name r-Norm (for Regression Normalisation) for the method, which can be viewed both as a score normalisation process and as a novel and improved technique for modelling inter-speaker variability. The key component of the method lies in learning a regression model between development data scores and an 'ideal' score matrix, which can either be derived from clean data or created synthetically. To generate scores for experimental validation of the proposed idea we perform a classic GMM-UBM experiment employing mel-cepstral features on the 1sp-female task of the NIST 2003 SRE corpus. The r-Norm results are compared with the standard score post-processing/normalisation methods t-Norm and z-Norm. The r-Norm method is shown to perform very strongly, improving the EER from 18.5% to 7.01%, significantly outperforming both z-Norm and t-Norm in this case. The baseline system performance was deemed acceptable for the aims of this experiment, which were focused on evaluating and comparing the performance of the proposed r-Norm idea.
dc.publisher: ISCA Speech Organisation
dc.relation.ispartofseries: 14th Annual conference of the International Speech Communication Association INTERSPEECH 2013
dc.rights: Author/s retain copyright
dc.source: Characterising Depressed Speech for Classification
dc.title: R-Norm: Improving Inter-Speaker Variability Modelling at the Score Level via Regression Score Normalisation
dc.type: Conference paper
local.description.notes: Imported from ARIES
local.description.refereed: Yes
dc.date.issued: 2013
local.identifier.absfor: 080602 - Computer-Human Interaction
local.identifier.ariespublication: u4334215xPUB1333
local.type.status: Published Version
local.contributor.affiliation: Vandyke, David, University of Canberra
local.contributor.affiliation: Wagner, Michael, College of Engineering and Computer Science, ANU
local.contributor.affiliation: Goecke, Roland, College of Engineering and Computer Science, ANU
local.bibliographicCitation.startpage: 1
local.bibliographicCitation.lastpage: 5
local.identifier.absseo: 970108 - Expanding Knowledge in the Information and Computing Sciences
local.identifier.absseo: 970111 - Expanding Knowledge in the Medical and Health Sciences
dc.date.updated: 2015-12-10T10:35:34Z
local.identifier.scopusID: 2-s2.0-84906249191
dcterms.accessRights: Open Access
Collections: ANU Research Publications

Download

File                                     Size        Format
01_Vandyke_R-Norm:_Improving_2013.pdf    286.76 kB   Adobe PDF
02_Vandyke_R-Norm:_Improving_2013.pdf    236.42 kB   Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
