A comparison of machine learning algorithms and human listeners in the identification of phonemic contrasts

Authors

Reid, Paul
Gnevsheva, Ksenia
Suominen, Hanna

Publisher

The Australasian Speech Science and Technology Association, Inc.

Abstract

To elucidate the processes by which automatic speech recognition (ASR) architectures reach transcription decisions, our study compared human and ASR responses to stimuli with manipulated cues for stop manner (burst, silence, and vocalic onset) and voicing (voice onset time, aspiration amplitude, and vocalic onset). Fourteen participants and two ASR systems completed a forced-response identification task. Results indicated that the cues were perceptually significant for human participants and, though weighted differently, were also significant predictors of ASR output. This suggests that ASR systems may rely on the same key acoustic information as human listeners for phonemic classification.

Source

Proceedings of the 18th Australasian International Conference on Speech Science and Technology
