The Improvement of Speech Visualisation and Its Application in Computer-Assisted Pronunciation Training
Abstract
Computer-Assisted Pronunciation Training (CAPT) is a methodology in which language learners acquire the pronunciation of a new language with support from computer software. With advances in computational linguistics, learners can use higher-quality software to support their pronunciation acquisition and receive precise feedback in the new language.
The research goal of this Master of Philosophy thesis in computer science was to design, develop, and evaluate a software prototype, as a complement to traditional in-class teaching, with which language learners can acquire new pronunciations in their desired languages. Unlike traditional language training, which relies solely on listening to acquire a new language, this study proposed CAPT software that visualises pronunciation via vowel space plots, helping language learners understand pronunciations in an easy-to-understand and engaging manner. A series of studies was conducted to pursue this research goal and evaluate the performance of the proposed prototype.
The overall purpose of these studies was (1) to provide a straightforward representation mechanism for learners to perceive pronunciations, (2) to personalise pronunciation support for different language learners, and (3) to verify the utility of the proposed mechanism to help language learners in their pronunciation acquisition and language practice.
The three studies in this thesis explored multiple methods, underpinned by the related literature, of presenting pronunciations to language learners, and chose vowel space plots as the tool to reflect the learner's input speech signal. Study 1 clarified the functionality of vowel space plots and verified the feasibility and accessibility of implementing them. Study 2 addressed personalised pronunciation support for speakers with different biological oral features. It proposed a new vowel normalisation algorithm that processes speakers' audio data to produce better vowel space plots. The processed vowel space plots were found to be more comparable with the provided gold standard and to reflect the speaker's pronunciation better. Study 3 focused on Human-Computer Interaction (HCI) between potential language learners and the proposed prototype. This user study characterised language learners' requirements in a realistic setting and positively influenced the development of the prototype.
The three principal contributions in this thesis were as follows:
(1) The prototype itself: the prototype allows users to practise their pronunciation independently with visual support and to compare their pronunciations with the provided gold standard.
(2) A new vowel normalisation algorithm for vowel space plot generation: the normalised vowel space plots were more accurate than the unprocessed plots.
(3) An HCI study for language education software development: a series of experimental evaluations with language students as participants provided a rich range of insights into the participating students' needs, the current performance of the prototype as a way to address some of those needs, and areas for future work. It also highlighted the benefits of using design science research and design thinking methodologies to design and develop successive iterations of improved CAPT technologies.
This thesis offers a possible direction for language education with computer assistance in the future. A better method to help language learners perceive pronunciations could be a potential goal for transdisciplinary work in computer science, language education and applied linguistics.
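As a point of reference for the vowel normalisation step mentioned under Study 2, the sketch below shows Lobanov (z-score) normalisation, a widely used existing technique. It is not the new algorithm proposed in this thesis, which the abstract does not specify. The function name and the formant values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def lobanov_normalise(formants):
    """Z-score one speaker's vowel formants (Lobanov-style normalisation).

    `formants` is a list of (F1, F2) pairs in Hz for a single speaker.
    Each formant is centred on that speaker's mean and divided by that
    speaker's standard deviation, so speakers with differently sized
    vocal tracts become comparable on the same vowel space plot.
    Assumption: population standard deviation is used here; some
    implementations use the sample standard deviation instead.
    """
    f1_values = [f1 for f1, _ in formants]
    f2_values = [f2 for _, f2 in formants]
    f1_mean, f1_sd = mean(f1_values), pstdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), pstdev(f2_values)
    if f1_sd == 0 or f2_sd == 0:
        raise ValueError("need at least two distinct values per formant")
    return [
        ((f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for f1, f2 in formants
    ]


if __name__ == "__main__":
    # Hypothetical (F1, F2) measurements in Hz, not data from the thesis.
    speaker = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]
    for f1_z, f2_z in lobanov_normalise(speaker):
        print(f"{f1_z:+.2f} {f2_z:+.2f}")
```

Because the values are scaled per speaker, two speakers whose formants differ only by a constant factor map to the same normalised coordinates, which is the kind of property that makes a learner's plot comparable with a gold standard.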
Restricted until
2024-06-03