A Likelihood-Ratio Based Forensic Voice Comparison in Standard Thai

Authors

Pingjai, Supawan

Abstract

This research uses a likelihood ratio (LR) framework to assess the discriminatory power of a range of acoustic parameters extracted from speech samples produced by male speakers of Standard Thai. The thesis aims to answer two main questions: 1) how well the tested linguistic-phonetic segments of Standard Thai perform in forensic voice comparison (FVC); and 2) how such linguistic-phonetic segments can be profitably combined through logistic regression using the FoCal Toolkit (Brümmer, 2007). The segments examined in this study are the four consonants /s, tɕʰ, n, m/ and the two diphthongs [ɔi, ai].

First, using the alveolar fricative /s/, two different sets of features were compared in terms of their performance in FVC. The first comprised the spectrum-based distributional features of four spectral moments, namely mean, variance, skew and kurtosis; the second consisted of the coefficients of the Discrete Cosine Transform (DCTs) applied to a spectrum. As the DCTs were found to perform better, they were subsequently used to model the spectra of the remaining consonants. The consonant spectrum was extracted at the center point of each of the consonants /s, tɕʰ, n, m/ with a 31.25 ms Hamming window. For the diphthongs [ɔi] - [nɔi L] and [ai] - [mai HL], cubic polynomials fitted to the F2 trajectory and to the F1-F3 trajectories were tested separately. Quadratic polynomials fitted to the tonal F0 contours of [ɔi] - [nɔi L] and [ai] - [mai HL] were tested as well. Long-term F0 distribution (LTF0) was also trialed.

The results show the promising discriminatory power of the Standard Thai acoustic features and segments tested in this thesis. The main findings are as follows.

1. The fricative /s/ performed better with the DCTs (Cllr = 0.70) than with the spectral moments (Cllr = 0.92).
2. The nasals /n, m/ (Cllr = 0.47) performed better than the affricate /tɕʰ/ (Cllr = 0.54) and the fricative /s/ (Cllr = 0.70) when their DCT coefficients were parameterized.
3. F1-F3 trajectories (Cllr = 0.42 and Cllr = 0.49) outperformed the F2 trajectory (Cllr = 0.69 and Cllr = 0.67) for both diphthongs [ɔi] and [ai].
4. F1-F3 trajectories of the diphthong [ɔi] (Cllr = 0.42) outperformed those of [ai] (Cllr = 0.49).
5. Tonal F0 (Cllr = 0.52) outperformed LTF0 (Cllr = 0.74).
6. Overall, the best results were obtained when the DCTs of /n/ - [naː HL] and /n/ - [nɔi L] were fused (Cllr = 0.40, with the largest consistent-with-fact SSLog10LR = 2.53).

In light of these findings, we can conclude that Standard Thai is generally amenable to FVC, especially when linguistic-phonetic segments are combined; it is recommended that segments be combined in this way when dealing with forensically realistic casework.
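
The abstract reports every result as a Cllr value without defining the metric. Under its standard definition, the log-likelihood-ratio cost averages a logarithmic penalty over same-speaker and different-speaker comparisons; lower is better, and a system that always outputs LR = 1 (no information) scores exactly 1. The following is a minimal sketch of that standard formula only. The function name and the LR values are hypothetical illustrations, not data or code from the thesis, and the thesis itself may have computed the metric with other tooling.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost (standard definition).

    Both arguments are sequences of likelihood ratios (not log LRs),
    each strictly positive. Names are hypothetical, for illustration.
    """
    # Same-speaker comparisons are penalised when the LR is small.
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ss_cost /= len(same_speaker_lrs)

    # Different-speaker comparisons are penalised when the LR is large.
    ds_cost = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    ds_cost /= len(different_speaker_lrs)

    return 0.5 * (ss_cost + ds_cost)


# Hypothetical LRs, purely illustrative.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])       # every LR equals 1
well_behaved = cllr([100.0, 50.0], [0.01, 0.02])   # LRs point the right way
```

Read against this definition, the reported values between 0.40 and 0.92 all fall below the uninformative baseline of 1, with the fused system (0.40) furthest from it.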
