Formant dynamics and durations of um improve the performance of automatic speaker recognition systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We assess the potential improvement in the performance of automatic speaker recognition (ASR) systems based on mel-frequency cepstral coefficients (MFCCs) when linguistic-phonetic information is included. Likelihood ratios were computed using MFCCs together with the formant trajectories and durations of the hesitation marker um, extracted from recordings of male speakers of standard southern British English. Testing was run over 20 replications using randomised sets of speakers. System validity, measured by equal error rate (EER) and log-likelihood-ratio cost (Cllr), improved with the inclusion of um relative to the baseline ASR system in all 20 replications. These results support the growing integration of automatic and linguistic-phonetic methods in forensic voice comparison.
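The two validity metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cllr` and `eer` and the toy likelihood ratios are assumptions made for illustration. The sketch uses the standard definition of Cllr (the mean base-2 log cost over same-speaker and different-speaker comparisons) and a simple threshold sweep to approximate the EER.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost (Cllr); lower is better, 1.0 is uninformative."""
    # Same-speaker comparisons are penalised when the LR is small.
    ss_cost = sum(math.log2(1 + 1 / lr) for lr in same_speaker_lrs)
    ss_cost /= len(same_speaker_lrs)
    # Different-speaker comparisons are penalised when the LR is large.
    ds_cost = sum(math.log2(1 + lr) for lr in different_speaker_lrs)
    ds_cost /= len(different_speaker_lrs)
    return 0.5 * (ss_cost + ds_cost)


def eer(same_speaker_lrs, different_speaker_lrs):
    """Approximate equal error rate by sweeping every observed LR as a threshold."""
    best_gap, best_rate = None, None
    for threshold in sorted(set(same_speaker_lrs) | set(different_speaker_lrs)):
        # False rejection: a same-speaker pair scoring below the threshold.
        frr = sum(lr < threshold for lr in same_speaker_lrs) / len(same_speaker_lrs)
        # False acceptance: a different-speaker pair scoring at or above it.
        far = sum(lr >= threshold for lr in different_speaker_lrs) / len(different_speaker_lrs)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate


# Toy likelihood ratios (invented for illustration, not data from the study).
toy_same = [10.0, 100.0, 0.5]
toy_diff = [0.1, 0.01, 2.0]
print("Cllr:", cllr(toy_same, toy_diff))
print("EER:", eer(toy_same, toy_diff))
```

Under this sketch, a system whose likelihood ratios all equal 1 yields a Cllr of 1, and better-calibrated, more discriminating systems yield values closer to 0. Comparing these two numbers for a baseline system and a fused system across replications is one way to express the kind of improvement the abstract describes.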
Original language: English
Title of host publication: Proceedings of the 16th Australasian Conference on Speech Science and Technology (ASSTA)
Place of Publication: University of Western Sydney, Australia
Publication status: Published - 2016
