Abstract
Implementing objective voice quality analysis in a forensic context is challenging. Forensic samples often involve telephone transmission, yet little is known about the impact of telecommunication channels on acoustic measures of voice quality. This study compares the acoustics of laryngeal voice qualities (breathy, creaky, and modal) in controlled production of continuous English speech under two recording conditions: studio (headband microphone) and VoIP (recorded simultaneously over a telephone line). A wide range of voice quality measures was extracted, including spectral tilts, harmonics-to-noise ratios, cepstral peak prominence (CPP), f0, and formants. Through comparative acoustic analysis and linear discriminant analysis, this study identifies which measures are susceptible to recording conditions and which robustly contribute to the differentiation of voice qualities in telephone recordings. The harmonic amplitude measures H1H2c and H1c, together with CPP and f0, are the most reliable voice quality measures across the studio and VoIP conditions.
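For readers unfamiliar with spectral tilt measures, the following minimal sketch illustrates the idea behind an H1-H2 measure: the level difference, in dB, between the first and second harmonics of a voiced signal. It is not the extraction pipeline used in the study. It computes the uncorrected difference on a synthetic signal, whereas the "c" suffix in H1H2c conventionally denotes a version corrected for formant influence, which this sketch omits. The function names `harmonic_level_db` and `h1_minus_h2`, the search bandwidth, and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np


def harmonic_level_db(signal, sample_rate, freq_hz, search_hz=10.0):
    """Return the peak spectral level (dB) within +/- search_hz of freq_hz.

    Hypothetical helper for illustration only; no windowing or
    formant correction is applied.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = (freqs >= freq_hz - search_hz) & (freqs <= freq_hz + search_hz)
    peak = spectrum[band].max()
    return 20.0 * np.log10(peak)


def h1_minus_h2(signal, sample_rate, f0_hz):
    """Uncorrected H1-H2: level of the first harmonic minus the second."""
    h1 = harmonic_level_db(signal, sample_rate, f0_hz)
    h2 = harmonic_level_db(signal, sample_rate, 2.0 * f0_hz)
    return h1 - h2


# Synthetic "voice": f0 = 100 Hz, second harmonic at half the amplitude.
# One second at 16 kHz gives an integer number of periods, so the
# harmonics fall exactly on FFT bins.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = 1.0 * np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

# An amplitude ratio of 2 corresponds to 20*log10(2), about 6.02 dB.
print(round(h1_minus_h2(signal, sample_rate, 100.0), 2))
```

On real speech, f0 would have to be estimated first and the analysis done on short windowed frames. A telephone channel that attenuates low frequencies would alter the first harmonic's level directly, which is one reason such measures may differ between recording conditions.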
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech |
Place of Publication | Kos, Greece |
Pages | 1570-1574 |
Number of pages | 5 |
Publication status | Published - 5 Sept 2024 |
Event | Interspeech 2024 - Kos, Greece. Duration: 1 Sept 2024 → 5 Sept 2024 |
Conference
Conference | Interspeech 2024 |
---|---|
Country/Territory | Greece |
City | Kos |
Period | 1/09/24 → 5/09/24 |