Voice quality in telephone speech: Comparing acoustic measures between VoIP telephone and high-quality recordings

Chenzi Xu, Jessica Hazel Wormald, Paul Foulkes, Philip Harrison, Vincent Hughes, Poppy Welch, Finnian Kelly, David van der Vloed

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Implementing objective voice quality analysis in a forensic context is challenging. Forensic samples often involve telephone transmission, yet little is known about the impact of telecommunication channels on the acoustic measures of voice quality. This study compares the acoustics of laryngeal voice qualities (breathy, creaky, and modal) in controlled production of continuous English speech under two recording conditions: studio (headband microphone) and VoIP (simultaneously over a telephone line). A wide range of voice quality measures were extracted, including spectral tilts, harmonics-to-noise ratios, cepstral peak prominence (CPP), f0, and formants. Through comparative acoustic and linear discriminant analysis, this study identifies measures susceptible to recording conditions and those that robustly contribute to the differentiation of voice qualities in telephone recordings. Harmonic amplitudes H1H2c and H1c, CPP, and f0 are the most reliable voice quality measures across studio and VoIP conditions.
Original language: English
Title of host publication: Proceedings of Interspeech
Place of Publication: Kos, Greece
Pages: 1570-1574
Number of pages: 5
Publication status: Published - 5 Sept 2024
Event: Interspeech 2024 - Kos, Greece
Duration: 1 Sept 2024 to 5 Sept 2024

Conference

Conference: Interspeech 2024
Country/Territory: Greece
City: Kos
Period: 1/09/24 to 5/09/24
