ProSynth: An Integrated Prosodic Approach to Device-Independent, Natural-Sounding Speech Synthesis

Richard Ogden, Sarah Hawkins, Jill House, Mark Huckvale, John Local, Paul Carter, Jana Dankovicová, Sebastian Heid

Research output: Contribution to journal, Article, peer-review

Abstract

This paper outlines ProSynth, an approach to speech synthesis which takes a rich linguistic structure as central to the generation of natural-sounding speech. We start from the assumption that the acoustic richness of the speech signal reflects linguistic structural richness and underlies the percept of naturalness. Naturalness achieved by paying attention to systematic phonetic detail in the spectral, temporal and intonational domains produces a perceptually robust signal that is intelligible in adverse listening conditions. ProSynth uses syntactic and phonological parses to model the fine acoustic–phonetic detail of real speech. We present examples of our approach to modelling systematic segmental, temporal and intonational detail and show how all are integrated in the prosodic structure. Preliminary tests to evaluate the effects of modelling systematic fine spectral detail, timing, and intonation suggest that the approach increases intelligibility and naturalness.
Original language: English
Pages (from-to): 177-210
Number of pages: 34
Journal: Computer Speech and Language
Volume: 14
Publication status: Published - 2003
