TY - JOUR
T1 - ProSynth: An Integrated Prosodic Approach to Device-Independent, Natural-Sounding Speech Synthesis
AU - Ogden, Richard
AU - Hawkins, Sarah
AU - House, Jill
AU - Huckvale, Mark
AU - Local, John
AU - Carter, Paul
AU - Dankovičová, Jana
AU - Heid, Sebastian
PY - 2000
Y1 - 2000
AB - This paper outlines ProSynth, an approach to speech synthesis which takes a rich linguistic structure as central to the generation of natural-sounding speech. We start from the assumption that the acoustic richness of the speech signal reflects linguistic structural richness and underlies the percept of naturalness. Naturalness achieved by paying attention to systematic phonetic detail in the spectral, temporal and intonational domains produces a perceptually robust signal that is intelligible in adverse listening conditions. ProSynth uses syntactic and phonological parses to model the fine acoustic–phonetic detail of real speech. We present examples of our approach to modelling systematic segmental, temporal and intonational detail and show how all are integrated in the prosodic structure. Preliminary tests to evaluate the effects of modelling systematic fine spectral detail, timing, and intonation suggest that the approach increases intelligibility and naturalness.
U2 - 10.1006/csla.2000.0141
DO - 10.1006/csla.2000.0141
M3 - Article
VL - 14
SP - 177
EP - 210
JO - Computer Speech and Language
JF - Computer Speech and Language
ER -