Abstract
In articulatory speech synthesis, the 3-D shape of the vocal tract for a particular speech sound is typically established by, for example, magnetic resonance imaging (MRI), and this shape is then used to model the acoustic output from the tract using numerical methods that operate in 1, 2 or 3 dimensions. The dimensionality strongly affects the overall computational complexity, which has a direct bearing on the quality of the synthesized speech output. The computational cost of 2-D digital waveguide modelling makes it a practical technique for real-time synthesis on an average PC at full (20 kHz) audio bandwidth. A 2-D Digital Waveguide Mesh (DWM), a technique also commonly used in room acoustic modelling, is therefore adopted for this work. The constrictions under consideration here include the full vocal tract closure associated with plosives (in English these are the consonants in ‘baa’, ‘pa’, ‘do’, ‘to’, ‘go’ and ‘coo’); all have a sudden release of acoustic energy, known as a ‘burst’, when the constriction is released. The centre frequency of the burst, which relates to the vocal tract shape during the plosive closure, is analysed because it is an important acoustic cue for consonant perception. In this work, all tract shapes are extracted from recorded MRI data.
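As a rough illustration of what a 2-D digital waveguide mesh computes, the sketch below uses the well-known finite-difference form of a rectilinear mesh with equal junction impedances: the next pressure at each junction is half the sum of its four neighbours minus its own value two steps earlier. This is not the paper's implementation. The function names (`dwm_step`, `simulate`), the grid size, the source position and the zero-pressure boundary are all assumptions made for illustration; a vocal tract model would additionally shape the mesh or its impedances from the MRI-derived tract geometry.

```python
import numpy as np


def dwm_step(p_now, p_prev):
    """Advance a rectilinear 2-D mesh by one time step (hypothetical helper).

    Assumes equal impedance at every junction, so the scattering update
    reduces to: p_next = 0.5 * (sum of 4 neighbours) - p_prev.
    """
    p_next = np.zeros_like(p_now)
    # Update interior junctions only; the outer ring is left at zero
    # pressure, which is a simplifying boundary assumption.
    p_next[1:-1, 1:-1] = 0.5 * (
        p_now[:-2, 1:-1]    # neighbour above
        + p_now[2:, 1:-1]   # neighbour below
        + p_now[1:-1, :-2]  # neighbour to the left
        + p_now[1:-1, 2:]   # neighbour to the right
    ) - p_prev[1:-1, 1:-1]
    return p_next


def simulate(shape=(9, 41), steps=50, source=(4, 1)):
    """Inject a unit pressure impulse and run the mesh for `steps` steps."""
    p_prev = np.zeros(shape)
    p_now = np.zeros(shape)
    p_now[source] = 1.0  # illustrative excitation point
    for _ in range(steps):
        p_now, p_prev = dwm_step(p_now, p_prev), p_now
    return p_now


if __name__ == "__main__":
    field = simulate()
    print(field.shape)
```

Each step touches every junction once, so the cost per output sample grows with the number of junctions in the mesh; this is the sense in which moving from 1-D to 2-D or 3-D raises the computational load.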
Original language | English |
---|---|
Title of host publication | Proceedings of the 2014 2nd IEEE China Summit & International Conference on Signal and Information Processing |
Publisher | IEEE |
Pages | 32-26 |
Number of pages | 5 |
ISBN (Print) | 978-1-4799-5401-8 |
Publication status | Published - 9 Jul 2014 |
Event | 2014 2nd IEEE China Summit & International Conference on Signal and Information Processing, China, 9-13 July 2014 - Xi'an, China; Duration: 9 Jul 2014 → 13 Jul 2014 |
Conference
Conference | 2014 2nd IEEE China Summit & International Conference on Signal and Information Processing, China, 9-13 July 2014 |
---|---|
Country/Territory | China |
City | Xi'an |
Period | 9/07/14 → 13/07/14 |