Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh

Research output: Contribution to journal (Article)

Publication details

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Date accepted/in press: 31 Oct 2017
Date published (current): 17 Nov 2017
Issue number: 2
Volume: 26
Number of pages: 13
Pages (from-to): 243-255
Original language: English

Abstract

Articulatory speech synthesis has the potential to offer more natural sounding synthetic speech than established concatenative or parametric synthesis methods. Time-domain acoustic models are particularly suited to the dynamic nature of the speech signal, and recent work has demonstrated the potential of dynamic vocal tract models that accurately reproduce the vocal tract geometry. This paper presents a dynamic 3D digital waveguide mesh (DWM) vocal tract model, capable of movement to produce diphthongs. The technique is compared to existing dynamic 2D and static 3D DWM models, for both monophthongs and diphthongs. The results indicate that the proposed model provides improved formant accuracy over existing DWM vocal tract models. Furthermore, the computational requirements of the proposed method are significantly lower than those of comparable dynamic simulation techniques. This paper represents another step toward a fully functional articulatory vocal tract model which will lead to more natural speech synthesis systems for use across society.

Bibliographical note

© 2017 IEEE. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Research areas

Speech synthesis, digital waveguide mesh, diphthongs, numerical acoustic modeling
