Extracting amplitude modulations from speech in the time domain

Research output: Contribution to journal, article, peer-reviewed


Natural sounds can be characterised by patterns of changes in loudness (amplitude modulations), and human speech perception studies have focused on the low frequencies contained in the gross temporal structure of speech. Low-pass filtering the temporal envelopes of sub-band filtered speech maintains intelligibility, but it remains unclear how the human auditory system could perform such a modulation-domain analysis, or whether it does so at all. It is difficult to further manipulate amplitude modulations through frequency-domain filtering to investigate cues the system may use. The current work focuses on a time-domain decomposition of filter output envelopes into pulses of amplitude modulation. The technique demonstrates that signals low-pass filtered in the modulation domain maintain bursts of energy that are comparable to those that can be extracted entirely within the time domain. This paper presents preliminary work suggesting that a time-domain approach, which focuses on the instantaneous features of transient changes in loudness, can be used to study the content of human speech. This approach should be pursued as it allows human speech intelligibility mechanisms to be investigated from a new perspective.
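As a rough illustration of what "low-pass filtering the temporal envelope" of one sub-band means, the sketch below extracts a slow envelope from an amplitude-modulated tone. It is not the decomposition method of the paper. The toy signal stands in for a single filter output, and the rectify-and-smooth envelope, the one-pole filter, the function names and all parameter values (8 kHz sampling rate, 1 kHz carrier, 4 Hz modulation, 32 Hz cutoff) are assumptions chosen for the example.

```python
import math


def one_pole_lowpass(x, cutoff_hz, fs):
    """First-order IIR low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for sample in x:
        state += a * (sample - state)
        y.append(state)
    return y


def envelope(x, cutoff_hz, fs):
    """Full-wave rectify, then low-pass filter to keep only slow modulations."""
    return one_pole_lowpass([abs(s) for s in x], cutoff_hz, fs)


# Hypothetical stand-in for one sub-band of speech: a 1 kHz carrier whose
# loudness rises and falls four times per second.
fs = 8000            # sampling rate in Hz (assumed)
carrier_hz = 1000.0  # carrier frequency in Hz (assumed)
mod_hz = 4.0         # modulation rate in Hz (assumed)

signal = []
for n in range(fs):  # one second of samples
    t = n / fs
    modulator = 0.5 * (1.0 - math.cos(2.0 * math.pi * mod_hz * t))
    signal.append(modulator * math.sin(2.0 * math.pi * carrier_hz * t))

# Keep only modulations well below the carrier rate.
env = envelope(signal, cutoff_hz=32.0, fs=fs)

# The envelope is high near a loudness peak (t = 0.125 s) and close to zero
# near a trough (t = 0.25 s), tracing one burst of energy per modulation cycle.
peak_value = env[int(0.125 * fs)]
trough_value = env[int(0.25 * fs)]
```

In this toy case the smoothed envelope rises and falls once per modulation cycle, which is the kind of burst of energy the abstract refers to. A full analysis would first split the speech into several frequency bands and process each band's envelope separately.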

Keywords: Speech; Amplitude modulation; Vocoder; Intelligibility
Original language: English
Pages (from-to): 903-913
Number of pages: 11
Journal: Speech Communication
Issue number: 6
Early online date: 17 Mar 2011
Publication status: Published - 1 Jul 2011
