The Subglottal Resonance: Research and Applications

Project Summary

Over the past few decades, speech-processing research has focused on extracting reliable acoustic features for applications such as automatic speech recognition, speaker identification, and speech coding. These features are related either to the vocal tract (the filter) or to the glottal airflow (the source) that drives it. Although the mechanics of the supraglottal (above the glottis) system are well understood, the subglottal (below the glottis) system and its properties have not been explored in great detail.

Unlike the supraglottal tract, the configuration of the subglottal system remains fairly constant during speech production, so its acoustic properties change little while a person is speaking. In particular, its resonant frequencies, through subtle interactions with the speech signal, are believed to have the potential both to minimize acoustic differences among speakers and to provide valuable information about a speaker's identity.

The major goals of this project are as follows.

(1) To collect a large database of simultaneously recorded speech and subglottal acoustics (for adults as well as children).

(2) To analyze the data and understand the significance of subglottal acoustics (subglottal resonances or SGRs, in particular) in speech production, perception and technology.

(3) To develop improved models of the subglottal system based on analytical findings.

(4) To develop automatic algorithms for the estimation of SGRs and other subglottal parameters from speech signals, with application to speech-technology areas such as automatic speech recognition (ASR), body-height estimation and speaker identification/verification.

(5) To design and conduct speech-perception experiments that help clarify the role of SGRs in talker normalization by humans.

Work supported by NSF Grant No. 0905381.

Keywords

Subglottal Resonances, Acoustic Model, Speaker Normalization and Identification.

Project References

Jinxi Guo, Angli Liu, Harish Arsikere, Abeer Alwan and Steven M. Lulich, "The relationship between the second subglottal resonance and vowel class, standing height, trunk length, and F0 variation for Mandarin speakers," Interspeech 2014, accepted.

Harish Arsikere, H.A. Gupta and Abeer Alwan, "Speaker recognition via fusion of subglottal features and MFCCs," Interspeech 2014, accepted.

Harish Arsikere and Abeer Alwan, "Frequency warping using subglottal resonances: complementarity with VTLN and robustness to additive noise", ICASSP 2014, pp. 6354–6358.

Harish Arsikere, Steven M. Lulich and Abeer Alwan, "Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs," IEEE Signal Processing Letters, Vol. 21, Issue 2, pp. 159-162.

Harish Arsikere, Steven M. Lulich and Abeer Alwan, "Non-linear frequency warping for VTLN using subglottal resonances and the third formant frequency," ICASSP 2013, pp. 7922-7926.

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich, and Abeer Alwan, "Automatic estimation of the first three subglottal resonances from adults’ speech signals with application to speaker height estimation," Speech Communication, Vol. 55, pp. 51-70, 2013.

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich and Abeer Alwan, "Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions," Interspeech 2012.

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich and Abeer Alwan, "Automatic height estimation using the second subglottal resonance", ICASSP 2012, pp. 3989-3992.

S. Lulich, A. Alwan, H. Arsikere, J. Morton, and M. Sommers, "Resonances and wave propagation velocity in the subglottal airways", Journal of the Acoustical Society of America, Volume 130, Issue 4, pp. 2108-2115, 2011.

S. Lulich, H. Arsikere, J. Morton, G. Leung, A. Alwan, and M. Sommers, "Analysis and automatic estimation of children's subglottal resonances," Interspeech 2011, pp. 2817-2820.

Harish Arsikere, Steven Lulich, and Abeer Alwan, "Automatic Estimation of the First Subglottal Resonance," Journal of the Acoustical Society of America (Express Letters), Vol. 129, Issue 5, pp. 197-203, May 2011.

Harish Arsikere, Steven Lulich, and Abeer Alwan, "Automatic Estimation of the Second Subglottal Resonance from Natural Speech," ICASSP 2011, pp. 4616-4619.

S. Wang, S. Lulich, and A. Alwan, "Automatic detection of the second subglottal resonance and its application to speaker normalization," J. Acoust. Soc. Am., Volume 126, Issue 6, pp. 3268-3277, 2009.

S. Wang, Y.-H. Lee and A. Alwan, "Bark-shift based nonlinear speaker normalization using the second subglottal resonance," Interspeech 2009, pp. 1619-1622.

S. Wang, S.M. Lulich, and A. Alwan, "A reliable technique for detecting the second subglottal resonance and its use in cross-language speaker adaptation," Interspeech 2008, pp. 1717-1720.

S. Wang, A. Alwan, and S. Lulich, "Speaker Normalization Based on Subglottal Resonances," ICASSP 2008, pp. 4277-4280.

Abeer Alwan (alwan@seas.ucla.edu)