I received my B.S. degree from the Department of Electronic Engineering, Tsinghua University, Beijing, P.R. China, in 2008. Currently, I am a Ph.D. candidate in Prof. Abeer Alwan's speech lab.
My research interests include speech production system modeling, voice source estimation, voice quality analysis, signal processing for clinical assessment, speech recognition, and speech activity detection. Current research projects include modeling the glottal source signal, synthesizing natural voices using the glottal model, and perceptual validation.

CV download.


Publications

Journal Publications

Gang Chen, Jody Kreiman, Abeer Alwan, "The glottaltopogram: a method of analyzing high-speed images of the vocal folds", Computer Speech and Language, in press. [link to the journal article] [toolkit].

Gang Chen, Jody Kreiman, Bruce Gerratt, Juergen Neubauer, Yen-Liang Shue, and Abeer Alwan, "Development of a glottal area index that integrates glottal gap size and open quotient," Journal of the Acoustical Society of America, Vol. 133, Issue 3, March 2013, pp. 1656–1666. [link to the journal article]

Jody Kreiman, Yen-Liang Shue, Gang Chen, Markus Iseli, Bruce R. Gerratt, Juergen Neubauer, and Abeer Alwan, "Variability in the relationships among voice quality, harmonic amplitudes, open quotient, and glottal area waveform shape in sustained phonation," Journal of the Acoustical Society of America, Vol. 132, Issue 4, 2012, pp. 2625-2632. [link to the journal article]

Conference Papers

G. Chen, M. Garellek, J. Kreiman, B. R. Gerratt, A. Alwan, "A perceptually and physiologically motivated voice source model", Interspeech 2013, pp. 2001-2005. [Best student paper award finalist] [slides and audio samples]

G. Chen, R. A. Samlan, J. Kreiman, A. Alwan, "Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopy", Interspeech 2013, pp. 3216-3220. [poster]

Jody Kreiman, Marc Garellek, Gang Chen, Abeer Alwan, and Bruce R. Gerratt, "Perceptual evaluation of source models," International Conference on Voice Physiology and Biomechanics, 2012.

G. Chen, Y.-L. Shue, J. Kreiman, and A. Alwan, "Estimating the voice source in noise", Interspeech 2012.

G. Chen, J. Kreiman, and A. Alwan, "The Glottaltopogram: A Method of Analyzing High-Speed Images of the Vocal Folds," ICASSP 2012, pp. 3985-3988. [toolkit]

G. Chen, J. Kreiman, Y.-L. Shue, and A. Alwan, "Acoustic Correlates of Glottal Gaps," Interspeech 2011, pp. 2673-2676.

Y.-L. Shue, G. Chen, and A. Alwan, "On the Interdependencies between Voice Quality, Glottal Gaps, and Voice-Source related Acoustic Measures," Interspeech 2010, pp. 34-37.

G. Chen, X. Feng, Y.-L. Shue, and A. Alwan, "On Using Voice Source Measures in Automatic Gender Classification of Children's Speech," Interspeech 2010, pp. 673-676.


Research Experience

Speech Processing and Audio Perception Lab, UCLA 09/2008 - present
Research Assistant, Advisor: Prof. Abeer Alwan
       -  Applied voice source measures to automatic gender classification of children's speech. Implemented an SVM classifier based on fundamental frequency and formant frequencies, augmented with voice-source-related features; classification accuracy improved by 4.4% on average across all age groups (ages 8-17). (A simplified sketch of such a classifier follows this list.)
       -  Analyzed high-speed image data of the vocal folds across various voice qualities. Proposed a new voice source signal model of the human speech production system and applied it to automatic voice source estimation from the acoustic speech signal in noise. The proposed method outperformed state-of-the-art source estimation algorithms.
       -  Developed a statistical method, the "glottaltopogram," to automatically visualize and analyze high-speed video recordings of the vocal folds. The proposed method can automatically locate problematic regions of the laryngeal area for clinical assessment.
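
The following is a minimal sketch of the gender-classification setup described in the first item above, assuming per-utterance features (F0, formants, and a voice-source measure such as H1-H2) have already been extracted. The placeholder data, feature set, and SVM settings are illustrative only, not the original implementation.

```python
# Minimal sketch (not the original implementation): an SVM gender classifier
# trained on per-utterance acoustic features. Feature extraction is assumed to
# happen elsewhere; X holds fundamental frequency (F0), formants (F1-F3), and a
# voice-source measure (e.g., H1-H2) per utterance, y holds speaker gender.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # placeholder features: [F0, F1, F2, F3, H1-H2]
y = rng.integers(0, 2, size=200)   # placeholder labels: 0 = male, 1 = female

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy: %.3f" % scores.mean())
```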

Internship Experience

• Qualcomm, San Diego, CA.   Jun 2013-Sep 2013
Interim Engineering Intern 

    -Audio systems design and implementation.


• Signal processing group, Starkey Lab, Eden Prairie, MN.   Jun 2012-Sep 2012
Summer intern, Mentor: Dr. Ivo Merks

    -Speech dereverberation for hearing aid applications. Applied a statistical room acoustic model to estimate the reverberation time, then estimated the late reflections and removed them via spectral subtraction to enhance the speech signal (sketched below). The proposed algorithm reliably estimates the reverberation time in noise with low complexity suitable for hearing aid devices.
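
A minimal sketch of the spectral-subtraction step referenced above, assuming an STFT front end and a generic exponential-decay room model parameterized by T60; the function name, constants, and gain floor are illustrative placeholders rather than the Starkey implementation.

```python
# Hedged sketch: suppress late reverberation by modeling its power as a
# delayed, exponentially attenuated copy of the reverberant speech power and
# subtracting it in the spectral domain.
import numpy as np

def suppress_late_reverb(stft_frames, t60=0.5, frame_shift=0.008,
                         late_start=0.05, floor=0.1):
    """stft_frames: (num_frames, num_bins) complex STFT of reverberant speech."""
    decay = 3.0 * np.log(10.0) / t60              # energy decay rate from T60
    delay = int(round(late_start / frame_shift))  # frames until the "late" part begins
    power = np.abs(stft_frames) ** 2
    out = stft_frames.copy()
    for n in range(delay, stft_frames.shape[0]):
        # late-reverb power: delayed frame attenuated by the decay model
        late = np.exp(-2.0 * decay * late_start) * power[n - delay]
        gain = np.maximum(1.0 - late / np.maximum(power[n], 1e-12), floor)
        out[n] = np.sqrt(gain) * stft_frames[n]   # spectral-subtraction gain
    return out
```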

• Speech group, Disney Research, Pittsburgh, PA.   Jun 2011-Sep 2011
Summer intern, Mentors: Dr. Kenichi Kumatani and Dr. John McDonough

    -Developed algorithms for multiple-speaker voice activity detection in Python and C++. Integrated this front-end processing into the speech recognition system of an interactive game for multiple children, a prototype developed for Disneyland theme parks.

• Hardware group, 3M Cogent, Pasadena, CA.         Jul 2010-Sep 2010
Summer intern, Mentor: Dr. Charley Lu   

    -Developed algorithms for fingerprint capture and enhancement on WinCE/Windows Mobile platforms. Designed the GUI for the mobile device.
    -Applied noise reduction to the front end of a speaker identification system on a mobile device. Detected and updated the noise spectrum in real time; designed filter banks and applied spectral subtraction to enhance the speech (sketched below).
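
A simplified sketch of the noise-reduction front end referenced above, assuming an STFT representation; the noise-only detector, smoothing constant, and over-subtraction factor are illustrative placeholders, not the values used in the project.

```python
# Hedged sketch: track the noise spectrum during (crudely detected) non-speech
# frames and subtract it from each frame's magnitude spectrum, keeping the
# noisy phase for resynthesis.
import numpy as np

def enhance(frames, alpha=0.9, oversub=2.0, floor=0.05):
    """frames: (num_frames, num_bins) complex STFT of noisy speech."""
    noise_psd = np.abs(frames[0]) ** 2                # initialize from the first frame
    enhanced = np.empty(frames.shape, dtype=complex)
    for n, frame in enumerate(frames):
        power = np.abs(frame) ** 2
        if power.mean() < 2.0 * noise_psd.mean():     # crude noise-only detector
            noise_psd = alpha * noise_psd + (1 - alpha) * power  # update noise estimate
        mag = np.sqrt(np.maximum(power - oversub * noise_psd,
                                 floor * power))      # spectral subtraction with floor
        enhanced[n] = mag * np.exp(1j * np.angle(frame))         # keep noisy phase
    return enhanced
```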

• EMC Beijing / Hong Kong, China   Jul 2007 - Sep 2007
Summer intern
    -Performed database management (MySQL) for projects with City University of Hong Kong.