Thanks to Prof. Abeer Alwan's guidance and support, I received my PhD degree in Electrical Engineering at the end of 2011.

My gratitude also goes to Profs. Daniel Blumstein, Mihaela van der Schaar, and Kung Yao for serving on my doctoral committee. My dissertation, "Noise Robust Signal Processing for Human Pitch Tracking and Bird Song Classification and Detection," and defense slides are available for download (*) (**).

I work on speech processing, speech recognition, and language identification as a speech scientist at Voci Technologies. If you would like to start a discussion, please feel free to drop me a line:

(*) The statistical algorithm for F0 estimation (SAFE) toolkit is available for download from here. You are welcome to test it!
(**) The Rocky Mountain Biological Laboratory Robin song (RMBL-Robin) database is available for download from here. Feedback is welcome!


W. Chu and A. Alwan, “SAFE: a statistical approach to F0 estimation under clean and noisy conditions,” IEEE Trans. on Audio, Speech, and Language Processing, Vol. 20, No. 3, pp. 933-967, 2012. [slides] [toolkit] [toolkit's tutorial]

W. Chu and A. Alwan, “fbEM: a filter bank EM algorithm for the joint optimization of features and acoustic model parameters in bird call classification,” Interspeech 2012, pp. 1993-1996. [poster]

W. Chu and D.T. Blumstein, “Noise robust bird song detection using syllable pattern-based hidden Markov models,” ICASSP 2011, pp. 345-348. [poster] [database]

W. Chu and A. Alwan, "SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech," Interspeech 2010, pp. 2590-2593. [slides] [toolkit] [toolkit's tutorial]

W. Chu and A. Alwan, “A correlation-maximization denoising filter used as an enhancement frontend for noise robust bird call classification,” Interspeech 2009, pp. 2831-2834. [slides] [database]

W. Chu and A. Alwan, "Reducing F0 frame error of F0 tracking algorithms under noisy conditions with an unvoiced/voiced classification frontend," ICASSP 2009, pp. 3969-3972. [slides]

W. Chu and J. Liu, "Using Confidence Measures to Evaluate the Speaker Turns in Speaker Segmentation," Proc of Intl Conf on Information Sciences, Signal Processing and its Application (ISSPA07).

W. Chu and J. Liu, "Subband Energy Distance Measure Applied in Multi-Pass Speech/Non-Speech Discrimination," Proc of Intl Conf on Information Sciences, Signal Processing and its Application (ISSPA07).

W. Chu, X. Xiao, and J. Liu, "Confidence Score Based Unsupervised Incremental Adaptation for OOV Words Detection," Proc of Intl Workshops on Statistical Techniques in Pattern Recognition (SSSPR06), pp. 723-731.


Voci Technologies 01/2012 - present
Speech Scientist
– Speech processing, speech recognition, and language recognition.

Speech Processing and Audio Perception Lab, UCLA 09/2007 - 12/2011
Research Assistant, Advisor: Prof. Abeer Alwan
Noise robust F0 estimation and tracking
    * Proposed SAFE, a Statistical Algorithm for F0 Estimation under both clean and noisy conditions. The statistical framework shows promise in modeling the effect of noise on Prominent SNR Peaks in the spectra given F0. Working on incorporating statistical modeling of F0 transitions into SAFE to deliver an F0 tracker.
    * Proposed an error metric called F0 Frame Error, which combines Gross Pitch Error and Voice Decision Error, to compare the performance of F0 tracking algorithms in a unified framework. Used a statistical voiced/unvoiced classification frontend to reduce Voice Decision Errors under noisy conditions.
Bird song classification, recognition, and detection
    * Extended the EM algorithm to jointly estimate the optimal center frequencies and bandwidths of the filter bank used in cepstral feature extraction, together with the model parameters in bird call classification. Proposed an extended auxiliary function in which the feature extraction and model parameters are updated iteratively and alternately.
    * Used hierarchical clustering analysis to infer bird syllable patterns for finer acoustic modeling. Compared to using a single general pattern for all syllables, the syllable pattern-based HMM bird song detector achieves higher precision and recall rates. The algorithm is being ported to a hand-held device.
    * Proposed a correlation-maximization denoising filter for reducing non-periodic noise in bird calls, which have a periodic structure. Compared to the Wiener filter, features extracted from the output of the proposed filter resulted in a lower bird call classification error rate.

Speech Group, Disney Research, Pittsburgh 06/2010 - 09/2010
Summer Intern, Mentors: Dr. John McDonough and Prof. Bhiksha Raj
– Used microphone array processing and speech recognition technologies to build an interactive storytelling demo for children. Understood Acoustic Echo Cancellation and Weighted Finite State Transducer-based speech recognition. Learned how to collect, annotate, and maintain an audio-visual children's speech database.

Speech Lab, Rosetta Stone 06/2009 - 08/2009
Summer Intern, Mentors: Dr. Bryan Pellom and Dr. Kadri Hacioglu
– Developed statistical methods for deciding the pronunciation of a word. Understood the rule-based and maximum entropy criterion-based modeling techniques used in Machine Translation and applied them to Letter-To-Sound conversion. Wrote an A* search routine in C++.

Speech Group, Mitsubishi Electric Research Lab 06/2008 - 09/2008
Summer Intern, Mentor: Prof. Bhiksha Raj
– Developed a discriminative training module (lattice-based MMI) for the Sphinx speech recognizer. Also explored how initial model parameters can affect the final model parameters in an iterative learning process. Understood Maximum Likelihood estimation, the Baum-Welch algorithm, and the Extended Baum-Welch algorithm.

Speech Group, Microsoft Research Asia, Beijing 04/2007 - 08/2007
Summer Intern, Mentor: Dr. Chao Huang
– Built a demo for detecting acoustic events (speech, music, ring tone, background noise) in an office environment. Compared the effectiveness of noise robust features, of Gaussian mixture models and hidden Markov models, and of MAP and MLLR unsupervised adaptation. Learned how to manage job queues on computing clusters.

Microprocessor Tech Lab, Intel China Research Center, Beijing 07/2006 - 10/2006
Research Intern, Mentor: Dr. Wei Hu
– Built a demo for locating and tracking the voices of actors and actresses in TV series and movies. Used the Bayesian Information Criterion for unsupervised segmentation and clustering of speakers in the audio stream.

Tsinghua University, Beijing 09/2004 - 07/2007
Research Assistant, Advisor: Prof. Jia Liu
– Master's thesis work: implemented a real-time Speech-To-Text system with a non-speech input rejection frontend on a chip. Developed a non-speech removal frontend for the national '863' and '242' keyword spotting evaluations.

UFIDA Software Corp., Beijing 02/2004 - 06/2004
Software Intern, Supervisor: Mr. Yu Zhu
– Bachelor's thesis work: created the index of the digital map for an on-vehicle GPS software system.