Projects - UCLA SPAPL

Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers
The goal of this project is to develop new spoken language processing technology that enables interactive dialog between children and a virtual agent to support literacy learning and assessment, with a focus on serving underrepresented communities. Many children who speak African American English (AAE) struggle with literacy, but the spoken language systems that could deliver effective interventions perform much worse with AAE speakers, who are seldom included in the samples used to train speech recognition or TTS systems. While our focus is on one dialect (AAE), the goal is to develop methods that can be applied to other dialects, so we focus on the scenario of learning from limited data. Since studies have shown that ASR performance on adult AAE is much worse than on General American English (GAE), and recognizing children’s speech is known to be more difficult than recognizing adults’ speech, our assessment of the technology’s impact on learning leverages a constrained dialog task, with initial experiments in a Wizard-of-Oz (WoZ) setting. (details)
Voice Source Project
In voiced speech, the vocal folds open and close quasi-periodically and thus convert the glottal air flow (air volume velocity) into a train of flow pulses, referred to as the voice source excitation signal. Early models of the source signal used a simple impulse train for modeling voiced excitation. None of these models has been calibrated with direct observations of glottal area changes, which are the proximal cause of the air pressure changes that we hear as sound. The effective study of the voice source thus requires both more accurate source models and a comprehensive set of underlying observations on which to base the models. The primary goal of the proposed research is to develop and evaluate a new, more powerful source model based on direct observations of vocal fold vibrations... (details)
The Subglottal Resonances: Research and Applications

Bird songs are important in the communication between birds of specific species. A bird can listen to other birds and classify them as conspecific or heterospecific, neighbor or stranger, mate or non-mate, kin or non-kin. It can also sing to other birds for mate attraction, danger alert, or territory defense. Behavioral and ecological studies could benefit from automatically detecting and identifying species from acoustic recordings.

Technologically Based Assessment of Language and Literacy (TBALL)

From MRI and Acoustic Data to Articulatory Synthesis

Speech Coding and Echo Cancellation for Wireless Communication

Designing high-quality speech coders and echo-cancellation schemes for wireless networks is challenging, since good quality must be maintained with low power consumption under time-varying channel conditions and limited bandwidth. The design should account for a number of parameters, such as bit rate, delay, power consumption, complexity, and quality of coded speech. Available bandwidth will depend on network protocols. Depending on the application, a set of parameters is optimized... (details)