Brian Strope


A model of dynamic auditory perception and its application to robust word recognition

B. Strope and A. Alwan

Abstract

This paper describes two mechanisms that augment the common ASR front end, providing adaptation and isolation of local spectral peaks. A dynamic model is proposed, consisting of a linear filter bank with a novel additive logarithmic adaptation stage after each filter output. An extensive series of perceptual forward masking experiments, together with previously reported forward masking data, determines the model's dynamic parameters. Once parameterized, the simple exponential dynamic mechanism predicts the nature of forward masking data from several studies across wide-ranging frequencies, input levels, and probe delay times. An initial evaluation of the dynamic model, together with a local peak isolation mechanism, as a front end for DTW and HMM word recognition systems shows an improvement in robustness to background noise when compared to MFCC, LPCC, and RASTA-based front ends.
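
To illustrate the general idea of an additive logarithmic adaptation stage with exponential dynamics, the following is a minimal sketch and not the model from the paper. It assumes a single filter-bank channel whose output is already expressed as a level in dB per frame, and it assumes a first-order exponential offset that tracks that level and is subtracted from it. The function names, the 10 ms frame period, the 100 ms time constant, and the masker and probe levels are all hypothetical values chosen for illustration; the paper derives its own parameters from forward masking data.

```python
import math


def adapt_log_channel(levels_db, frame_period_ms=10.0, time_constant_ms=100.0):
    """Apply a hypothetical additive adaptation stage to one channel.

    levels_db: per-frame log levels (dB) of one filter output.
    Returns the adapted levels: input level minus a slowly varying offset.
    The offset follows the input with first-order exponential dynamics
    (an assumption of this sketch, not the paper's parameterization).
    """
    alpha = 1.0 - math.exp(-frame_period_ms / time_constant_ms)
    offset_db = 0.0  # assume the channel starts adapted to a 0 dB floor
    adapted = []
    for level in levels_db:
        adapted.append(level - offset_db)         # additive in the log domain
        offset_db += alpha * (level - offset_db)  # exponential tracking
    return adapted


def probe_response(delay_frames, masker_db=80.0, probe_db=40.0, masker_frames=30):
    """Adapted level of a one-frame probe that follows a masker after a gap."""
    signal = [masker_db] * masker_frames + [0.0] * delay_frames + [probe_db]
    return adapt_log_channel(signal)[-1]


if __name__ == "__main__":
    for delay in (2, 5, 10, 20):
        print(delay, round(probe_response(delay), 1))
```

Under these assumptions the probe's adapted level is lowest immediately after the masker and recovers as the delay grows, which is the qualitative shape of forward masking that the abstract says the exponential mechanism predicts.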

Available in postscript:


bps@ucla.edu