We propose an efficient block least-squares (BLS) algorithm for acoustic echo cancellation. The high computational and memory requirements associated with a long room echo make the simple, gradient-based LMS filter a more acceptable commercial solution than a full-fledged LS canceler. However, the LMS echo canceler converges more slowly and has worse steady-state performance than its LS counterpart. In the proposed BLS approach, the autocorrelation and cross-correlation of the source and echo, required to solve the LS normal equations, are computed once per block using W s . With appropriate data windowing, the autocorrelation matrix is constrained to be Toeplitz, allowing the corresponding normal equations to be solved efficiently. The positive definiteness of the autocorrelation function eliminates the stability problems of other fast LS algorithms. BLS can reduce the echo residual to the level of the background noise, allowing a statistical near-end speech detector based on residual power to be devised. Performance in real environments is investigated under various settings of filter length, SNR, near-end speech presence, etc.
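The block-wise solve described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the autocorrelation (zero-padded window) method, computes the correlations by direct summation rather than by whatever per-block method the paper uses, and solves the Toeplitz normal equations with the standard Levinson recursion. All function names (`autocorrelation`, `cross_correlation`, `levinson_solve`, `block_ls_filter`) are hypothetical.

```python
def autocorrelation(x, n_taps):
    """r[k] = sum_n x[n] * x[n + k], with x treated as zero outside the block."""
    L = len(x)
    return [sum(x[n] * x[n + k] for n in range(L - k)) for k in range(n_taps)]


def cross_correlation(x, d, n_taps):
    """p[i] = sum_n d[n + i] * x[n], where d is the echo (microphone) signal."""
    L = len(x)
    return [
        sum(d[n + i] * x[n] for n in range(L) if n + i < len(d))
        for i in range(n_taps)
    ]


def levinson_solve(r, b):
    """Solve T h = b for symmetric Toeplitz T[i][j] = r[|i - j|] in O(N^2).

    Assumes T is positive definite, so r[0] > 0 and the divisions are safe.
    """
    n = len(r)
    f = [1.0 / r[0]]   # forward vector: T_k f = first unit vector
    h = [b[0] / r[0]]  # solution of the leading k-by-k system
    for k in range(1, n):
        eps_f = sum(r[k - i] * f[i] for i in range(k))
        eps_h = sum(r[k - i] * h[i] for i in range(k))
        denom = 1.0 - eps_f * eps_f
        fwd = f + [0.0]
        bwd = [0.0] + f[::-1]  # backward vector is the reversed forward vector
        f = [(fwd[i] - eps_f * bwd[i]) / denom for i in range(k + 1)]
        back = f[::-1]         # T_{k+1} back = last unit vector
        h = [hi + (b[k] - eps_h) * bi for hi, bi in zip(h + [0.0], back)]
    return h


def block_ls_filter(x, d, n_taps):
    """Estimate an n_taps echo-path filter from one block of source x and echo d."""
    r = autocorrelation(x, n_taps)
    p = cross_correlation(x, d, n_taps)
    return levinson_solve(r, p)
```

In this sketch the estimate is exact (up to rounding) when `d` is the full, noise-free convolution of `x` with a filter of at most `n_taps` taps; with noise or a truncated `d` it is only the least-squares fit under the assumed windowing.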
This paper proposes feature extraction methods for object classification with passive acoustic sensor networks deployed in (sub-)urban environments. We analyzed the acoustic signals emitted by three object classes: guns (muzzle blast), vehicles (running piston engine) and pedestrians (several footsteps). Based on this analysis, methods are developed to extract features that are related to the physical nature of the objects. In addition, localization methods are developed (e.g. pseudo-matched-filter), because the object location is required for one of the feature extraction methods. As a result, we developed a proof-of-concept system to record and extract discriminative acoustic features. The performance of the features and of the final classification is assessed with real measured data of the three object classes in a suburban environment.
The Minimum Classification Error / Generalized Probabilistic Descent (MCE/GPD) framework has been applied to several recognizer frameworks, such as hidden Markov models, prototype-based systems, and systems based on artificial neural networks. However, to our knowledge, the MCE/GPD framework has not yet been applied to a working online speech recognition system in a realistic application environment. Here we describe the application of MCE/GPD to a telephone-based multi-speaker speech recognition system that accepts spoken Japanese names and forwards calls to any of up to 400 staff members. Points of interest include the automatic collection and labeling of new training data and the use of MCE/GPD training to improve recognizer performance.