Non-negative matrix deconvolution in noise robust speech recognition

Hurmalainen, Antti; Gemmeke, Jort; Virtanen, Tuomas

Abstract

High noise robustness has been achieved in speech recognition by using sparse exemplar-based methods with spectrogram windows spanning up to 300 ms. A downside is that a large exemplar dictionary is required to cover sufficiently many spectral patterns and their temporal alignments within windows. We propose a recognition system based on a shift-invariant convolutive model, in which exemplar activations at all possible temporal positions jointly reconstruct an utterance. Recognition rates are evaluated on the AURORA-2 database, which contains spoken digits in conditions ranging from clean speech to -5 dB SNR. The results are superior to those of a system in which the activations are found independently for each overlapping window.
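
As an illustration of the shift-invariant convolutive model mentioned in the abstract, the following minimal sketch shows how a single activation matrix can reconstruct a whole utterance from multi-frame exemplars. It assumes the common non-negative matrix deconvolution form, in which the spectrogram is approximated by summing, over frame offsets t, the t-th frame of each exemplar multiplied by the activations shifted t frames to the right. The names `shift_right`, `reconstruct`, `W` and `H` are illustrative and not taken from the paper, and the sketch covers only the reconstruction step, not the sparse estimation of the activations or the recognition stage.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t frames, zero-filling on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def reconstruct(W, H):
    """Convolutive reconstruction of a spectrogram.

    W: array of shape (T, B, K), K exemplars of T frames and B spectral bands
       (hypothetical layout chosen for this sketch).
    H: array of shape (K, N), activation of each exemplar at each start frame.
    Returns an array of shape (B, N).
    """
    T = W.shape[0]
    return sum(W[t] @ shift_right(H, t) for t in range(T))

# One exemplar of 3 frames and 2 bands, activated once at frame 1.
W = np.arange(1, 7, dtype=float).reshape(3, 2, 1)
H = np.array([[0.0, 1.0, 0.0, 0.0, 0.0]])
V = reconstruct(W, H)
```

In this toy case the single non-zero activation places the full three-frame exemplar at frames 1 to 3 of the output, which is the sense in which one activation per temporal position replaces separate, independently solved windows.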

Keywords

non-negative matrix deconvolution; noise robustness; speech recognition

Year:
2011
Book title:
Proceedings of the International Conference on Acoustics, Speech and Signal Processing
Address:
Prague, Czech Republic
Organization:
IEEE Signal Processing Society
Month:
May