On noise masking for automatic missing data speech recognition: a survey and discussion

2007. On noise masking for automatic missing data speech recognition: a survey and discussion

In order to restrict the range of methods, only the techniques using a single microphone are considered.

A similar principle is used in a few other robust methods, such as uncertainty decoding or detection-based recognition.

These so-called missing data masks play a fundamental role in missing data recognition, and their quality has a strong impact on the final recognition accuracy.

Although in theory a mask can be defined for any parameterization domain, in practice, such a domain should map distinct frequency bands onto different feature space dimensions, so that a frequency-limited noise only affects a few dimensions of the feature space. This is typically true for frequency-like and wavelet-based parameterizations and for most auditory-inspired analysis front-ends. This is not the case in the cepstral domain, where each coefficient combines energy from all frequency bands.
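The point about the cepstrum can be illustrated with a small sketch. It assumes cepstral coefficients are obtained as an unnormalized DCT-II of log filter-bank energies; the eight band values and the size of the noise bump are made up for illustration and do not come from the paper.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a list of log filter-bank energies."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
        for k in range(n)
    ]

# Hypothetical clean log-energies in 8 frequency bands (illustrative values).
clean = [2.0, 2.5, 3.0, 2.8, 2.2, 1.9, 1.5, 1.2]

# Frequency-limited noise: only band 3 is corrupted.
noisy = list(clean)
noisy[3] += 4.0

# Dimensions that changed in the spectral domain.
spectral_changed = [j for j in range(8) if abs(noisy[j] - clean[j]) > 1e-9]

# Dimensions that changed after moving to the cepstral domain.
c_clean, c_noisy = dct2(clean), dct2(noisy)
cepstral_changed = [k for k in range(8) if abs(c_noisy[k] - c_clean[k]) > 1e-9]

print("spectral dims affected:", spectral_changed)
print("cepstral dims affected:", cepstral_changed)
```

In the spectral domain a mask only has to flag one dimension as unreliable. After the transform every coefficient carries part of the corruption, so no subset of dimensions can be kept as reliable.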

Several missing data studies have shown that soft masks give better results than hard masks.

When the mask is computed based on the local SNR given both the clean and noisy speech signals, the resulting mask is called an oracle mask.
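A minimal sketch of an oracle mask follows. It assumes the noise is additive and that its power spectrum is available separately from the clean one, as is the case when noisy speech is produced by artificially adding noise. The 0 dB threshold, the sigmoid form of the soft mask, and the names `oracle_masks`, `threshold_db` and `slope` are illustrative assumptions, not values or definitions taken from the paper.

```python
import math

def local_snr_db(clean_power, noise_power):
    """Local SNR in dB for one time-frequency cell (both powers must be > 0)."""
    return 10.0 * math.log10(clean_power / noise_power)

def oracle_masks(clean, noise, threshold_db=0.0, slope=1.0):
    """Return (hard, soft) masks from power spectra indexed as [frame][band].

    hard: 1 where the local SNR exceeds threshold_db, else 0.
    soft: sigmoid of the local SNR, a value in (0, 1) read as reliability.
    threshold_db and slope are assumed tuning parameters.
    """
    hard, soft = [], []
    for c_row, n_row in zip(clean, noise):
        snr = [local_snr_db(c, n) for c, n in zip(c_row, n_row)]
        hard.append([1 if s > threshold_db else 0 for s in snr])
        soft.append(
            [1.0 / (1.0 + math.exp(-slope * (s - threshold_db))) for s in snr]
        )
    return hard, soft

# Two frames, two bands, made-up power values.
clean = [[4.0, 1.0], [0.5, 8.0]]
noise = [[1.0, 2.0], [1.0, 1.0]]
hard, soft = oracle_masks(clean, noise)
print(hard)
```

The hard mask makes a binary reliable/unreliable decision per cell, while the soft mask keeps a graded value near the threshold.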

Two common techniques handle missing features during recognition: data imputation and data marginalization.
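The two techniques can be sketched for a single diagonal-covariance Gaussian, which is a simplifying assumption made here; real recognizers use mixtures and state-dependent models. With a diagonal covariance, integrating out the unreliable dimensions simply drops their terms from the log-likelihood, and the simplest imputation replaces them with the model mean. The function names and numbers below are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def marginal_loglik(obs, mask, means, variances):
    """Marginalization: score only the dimensions the mask marks reliable."""
    return sum(
        log_gauss(x, m, v)
        for x, r, m, v in zip(obs, mask, means, variances)
        if r
    )

def impute_mean(obs, mask, means):
    """Imputation: replace unreliable dimensions with the model mean."""
    return [x if r else m for x, r, m in zip(obs, mask, means)]

# Second dimension is dominated by noise and masked as unreliable.
obs = [1.0, 50.0]
mask = [1, 0]
means = [1.0, 0.0]
variances = [1.0, 1.0]

full = sum(log_gauss(x, m, v) for x, m, v in zip(obs, means, variances))
marginal = marginal_loglik(obs, mask, means, variances)
imputed = impute_mean(obs, mask, means)
print(full, marginal, imputed)
```

The full likelihood is dragged down by the corrupted dimension, whereas the marginal score depends only on the reliable one. Imputation instead returns a complete feature vector that can be passed to an unmodified recognizer.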

Training stochastic models of masks seems a promising approach to infer efficient masks.


