To restrict the scope, only techniques using a single microphone are considered.
A similar principle is used in a few other robust methods, such as uncertainty decoding or detection-based recognition.
These so-called missing data masks play a fundamental role in missing data recognition, and their quality has a strong impact on the final recognition accuracy.
Although in theory a mask can be defined for any parameterization domain, in practice such a domain should map distinct frequency bands onto different feature-space dimensions, so that a frequency-limited noise affects only a few dimensions of the feature space. This is typically true for frequency-like and wavelet-based parameterizations, and for most auditory-inspired analysis front-ends, but it is not the case for the cepstrum, where each coefficient depends on all frequency bands.
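This locality argument can be illustrated with a minimal numerical sketch: a band-limited corruption of a log filter-bank frame stays confined to a few channels, but after the DCT that produces cepstral coefficients it spreads over every dimension. The frame size, corrupted band, and amplitude below are illustrative assumptions, not values from the text.

```python
import numpy as np

# Illustrative 20-channel log filter-bank frame (all values are assumptions).
n_channels = 20
clean = np.zeros(n_channels)     # clean log-energy frame, zeros for simplicity
noisy = clean.copy()
noisy[5:8] = 3.0                 # frequency-limited noise corrupts 3 adjacent bands

# In the filter-bank domain the corruption stays local:
corrupted_fb = np.flatnonzero(noisy != clean)          # 3 dimensions affected

# DCT-II, the transform that maps log filter-bank energies to cepstra:
def dct_ii(x):
    n = len(x)
    k = np.arange(n)
    return np.array([np.sum(x * np.cos(np.pi * (k + 0.5) * m / n))
                     for m in range(n)])

# In the cepstral domain the same corruption touches every coefficient:
corrupted_cep = np.flatnonzero(np.abs(dct_ii(noisy) - dct_ii(clean)) > 1e-9)
```

Here `corrupted_fb` contains 3 indices while `corrupted_cep` contains all 20, which is why missing data masks are usually defined in spectral rather than cepstral domains.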
Several missing data studies have shown that soft masks give better results than hard masks.
When the mask is computed from the local signal-to-noise ratio (SNR), given both the clean and noisy speech signals, the resulting mask is called an oracle mask.
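A minimal sketch of how hard and soft oracle masks can be derived from the local SNR follows; the function name, the 0 dB threshold, and the sigmoid slope are illustrative assumptions, not values prescribed by the text.

```python
import numpy as np

def oracle_masks(clean_power, noisy_power, snr_threshold_db=0.0, slope=0.5):
    """Hard (binary) and soft (sigmoid) oracle masks from the local SNR.

    clean_power, noisy_power: per time-frequency cell power estimates.
    Threshold and slope are illustrative assumptions.
    """
    # Estimate the local noise power; floor to avoid division by zero.
    noise_power = np.maximum(noisy_power - clean_power, 1e-12)
    local_snr_db = 10.0 * np.log10(np.maximum(clean_power, 1e-12) / noise_power)

    # Hard mask: binary reliable/unreliable decision per cell.
    hard = (local_snr_db > snr_threshold_db).astype(float)
    # Soft mask: graded reliability in (0, 1) via a sigmoid of the local SNR.
    soft = 1.0 / (1.0 + np.exp(-slope * (local_snr_db - snr_threshold_db)))
    return hard, soft

# Usage: cells with more speech than noise are marked reliable.
clean = np.array([4.0, 0.1, 2.0])
noisy = np.array([4.5, 3.0, 2.2])
hard, soft = oracle_masks(clean, noisy)
```

The soft mask retains graded reliability near the decision boundary, which is one plausible reason soft masks tend to outperform hard ones.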
Two common techniques exist for handling missing features during recognition: data imputation and data marginalization.
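The two strategies can be contrasted on a single diagonal-Gaussian state model (a one-component stand-in for the usual GMM states; all means, variances, and feature values below are illustrative assumptions).

```python
import numpy as np

# Illustrative diagonal Gaussian state model and one observed frame.
mean = np.array([0.0, 1.0, -1.0])
var  = np.array([1.0, 2.0, 0.5])
x    = np.array([0.2, 9.9, -0.8])      # dimension 1 is corrupted by noise
mask = np.array([True, False, True])   # True = reliable feature

def log_gauss(x, mean, var):
    # Per-dimension log density of a diagonal Gaussian.
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# Marginalization: integrate the unreliable dimensions out of the
# likelihood, which for a diagonal Gaussian means dropping their terms.
ll_marg = log_gauss(x[mask], mean[mask], var[mask]).sum()

# Imputation: reconstruct the unreliable dimensions first (here simply
# with the model mean), then score the full feature vector.
x_imputed = np.where(mask, x, mean)
ll_imp = log_gauss(x_imputed, mean, var).sum()
```

Imputation yields a complete feature vector usable by any downstream processing (e.g. cepstral transformation), while marginalization keeps the recognizer's likelihood computation consistent with the mask but constrains decoding to the masked domain.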
Training stochastic models of masks seems a promising approach for inferring efficient masks.