“…Let us further denote by Θ(m, k) the underlying atomic audio source. Then, under the previous assumptions, the noisy speech signal at time-frequency point (m, k) can be modeled as: … We recognize here the weak-sparseness model [24] applied to speech processing, in the continuation of [18]. In summary, our model essentially assumes that the STFT of noisy speech signals satisfies the following three key properties in each time-frequency bin (m, k): (A'1): the presence/absence of speech ε(m, k) and the atomic speech audio source Θ(m, k) are independent, (A'2): the speech-presence probability does not exceed 1/2, (A'3): the instantaneous power of the random clean speech signal is upper-bounded by a finite value.…”
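
To make the three properties concrete, here is a minimal simulation sketch of a single time-frequency bin. It is not the authors' method. It assumes an additive form, noisy observation = ε·Θ + noise, which the excerpt does not spell out. The function name `simulate_weak_sparse_bin`, the parameters `p_presence`, `max_amp` and `noise_std`, the bounded-amplitude draw for Θ, and the circular complex Gaussian noise are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_weak_sparse_bin(n_frames, p_presence=0.3, max_amp=2.0,
                             noise_std=1.0, seed=0):
    """Draw samples for one time-frequency bin under a weak-sparseness sketch.

    Assumed (not from the source): observation y = eps * theta + noise.
    """
    # (A'2): the speech-presence probability must not exceed 1/2.
    if not 0.0 <= p_presence <= 0.5:
        raise ValueError("p_presence must lie in [0, 1/2]")

    rng = np.random.default_rng(seed)

    # (A'1): eps (presence/absence) is drawn independently of theta.
    eps = rng.random(n_frames) < p_presence

    # (A'3): |theta|^2 is upper-bounded by max_amp**2 (hypothetical bound).
    amp = max_amp * rng.random(n_frames)
    phase = rng.uniform(0.0, 2.0 * np.pi, n_frames)
    theta = amp * np.exp(1j * phase)

    # Assumed noise model: circular complex Gaussian with variance noise_std**2.
    noise = (noise_std / np.sqrt(2.0)) * (
        rng.standard_normal(n_frames) + 1j * rng.standard_normal(n_frames)
    )

    y = eps * theta + noise
    return y, eps, theta


if __name__ == "__main__":
    y, eps, theta = simulate_weak_sparse_bin(10_000)
    print("empirical presence rate:", eps.mean())
    print("max instantaneous speech power:", (np.abs(theta) ** 2).max())
```

In this sketch the empirical presence rate stays near `p_presence`, and the speech power never exceeds `max_amp**2`, mirroring (A'2) and (A'3). Independence (A'1) holds by construction because ε and Θ come from separate draws.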