“…Additionally, advances in technologies such as hearing aids require speech systems to enhance the perceptual quality of speech captured in adverse environmental conditions, thereby improving human hearing. Several deep learning (DL)-based speech enhancement systems have been developed that jointly improve perceptual quality and the performance of back-end speech and language applications, using fully convolutional networks (FCNs) and recurrent neural networks (RNNs) [9,10,11,12]. Most of these approaches operate on the complex short-time Fourier transform (STFT) of the distorted speech, either enhancing the log-power spectrum (LPS) and reusing the unaltered distorted phase [13,14,15,16,17], or estimating a complex ratio mask (cRM) [18,19,20] to enhance the complex spectrogram directly and restore a cleaner time-domain signal.…”
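The two enhancement strategies described above can be sketched numerically. The snippet below is a minimal illustration, not any of the cited systems: the "enhanced" LPS and the predicted cRM are stand-in values (a real system would produce them with a trained network). It shows that LPS enhancement changes only the magnitude while reusing the noisy phase, whereas a complex ratio mask modifies magnitude and phase together via a complex multiply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "noisy" complex STFT: one frame with 5 frequency bins (assumed values).
S_noisy = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# --- Approach 1: enhance the log-power spectrum, reuse the noisy phase ---
# A hypothetical enhancer would predict a cleaner LPS; here we simply
# attenuate every bin's power by a factor of 2 to stand in for its output.
lps_noisy = np.log(np.abs(S_noisy) ** 2 + 1e-12)
lps_enhanced = lps_noisy - np.log(2.0)          # stand-in "denoised" LPS
mag_enhanced = np.sqrt(np.exp(lps_enhanced))
phase_noisy = np.angle(S_noisy)
S_hat_lps = mag_enhanced * np.exp(1j * phase_noisy)

# The distorted phase is carried over untouched: only the magnitude changed.
assert np.allclose(np.angle(S_hat_lps), phase_noisy)

# --- Approach 2: complex ratio mask (cRM) ---
# A hypothetical network would predict a complex mask M per bin; applying it
# rescales the magnitude AND rotates the phase: S_hat = M * S_noisy.
M = 0.8 + 0.1j                                  # stand-in predicted mask
S_hat_crm = M * S_noisy
```

Inverting either estimated spectrogram with an inverse STFT would then yield the time-domain enhanced signal; the cRM route avoids the well-known mismatch of pairing a cleaned magnitude with a noisy phase.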