“…2 shows that the proposed approach improves the perceived speech quality of the separated signals compared to the competitive approaches. • STOI [41]: STOI is used in order to predict speech intelligibility in noisy conditions [11,42,44]. In fact, STOI is shown to have a good correlation coefficient with speech intelligibility (equal to 0.79), suggesting that it is a useful predictor of algorithm performance [46].…”
Section: Experiments 2: Evaluation Of the Whole Separation System (mentioning)
“…Also, it predicts sentence recognition [47] with clean phase, 2) conventional NMF [47], 3) NMF [47] with noisy phase, 4) the phase-aware approach in [34], and 5) the proposed approach [4], and 4) the proposed approach in noisy and challenging acoustic conditions. In addition, NISQI is proven to have the highest correlation score with subjective tests [11]. As they do not require the clean signal to assess the quality, NISQI and SII are appropriate to assess real-time separation performance.…”
Section: Experiments 3: Evaluation Of Real-life Recordings (mentioning)
“…Fuzzy expert systems (FESs) have been proved to be very successful in formalising the practical rules by the design of fuzzy sets to approximate human reasoning [11,53,54]. In this work, a FES based on CLIPS [Developed by NASA, distributed and supported by COSMIC.]…”
Section: Experiments 5: Evaluation Of Emotional Data (mentioning)
“…In fact, the tested signal should appear in the trained signals otherwise it cannot be addressed. Moreover, training makes the approach more time consuming which makes these methods difficult to use in real‐time applications [11, 12].…”
Single-channel speech separation (SCSS) is often required as post-processing in applications that facilitate automatic human-to-human or human-to-machine communication in challenging acoustic environments, such as voice command for smart homes or robotics. The proposed SCSS system, which the authors call phase-aware subspace decomposition (PASD), relies on subspace decomposition for speech separation followed by a phase-aware mask for final subspace recovery. The approach decomposes the mixture into sparse and low-rank subspaces in the frequency domain by rank minimisation, using an iterative decomposition with adaptive thresholding in each iteration to achieve soft estimation, and it takes phase information into account for reconstruction. Separation results are reported in terms of both intrusive and non-intrusive metrics using realistic recordings corrupted with real-life noises. As speech separation systems are expected to achieve maximal interference rejection without speech distortion, the system is also evaluated by recognising speech from a target speaker in the presence of either concurrent speech or noise. Recognition results show that the separated signals are highly intelligible, so they can be exploited by other automatic applications. The extensive evaluation under different test scenarios shows that PASD consistently improves the quality of the separated signals compared to other benchmark approaches.
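The abstract describes the decomposition only at a high level, so the following is a minimal sketch of a generic low-rank plus sparse split by alternating shrinkage, not the authors' PASD algorithm. The function names, the starting thresholds, and the geometric `decay` schedule (standing in for the "adaptive thresholding in each iteration") are all assumptions made for illustration. The phase-aware mask is not covered.

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink magnitudes toward zero by tau (real or complex input)."""
    mag = np.abs(x)
    scale = np.maximum(mag - tau, 0.0) / np.maximum(mag, 1e-12)
    return x * scale

def lowrank_sparse_split(M, n_iter=50, decay=0.9):
    """Split M into a low-rank part L and a sparse part S.

    Hypothetical illustration: thresholds start at half the largest
    singular value / entry and decay geometrically each iteration.
    """
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    tau_l = 0.5 * np.linalg.norm(M, 2)   # assumed initial singular-value threshold
    tau_s = 0.5 * np.abs(M).max()        # assumed initial entrywise threshold
    for _ in range(n_iter):
        # Low-rank update: shrink the singular values of the residual.
        U, s, Vh = np.linalg.svd(M - S, full_matrices=False)
        L = (U * soft_threshold(s, tau_l)) @ Vh
        # Sparse update: shrink the entries the low-rank part leaves behind.
        S = soft_threshold(M - L, tau_s)
        tau_l *= decay
        tau_s *= decay
    return L, S

# Synthetic stand-in for a magnitude spectrogram: rank-2 background plus a few spikes.
rng = np.random.default_rng(0)
M = rng.random((40, 2)) @ rng.random((2, 60))
M[rng.integers(0, 40, 30), rng.integers(0, 60, 30)] += 5.0
L, S = lowrank_sparse_split(M)
```

In a separation setting, `M` would typically be the magnitude of a short-time Fourier transform of the mixture. Which component carries the target speech depends on the interference, and the abstract does not say how PASD assigns them.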
“…It is anticipated to work over frames in all critical bands utilizing the threshold noise variance. Belhedi et al [62] used soft mask as a core in the proposed approach. The method produces two separate signals of dissimilar qualities and made them available in two separate channels.…”
Section: Signal Subspace-based Speech Enhancement Algorithms (mentioning)
Many forms of human communication exist, for instance text-based and nonverbal communication. Speech is, however, the most powerful and versatile form for humans. Speech signals enable humans to communicate, and this usefulness has led to a variety of speech processing applications. Successful use of these applications is, however, significantly hampered in the presence of background noise. These noise signals overlap with and mask the target speech signals. To deal with such overlapping background noise, a speech enhancement algorithm at the front end is crucial to make noisy speech intelligible and pleasant. Speech enhancement has been an important research and engineering problem for the last couple of decades. In this paper, we present a comprehensive survey of unsupervised single-channel speech enhancement (U-SCSE) algorithms. A taxonomy-based review of the U-SCSE algorithms is presented, and the associated studies on improving intelligibility and quality are outlined. Objective experiments have been performed to evaluate the potential of the U-SCSE algorithms for improving speech intelligibility and quality. It is found that unsupervised speech enhancement improves speech quality, but the improvement in speech intelligibility is limited. Finally, several open research problems that require further work are identified.