Unsupervised single channel speech separation based on optimized subspace separation

Wiem, Belhedi; Anouar, Ben Messaoud Mohamed; Mowlaee, Pejman; Bouzid, Aïcha

doi:10.1016/j.specom.2017.11.010

Cited by 11 publications

(10 citation statements)

References 28 publications


“…2 shows that the proposed approach improves the perceived speech quality of the separated signals compared to the competitive approaches. • STOI [41]: STOI is used in order to predict speech intelligibility in noisy conditions [11,42,44]. In fact, STOI is shown to have a good correlation coefficient with speech intelligibility (equal to 0.79), suggesting that it is a useful predictor of algorithm performance [46].…”

Section: Experiments 2: Evaluation Of the Whole Separation System (mentioning)

confidence: 99%

“…Also, it predicts sentence recognition [47] with clean phase, 2) conventional NMF [47], 3) NMF [47] with noisy phase, 4) the phase-aware approach in [34], and 5) the proposed approach [4], and 4) the proposed approach in noisy and challenging acoustic conditions. In addition, NISQI is proven to have the highest correlation score with subjective tests [11]. As they do not require the clean signal to assess the quality, NISQI and SII are appropriate to assess real-time separation performance.…”

Section: Experiments 3: Evaluation Of Real-life Recordings (mentioning)

confidence: 99%

“…Fuzzy expert systems (FESs) have been proved to be very successful in formalising the practical rules by the design of fuzzy sets to approximate human reasoning [11,53,54]. In this work, a FES based on CLIPS [Developed by NASA, distributed and supported by COSMIC.]…”

Section: Experiments 5: Evaluation Of Emotional Data (mentioning)

confidence: 99%

“…In fact, the tested signal should appear in the trained signals otherwise it cannot be addressed. Moreover, training makes the approach more time consuming which makes these methods difficult to use in real‐time applications [11, 12].…”

Section: Introduction (mentioning)

confidence: 99%


Phase‐aware subspace decomposition for single channel speech separation

Wiem

Anouar

Bouzid

2020

IET signal process.

Self Cite


Single channel speech separation (SCSS) is often required as post-processing in several applications that facilitate automatic human-to-human or human-to-machine communication in challenging acoustic environments, such as voice command for smart homes or robotics. The proposed SCSS system, which the authors call phase-aware subspace decomposition (PASD), relies on subspace decomposition for speech separation followed by a phase-aware mask for final subspace recovery. The proposed approach decomposes the mixture into a sparse and a low-rank subspace in the frequency domain by rank minimisation, which relies on iterative decomposition with adaptive thresholding in each iteration to achieve soft estimation, and considers phase information for reconstruction. Separation results are reported in terms of both intrusive and non-intrusive metrics using realistic recordings corrupted with real-life noises. As speech separation systems are expected to have maximal interference rejection without speech distortion, the proposed system is also evaluated by recognising speech from a target speaker in the presence of either concurrent speech or noise. Recognition results show that the separated signals are of high intelligibility, so they can be exploited by other automatic applications. The extensive evaluation under different test scenarios shows that PASD consistently improves the quality of the separated signals compared to other benchmark approaches.
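The sparse plus low-rank idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' PASD algorithm: it is a generic alternating-thresholding scheme in the spirit of robust PCA, applied to a precomputed real-valued magnitude matrix. The function names, the `rank_frac` and `sparse_frac` parameters, and the rule that ties each threshold to the current iterate are all assumptions made for illustration, and the phase-aware mask is not modelled.

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (entries below tau become 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def sparse_low_rank_split(M, n_iter=20, rank_frac=0.1, sparse_frac=0.5):
    """Approximate M as L + S with L low-rank and S sparse.

    Hypothetical illustration only: both thresholds are recomputed in each
    iteration from the current iterate, which is one simple way to make
    them adaptive. n_iter is assumed to be at least 1.
    """
    M = np.asarray(M, dtype=float)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # Low-rank step: shrink the singular values of the part not yet
        # explained by the sparse component.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        tau_l = rank_frac * s[0]          # relative to the largest singular value
        L = (U * soft_threshold(s, tau_l)) @ Vt
        # Sparse step: keep only the large entries of the remaining residual.
        R = M - L
        tau_s = sparse_frac * np.max(np.abs(R))
        S = soft_threshold(R, tau_s)
    return L, S
```

In a separation setting, `M` would typically be the magnitude of a short-time Fourier transform of the mixture, and the mixture phase (or a phase-aware estimate, as the abstract describes) would be needed to return to the time domain; that step is left out here.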


“…It is anticipated to work over frames in all critical bands utilizing the threshold noise variance. Belhedi et al [62] used soft mask as a core in the proposed approach. The method produces two separate signals of dissimilar qualities and made them available in two separate channels.…”

Section: Signal Subspace-based Speech Enhancement Algorithms (mentioning)

confidence: 99%

On Improvement of Speech Intelligibility and Quality: A Survey of Unsupervised Single Channel Speech Enhancement Algorithms

Verdú

Saleem

Khattak

2020

IJIMAI


Many forms of human communication exist, for instance text-based and nonverbal. Speech is, however, the most powerful and dexterous form for humans. Speech signals enable humans to communicate, and this usefulness has led to a variety of speech processing applications. Successful use of these applications is, however, significantly hampered in the presence of background noise distortions. These noise signals overlap and mask the target speech signals. To deal with these overlapping background noise distortions, a speech enhancement algorithm at the front end is crucial in order to make noisy speech intelligible and pleasant. Speech enhancement has been a very important research and engineering problem over the last couple of decades. In this paper, we present an all-inclusive survey on unsupervised single-channel speech enhancement (U-SCSE) algorithms. A taxonomy-based review of the U-SCSE algorithms is presented and the associated studies regarding improving the intelligibility and quality are outlined. The studies on speech enhancement algorithms from an unsupervised perspective are presented. Objective experiments have been performed to evaluate the potential of the U-SCSE algorithms in terms of improving speech intelligibility and quality. It is found that unsupervised speech enhancement improves the speech quality, but the improvement in speech intelligibility is lacking. To finish, several research problems are identified that require further research.
