Bootstrap Averaging for Model-Based Source Separation in Reverberant Conditions

Chandna, Swati; Wang, Wenwu

doi:10.1109/taslp.2018.2797425

Cited by 3 publications

(3 citation statements)

References 37 publications


“…Given a Gaussian-Mixture model, the goal is to maximize the likelihood function with respect to the parameters. An elegant powerful method for finding the maximum likelihood solution for models with latent variables is called the Expectation-Maximization algorithm, or EM algorithm [7]. Some initial magnitudes for means, covariances, and mixing coefficients are selected by us.…”

Section: Our Proposed Methods Based On Scene Recognition and Semant… (mentioning)

confidence: 99%

“…Unhealthy sitting posture not only increases the risk of occupational musculoskeletal disease, i.e., lumbar disease and cervical disease but it is closely related to the incidence of myopia. According to the study by the National Institute for Occupational Safety and Health (NIOSH) on musculoskeletal disease and occupational factors, unhealthy sitting postures caused by incorrect postures of the trunk and neck are closely related to human skeletal diseases [ 6 , 7 ]. Lis et al [ 8 ] found that working in an unhealthy sitting posture for more than five hours would increase the probability of contracting backache and sciatica.…”

Section: Introduction (mentioning)

confidence: 99%


A Scene Recognition and Semantic Analysis Approach to Unhealthy Sitting Posture Detection during Screen-Reading

Min, Cui, Han et al. 2018, Sensors


Behavior analysis through posture recognition is an essential research in robotic systems. Sitting with unhealthy sitting posture for a long time seriously harms human health and may even lead to lumbar disease, cervical disease and myopia. Automatic vision-based detection of unhealthy sitting posture, as an example of posture detection in robotic systems, has become a hot research topic. However, the existing methods only focus on extracting features of human themselves and lack understanding relevancies among objects in the scene, and henceforth fail to recognize some types of unhealthy sitting postures in complicated environments. To alleviate these problems, a scene recognition and semantic analysis approach to unhealthy sitting posture detection in screen-reading is proposed in this paper. The key skeletal points of human body are detected and tracked with a Microsoft Kinect sensor. Meanwhile, a deep learning method, Faster R-CNN, is used in the scene recognition of our method to accurately detect objects and extract relevant features. Then our method performs semantic analysis through Gaussian-Mixture behavioral clustering for scene understanding. The relevant features in the scene and the skeletal features extracted from human are fused into the semantic features to discriminate various types of sitting postures. Experimental results demonstrated that our method accurately and effectively detected various types of unhealthy sitting postures in screen-reading and avoided error detection in complicated environments. Compared with the existing methods, our proposed method detected more types of unhealthy sitting postures including those that the existing methods could not detect. Our method can be potentially applied and integrated as a medical assistance in robotic systems of health care and treatment.
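The first citation statement above describes fitting a Gaussian mixture by EM, starting from hand-picked means, covariances, and mixing coefficients. A minimal sketch of that loop follows. It is not the cited authors' implementation: it assumes one-dimensional data (so each covariance reduces to a scalar variance), and the names `gaussian_pdf`, `em_gmm_1d`, and `log_likelihood`, as well as the toy data, are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of the univariate normal N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, means, variances, weights):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for m, v, w in zip(means, variances, weights)))
        for x in data
    )


def em_gmm_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Run EM on a 1-D Gaussian mixture from user-chosen initial parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # floor avoids a collapsed component
            weights[j] = nj / n
    return means, variances, weights


# Toy data (assumed for illustration): two clusters near 0 and 10.
data = [-0.5, 0.0, 0.5, 0.2, -0.2, 9.5, 10.0, 10.5, 10.2, 9.8]
init = ([1.0, 8.0], [4.0, 4.0], [0.5, 0.5])  # initial values chosen by hand
fitted = em_gmm_1d(data, *init)
```

Each iteration cannot decrease the likelihood, so comparing `log_likelihood(data, *init)` with `log_likelihood(data, *fitted)` is a quick sanity check. A multivariate version would replace the scalar variances with full covariance matrices.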


“…Considering the fact that the prior information of speech and noise can improve speech quality, our former works [26,27] have shown an effectiveness of using binaural inter-channel cues between speech and noise to enhance speech. In previous studies based on the cue parameter [28–39], the binaural inter-channel cues [28–37] have been used to estimate ideal T-F mask in binaural computational auditory scene analysis (CASA) systems and have shown a good performance in binaural speech processing. In the BCC technique [40–42], the binaural inter-channel cues were viewed as the side information, which was combined with a down-mixed audio signal to recover the left channel and right channel audio signals.…”

Section: Introduction (mentioning)

confidence: 99%

Speech enhancement methods based on binaural cue coding

Wang, Bao 2019, J AUDIO SPEECH MUSIC PROC.


According to the encoding and decoding mechanism of binaural cue coding (BCC), in this paper, the speech and noise are considered as left channel signal and right channel signal of the BCC framework, respectively. Subsequently, the speech signal is estimated from noisy speech when the inter-channel level difference (ICLD) and inter-channel correlation (ICC) between speech and noise are given. In this paper, exact inter-channel cues and the pre-enhanced inter-channel cues are used for speech restoration. The exact inter-channel cues are extracted from clean speech and noise, and the pre-enhanced inter-channel cues are extracted from the pre-enhanced speech and estimated noise. After that, they are combined one by one to form a codebook. Once the pre-enhanced cues are extracted from noisy speech, the exact cues are estimated by a mapping between the pre-enhanced cues and a prior codebook. Next, the estimated exact cues are used to obtain a time-frequency (T-F) mask for enhancing noisy speech based on the decoding of BCC. In addition, in order to further improve accuracy of the T-F mask based on the inter-channel cues, the deep neural network (DNN)-based method is proposed to learn the mapping relationship between input features of noisy speech and the T-F masks. Experimental results show that the codebook-driven method can achieve better performance than conventional methods, and the DNN-based method performs better than the codebook-driven method.
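To make the link between an inter-channel level difference and a T-F mask concrete, the sketch below treats speech and noise power in one T-F bin as the two "channels", computes their level difference in dB, and converts it to a Wiener-style power-ratio gain. This is an illustrative assumption, not the BCC decoding used in the paper: it ignores the ICC cue entirely, and the function names `icld_db` and `mask_from_icld` are hypothetical.

```python
import math


def icld_db(speech_power, noise_power, eps=1e-12):
    """Level difference in dB between speech and noise power in one T-F bin."""
    return 10.0 * math.log10((speech_power + eps) / (noise_power + eps))


def mask_from_icld(icld):
    """Wiener-style gain S / (S + N), written in terms of the level difference.

    With icld = 10*log10(S/N), the ratio S / (S + N) equals
    1 / (1 + 10**(-icld / 10)).
    """
    return 1.0 / (1.0 + 10.0 ** (-icld / 10.0))


# Example bin: speech power four times the noise power (about 6 dB).
cue = icld_db(4.0, 1.0)
gain = mask_from_icld(cue)  # close to 4 / (4 + 1) = 0.8
```

A level difference of 0 dB gives a gain of 0.5, and the gain approaches 1 as speech dominates the bin. In the setting the abstract describes, the exact cues are not available at test time and would have to be estimated, for example via the codebook or DNN mapping it mentions.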


SitR: Sitting Posture Recognition Using RF Signals

Lin, Liu et al. 2020, IEEE Internet Things J.
