2020
DOI: 10.3390/e22020183

Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD

Abstract: This paper proposes a robust approach to audio content classification (ACC), aimed especially at variable noise-level conditions. Speech, music, and background noise (also called silence) are usually mixed in a noisy audio signal. Based on this observation, we propose a hierarchical ACC approach consisting of three parts: voice activity detection (VAD), speech/music discrimination (SMD), and post-processing. First, entropy-based VAD is successfully used to segment input signal in…
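The abstract is truncated here, so the paper's actual VAD formulation is not available in this text. As a rough illustration of the general idea behind entropy-based VAD, the sketch below computes a per-frame spectral entropy and marks low-entropy frames as active. The function names, the frame length, the hop size, and the midpoint threshold rule are all assumptions made for illustration; this is not the authors' method.

```python
import numpy as np

def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of the normalized power spectrum of one frame."""
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    p = power / (power.sum() + eps)          # treat the spectrum as a probability mass
    return float(-(p * np.log2(p + eps)).sum())

def entropy_vad(signal, frame_len=512, hop=256, threshold=None):
    """Return (active, entropy) per frame.

    Assumption: noise-like frames have a flat spectrum (high entropy), while
    speech or music frames concentrate energy in few bins (low entropy).
    The default threshold is the midpoint of the observed entropy range,
    a hypothetical rule chosen only to keep the sketch self-contained.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    entropy = np.array([
        spectral_entropy(signal[i * hop : i * hop + frame_len])
        for i in range(n_frames)
    ])
    if threshold is None:
        threshold = 0.5 * (entropy.min() + entropy.max())
    return entropy < threshold, entropy
```

A real system would typically smooth the frame decisions and adapt the threshold to the noise level; in the hierarchy the abstract describes, the segments detected by VAD would then be passed on to speech/music discrimination and post-processing.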

Citations: cited by 7 publications (4 citation statements)
References: 68 publications
“…Nanni suggested a general architecture for context-aware recommendation system [17]. Wang suggested introducing wavelet transform into feature engineering and using neural network model in classification method [18]. López proposed a new language model and suggested the usefulness of clustering using tags and audio content.…”
Section: Related Work (mentioning)
confidence: 99%
“…Thus, segmentation improves classification. A general audio classification scheme to segment an arbitrary audio clip is presented in [1]. It achieves good accuracy rate of 96%.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…Hence, are considered for experimentation. Research by Wang [1], music is separated in three categories. First is popular music domain.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…Voice activity detection (VAD) is a technique for detecting the presence of speech signal in speech data [22]. It has been widely used to enhance the speech contents such as speech classification [23], speaker recognition [24], and speech enhancement [25,26]. Figure 4 shows three processing steps for VAD: (1) noise reduction, (2) segmentation, and (3) elimination [27].…”
Section: Voice Activity Detection (mentioning)
confidence: 99%