2011
DOI: 10.1016/j.dsp.2010.07.003

A two level strategy for audio segmentation

Abstract: In this paper we deal with audio segmentation. The audio tracks are sampled into short sequences, which are classified into several classes. Every sequence can then be further analyzed depending on the class it belongs to. We first describe simple techniques for segmentation into two or three classes. These methods rely on amplitude, spectral or cepstral analysis, and classical Hidden Markov Models. From the limitations of these approaches, we propose a two level segmentation process. T…
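
As an illustration of the amplitude-based, two-class segmentation the abstract mentions, here is a minimal sketch. It is not the authors' method: the frame length, the RMS threshold, the class names ("silence" and "sound"), and all function names are assumptions chosen for the example.

```python
import math

def frame_rms(samples, frame_len):
    """Split the signal into non-overlapping frames and return each frame's RMS amplitude."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]

def classify_frames(samples, frame_len=4, threshold=0.1):
    """Label each frame with one of two hypothetical classes using an assumed RMS threshold."""
    return ["sound" if r >= threshold else "silence"
            for r in frame_rms(samples, frame_len)]

def merge_segments(labels):
    """Merge runs of identical labels into (label, first_frame, end_frame_exclusive) tuples."""
    segments = []
    for i, lab in enumerate(labels):
        if segments and segments[-1][0] == lab:
            segments[-1] = (lab, segments[-1][1], i + 1)
        else:
            segments.append((lab, i, i + 1))
    return segments

# Toy signal: two silent frames, two loud frames, one silent frame.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 4
segments = merge_segments(classify_frames(signal))
```

Each resulting segment could then be passed to a class-specific analysis, which is the general idea behind classifying short sequences before analyzing them further.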

Cited by: 15 publications (10 citation statements)
References: 15 publications
“…This task occurs in many disciplines, for example in finding repeated animal behavior (Mueen, Keogh, Zhu, Cash, & Westover, 2013), finding regulatory elements in DNA (Das & Dai, 2007), and finding patterns in EEG signals (Castro & Azevedo, 2010). Another application area of our software is segmentation and clustering of audio datasets (Siegler, Jain, Raj, & Stern, 1997; Lefèvre & Vincent, 2011; Kamper, Livescu, & Goldwater, 2017).…”
Section: Application Areas; citation type: mentioning
confidence: 99%
“…Initial classification was performed by using the k-means classifier and segment-related features. Final classification was done by using the Multidimensional Hidden Markov Models and frame-related features [11].…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…The latter evaluation showed that the overlapping segments accounted for more than 70% of errors produced by every submitted system. Despite the interest shown in mixed sound detection in recent years [11][12][13], it still remains a challenging problem.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…The most common procedure for doing that is by Viterbi decoding, i.e., using a dynamic programming algorithm to find in a recursive manner the most probable sequence of HMM states. The HMM-based audio segmentation approach borrowed from speech/speaker recognition applications has been successfully applied in [4,10,13] and many other studies.…”
Section: Introduction; citation type: mentioning
confidence: 99%
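
The Viterbi decoding described in the last citation statement can be sketched as follows. This is a minimal illustration on a hypothetical two-state discrete HMM, not the model used in the citing paper or in the cited work; the state meanings, the probabilities, and the function name are all assumptions.

```python
import math

def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most probable HMM state sequence for `obs`, computed in log space."""
    n_states = len(start_p)
    # delta[s]: best log-probability of any path that ends in state s at the current step
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []  # backptr[t][s]: best predecessor of state s at step t + 1
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: delta[r] + _log(trans_p[r][s]))
            ptr.append(best_prev)
            new_delta.append(delta[best_prev] + _log(trans_p[best_prev][s])
                             + _log(emit_p[s][o]))
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical model: state 0 and state 1 could stand for two audio classes,
# and observations are frame features quantized to the symbols 0 and 1.
start_p = [0.5, 0.5]
trans_p = [[0.9, 0.1], [0.1, 0.9]]   # states tend to persist across frames
emit_p = [[0.8, 0.2], [0.2, 0.8]]    # each state mostly emits its own symbol
path = viterbi([0, 0, 1, 0, 0, 1, 1, 1], start_p, trans_p, emit_p)
```

Because the assumed transition matrix favors staying in the same state, the isolated symbol 1 at the third frame is absorbed into the surrounding state-0 segment, which is the smoothing effect that makes HMM decoding attractive for segmentation.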