2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009
DOI: 10.1109/icassp.2009.4960554

Audio segmentation for speech recognition using segment features

Abstract: Audio segmentation is an essential preprocessing step in several audio processing applications, with a significant impact on, e.g., speech recognition performance. We introduce a novel framework which combines the advantages of different well-known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream in a holistic way by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison to other segment…
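
The abstract contrasts decoding a whole segmentation at once with classifying change points locally. A minimal sketch of that idea, not the authors' implementation, is a dynamic program that picks the set of boundaries maximizing the sum of per-segment scores. In a log-linear model the per-segment score would be a weighted sum of segment features. Here `segment_score`, `make_toy_score`, the penalty value, and the `max_len` limit are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def map_segmentation(
    n_frames: int,
    segment_score: Callable[[int, int], float],
    max_len: int,
) -> Tuple[float, List[int]]:
    """Find the segmentation of frames [0, n_frames) with the highest total score.

    segment_score(start, end) scores the segment covering frames start..end-1.
    Returns the best total score and the exclusive end index of each segment.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n_frames + 1)  # best[t]: best score for frames [0, t)
    back = [0] * (n_frames + 1)        # back[t]: start of the last segment ending at t
    best[0] = 0.0

    for end in range(1, n_frames + 1):
        # Consider every admissible start for a segment that ends at `end`.
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + segment_score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Follow the back-pointers from the end of the stream to recover boundaries.
    boundaries: List[int] = []
    t = n_frames
    while t > 0:
        boundaries.append(t)
        t = back[t]
    boundaries.reverse()
    return best[n_frames], boundaries


def make_toy_score(signal: Sequence[float], penalty: float = 1.0):
    """Hypothetical scorer: reward homogeneous segments, charge a fixed cost per segment."""

    def score(start: int, end: int) -> float:
        seg = signal[start:end]
        mean = sum(seg) / len(seg)
        sse = sum((x - mean) ** 2 for x in seg)
        return -sse - penalty

    return score


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
    total, bounds = map_segmentation(len(signal), make_toy_score(signal), max_len=6)
    print(total, bounds)
```

Because every boundary is chosen jointly, a locally plausible change point is kept only if it improves the score of the segmentation as a whole. The per-segment penalty in the toy scorer plays the role of a prior against over-segmentation.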

Cited by 41 publications (25 citation statements)
References 7 publications
“…Change point detection methods are applied here for audio segmentation and recognizing boundaries between silence, sentences, words, and noise [13][14]. …”
Section: Introduction (mentioning)
confidence: 99%
“…We search through all possible locations and predict the one with the highest score. In this example the score is calculated for timing sequence is (1,4,6).…”
Section: Model Description (mentioning)
confidence: 99%
“…Phoneme Boundary Detection or Phoneme Segmentation plays an essential first step for a variety of speech processing applications such as speaker diarization [1], speech science [2,3], keyword spotting [4], Automatic Speech Recognition [5,6], etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although audio analysis has been widely studied in scene classification [8,9,10], audio segmentation [11,12,13], and audio retrieval [14,15,16], to our knowledge, automatic audio tagging has not been much explored. Bertin-Mahieux et al [17] treated audio tag prediction as a set of binary classification problems and applied the Adaboost algorithm to the task.…”
Section: Introduction (mentioning)
confidence: 99%