Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion

Butko, Taras; Nadeu, Climent

doi:10.1186/1687-4722-2011-1

Cited by 30 publications

(31 citation statements)

References 12 publications


“…Using this database an audio segmentation task was proposed, where the systems were required to identify the presence of speech, music and/or noise, either isolated or overlapped. The Albayzín-2014 Audio Segmentation Evaluation contributed to the evolution of the audio segmentation technology in broadcast news domains by providing a more general and realistic database, compared to those used in the Albayzín-2010 and -2012 Audio Segmentation Evaluations [10,30]. The main features of the approaches and the results attained by seven segmentation systems from four different research groups have been presented and briefly analyzed.…”

Section: Discussion (mentioning)

confidence: 99%

“…However, some classes are better described by the statistics computed over longer periods of time (from 0.5 to 5 s long). These characteristics are referred in the literature as segment-based features [29,30]. For example, in [31], a content-based speech discrimination algorithm is designed to exploit the long-term information inherent in the modulation spectrum; and in [32], authors propose two segment-based features: the variance of the spectrum flux (VSF) and the variance of the zero crossing rate (VZCR).…”

Section: General Description Of Audio Segmentation Systems (mentioning)

confidence: 99%
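The segment-based features named in the statement above (VZCR and VSF) can be illustrated with a minimal sketch. The frame length, hop size, the Euclidean definition of spectrum flux, and the use of population variance are all assumptions made here for illustration; they are not taken from [32], whose exact definitions may differ. The function names are hypothetical.

```python
import cmath
import math
from statistics import pvariance


def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2); adequate for a small demo)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_flux(prev_mag, cur_mag):
    """Euclidean distance between two consecutive magnitude spectra (assumed definition)."""
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev_mag, cur_mag)))


def segment_features(signal, frame_len=64, hop=32):
    """Return (VZCR, VSF): variances of frame-level ZCR and flux over one segment."""
    fr = frames(signal, frame_len, hop)
    zcr = [zero_crossing_rate(f) for f in fr]
    mags = [magnitude_spectrum(f) for f in fr]
    flux = [spectral_flux(a, b) for a, b in zip(mags, mags[1:])]
    return pvariance(zcr), pvariance(flux)


# A stationary tone versus a segment whose content changes halfway through.
steady = [math.sin(2 * math.pi * (i + 0.5) / 16) for i in range(512)]
mixed = steady[:256] + [math.sin(2 * math.pi * (i + 0.5) / 4) for i in range(256)]
vzcr_steady, vsf_steady = segment_features(steady)
vzcr_mixed, vsf_mixed = segment_features(mixed)
```

In this toy case the stationary tone yields variances near zero, while the segment that changes character has clearly larger values for both features, which is the kind of long-term behaviour that frame-level statistics alone would not capture.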


Albayzín-2014 evaluation: audio segmentation and classification in broadcast news domains

Castán, Tavarez, López-Otero, et al. 2015

J AUDIO SPEECH MUSIC PROC.


Audio segmentation is important as a pre-processing task to improve the performance of many speech technology tasks and, therefore, it has an undoubted research interest. This paper describes the database, the metric, the systems and the results for the Albayzín-2014 audio segmentation campaign. In contrast to previous evaluations where the task was the segmentation of non-overlapping classes, Albayzín-2014 evaluation proposes the delimitation of the presence of speech, music and/or noise that can be found simultaneously. The database used in the evaluation was created by fusing different media and noises in order to increase the difficulty of the task. Seven segmentation systems from four different research groups were evaluated and combined. Their experimental results were analyzed and compared with the aim of providing a benchmark and showing up the promising directions in this field.


“…However, there is an important amount of long segments (longer than 60 s). More details about the database and the labeling process can be found in [19].…”

Section: Database (mentioning)

confidence: 99%

“…A complete description of the Albayzin 2010 audio segmentation and classification evaluation can be found in [19] where the participant's approaches and the results are presented. We describe the database and the metric used in the evaluation in the next subsections.…”

Section: Albayzin Audio Segmentation Evaluations and Database Descrip (mentioning)

confidence: 99%

“…However, some classes are better described by the statistics computed over consecutive frames (from 0.5 to 5 s long). These characteristics are referred in the literature as segment-based features [18,19]. For example, in [20], a content-based speech discrimination algorithm is designed to exploit the long-term information inherent in the modulation spectrum.…”

Section: Introduction (mentioning)

confidence: 99%


Audio segmentation-by-classification approach based on factor analysis in broadcast news domain

Castán, Giménez, Miguel, et al. 2014

J AUDIO SPEECH MUSIC PROC.


This paper studies a novel audio segmentation-by-classification approach based on factor analysis. The proposed technique compensates the within-class variability by using class-dependent factor loading matrices and obtains the scores by computing the log-likelihood ratio for the class model to a non-class model over fixed-length windows. Afterwards, these scores are smoothed to yield longer contiguous segments of the same class by means of different back-end systems. Unlike previous solutions, our proposal does not make use of specific acoustic features and does not need a hierarchical structure. The proposed method is applied to segment and classify audios coming from TV shows into five different acoustic classes: speech, music, speech with music, speech with noise, and others. The technique is compared to a hierarchical system with specific acoustic features achieving a significant error reduction.
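The pipeline outlined in this abstract (per-window class scores, smoothing, then longer contiguous segments) can be sketched roughly as follows. This is only an illustration: the moving-average smoother is a simple stand-in for the back-end systems the abstract mentions without detail, the scores are made-up numbers rather than real log-likelihood ratios, and the function names, `radius`, and `window_s` are hypothetical.

```python
def smooth_scores(scores, radius=2):
    """Moving average of each class score over a centered neighborhood of windows."""
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - radius), min(len(scores), i + radius + 1)
        out.append({c: sum(s[c] for s in scores[lo:hi]) / (hi - lo)
                    for c in scores[i]})
    return out


def classify_windows(scores):
    """Pick, for each window, the class with the highest score."""
    return [max(s, key=s.get) for s in scores]


def to_segments(labels, window_s=1.0):
    """Merge runs of equal labels into (start_s, end_s, label) tuples."""
    segments = []
    for i, lab in enumerate(labels):
        if segments and segments[-1][2] == lab:
            # Extend the current segment to the end of this window.
            segments[-1] = (segments[-1][0], (i + 1) * window_s, lab)
        else:
            segments.append((i * window_s, (i + 1) * window_s, lab))
    return segments


# Toy scores: mostly speech, one spurious music window, then a music section.
speech = {"speech": 1.0, "music": -1.0}
music = {"speech": -1.0, "music": 1.0}
raw = [speech] * 5 + [music] + [speech] * 4 + [music] * 6

raw_segments = to_segments(classify_windows(raw))
smoothed_segments = to_segments(classify_windows(smooth_scores(raw)))
```

Without smoothing the isolated music window splits the speech region into separate segments; after smoothing it is absorbed, leaving one speech segment followed by one music segment.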


Speech Music Overlap Detection Using Spectral Peak Evolutions

Bhattacharjee, Prasanna, Guha 2022

Lecture Notes in Computer Science
