Speaker diarization of broadcast news in Albayzin 2010 evaluation campaign

Zelenák, Martin; Schulz, Henrik; Hernando, Javier

doi:10.1186/1687-4722-2012-19

Cited by 29 publications

(23 citation statements)

References 15 publications

(17 reference statements)


“…These evaluation campaigns provide an objective mechanism to compare different systems and are a powerful way to promote research on different speech technologies [56][57][58][59][60][61][62][63].…”

Section: Motivation and Organization Of This Paper (mentioning)

confidence: 99%

ALBAYZIN 2016 spoken term detection evaluation: an international open competitive evaluation in Spanish

Tejedor, Toledano, López-Otero, et al. 2017

J AUDIO SPEECH MUSIC PROC.


Within search-on-speech, Spoken Term Detection (STD) aims to retrieve data from a speech repository given a textual representation of a search term. This paper presents an international open evaluation for search-on-speech based on STD in Spanish and an analysis of the results. The evaluation has been designed carefully so that several analyses of the main results can be carried out. The evaluation consists in retrieving the speech files that contain the search terms, providing their start and end times, and a score value that reflects the confidence given to the detection. Two different Spanish speech databases have been employed in the evaluation: MAVIR database, which comprises a set of talks from workshops, and EPIC database, which comprises a set of European Parliament sessions in Spanish. We present the evaluation itself, both databases, the evaluation metric, the systems submitted to the evaluation, the results, and a detailed discussion. Five different research groups took part in the evaluation, and ten different systems were submitted in total. We compare the systems submitted to the evaluation and make a deep analysis based on some search term properties (term length, within-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and native (Spanish)/foreign terms).


“…We created an open recipe to enable participants to build a baseline ASR system using the Kaldi toolkit [17], as well as XMLStar- 3 , and the SRILM 4 and IRSTLM 5 toolkits. This baseline system simplified and automated the data pre-processing tasks, thus allowing participants to focus on more advanced aspects of ASR model building.…”

Section: Baseline System (mentioning)

confidence: 99%

“…There have been evaluations of, and corpora for, the rich transcription and diarization of broadcast speech since the mid-1990s [1,2,3,4,5], but all have been limited domain - typically broadcast news. The MediaEval evaluation of multimodal search and hyperlinking [6] used, but did not evaluate, automatic transcriptions of multi-genre broadcast data (in fact the same acoustic data used in the MGB challenge).…”

Section: Introduction (mentioning)

confidence: 99%

The MGB challenge: Evaluating multi-genre broadcast media recognition

Bell, Gales, Hain, et al. 2015

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)


This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and "lightly supervised" alignment of BBC TV recordings. The challenge training data covered the whole range of seven weeks BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In addition several hundred million words of BBC subtitle text was provided for language modelling. A novel aspect of the evaluation was the exploration of speech recognition and speaker diarization in a longitudinal setting - i.e. recognition of several episodes of the same show, and speaker diarization across these episodes, linking speakers. The longitudinal tasks also offered the opportunity for systems to make use of supplied metadata including show title, genre tag, and date/time of transmission. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.


“…This campaign is an internationally open set of evaluations supported by the Spanish Network of Speech Technologies (RTTH [32]) and the ISCA Special Interest Group on Iberian Languages (SIG-IL [33]), which have been held every 2 years since 2006. The evaluation campaigns provide an objective mechanism to compare different systems and are a powerful way to promote research on different speech technologies (e.g., speech segmentation [34], speaker diarization [35], language recognition [36], query-by-example spoken term detection [37], and speech synthesis [38] in the ALBAYZIN 2010 and 2012 evaluation campaigns). This year, this campaign has been held during the IberSPEECH 2014 conference [39].…”

Section: Introduction (mentioning)

confidence: 99%

Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

Tejedor, Toledano, López-Otero, et al. 2015

J AUDIO SPEECH MUSIC PROC.


Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the STD ALBAYZIN 2014 evaluation, held as a part of the ALBAYZIN 2014 evaluation campaign within the context of the IberSPEECH 2014 conference. This is the first STD evaluation that deals with Spanish language. The evaluation consists of retrieving the speech files that contain the search terms, indicating their start and end times within the appropriate speech file, along with a score value that reflects the confidence given to the detection of the search term. The evaluation is conducted on a Spanish spontaneous speech database, which comprises a set of talks from workshops and amounts to about 7 h of speech. We present the database, the evaluation metrics, the systems submitted to the evaluation, the results, and a detailed discussion. Four different research groups took part in the evaluation. Evaluation results show reasonable performance for moderate out-of-vocabulary term rate. This paper compares the systems submitted to the evaluation and makes a deep analysis based on some search term properties (term length, in-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and in-language/foreign terms).

