The MGB challenge: Evaluating multi-genre broadcast media recognition

Bell, P. J.; Gales, Mark J. F.; Hain, Thomas; Kilgour, Jonathan; Lanchantin, Pierre; Liu, X; McParland, Andrew; Renals, Steve; Saz, Óscar; Wester, Mirjam; Woodland, Philip C.

doi:10.1109/asru.2015.7404863

Cited by 125 publications (158 citation statements)

References 22 publications (17 reference statements)

Citation statement types: Supporting, Mentioning (151), Contrasting


“…The ASR system we chose was evaluated using a large multi-genre television dataset (Bell et al, 2015). It had an overall word error rate of 47%, however for news content, which is clearly spoken by a native speaker, this dropped to 16%.…”

Section: Methods (mentioning)

confidence: 99%

A Contextual Study of Semantic Speech Editing in Radio Production

Baume, Plumbley, Calic, et al. 2018

International Journal of Human-Computer Studies


Abstract: Radio production involves editing speech-based audio using tools that represent sound using simple waveforms. Semantic speech editing systems allow users to edit audio using an automatically generated transcript, which has the potential to improve the production workflow. To investigate this, we developed a semantic audio editor based on a pilot study. Through a contextual qualitative study of five professional radio producers at the BBC, we examined the existing radio production process and evaluated our semantic editor by using it to create programmes that were later broadcast. We observed that the participants in our study wrote detailed notes about their recordings and used annotation to mark which parts they wanted to use. They collaborated closely with the presenter of their programme to structure the contents and write narrative elements. Participants reported that they often work away from the office to avoid distractions, and print transcripts so they can work away from screens. They also emphasised that listening is an important part of production, to ensure high sound quality. We found that semantic speech editing with automated speech recognition can be used to improve the radio production workflow, but that annotation, collaboration, portability and listening were not well supported by current semantic speech editing systems. In this paper, we make recommendations on how future semantic speech editing systems can better support the requirements of radio production.


“…Research in this field has made great progress thanks to real speech corpora collected for various application scenarios such as voice command for cars (Hansen et al, 2001), smart homes (Ravanelli et al, 2015), or tablets (Barker et al, 2015), and automatic transcription of lectures (Lamel et al, 1994), meetings (Renals et al, 2008), conversations (Harper, 2015), dialogues (Stupakov et al, 2011), game sessions (Fox et al, 2013), or broadcast media (Bell et al, 2015). In most corpora, the training speakers differ from the test speakers.…”

Section: Introduction (mentioning)

confidence: 99%

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

Vincent, Watanabe, Nugraha, et al. 2017

Computer Speech & Language


“…To train and evaluate the effectiveness of our proposed approach, we conducted experiments on a recent and very challenging dataset from the Multi-Genre Broadcast (MGB) Challenge [18]. The MGB data is a large broad and multigenre, spanning the whole range of TV output.…”

Section: Experimental Setup, 4.1 Data and ASR System (mentioning)

confidence: 99%

Automatic speech recognition errors detection using supervised learning techniques

Errattahi, Ouahmane, Hain, 2016

2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA)


Abstract: Over the last years, many advances have been made in the field of Automatic Speech Recognition (ASR). However, the persistent presence of ASR errors is limiting the widespread adoption of speech technology in real life applications. This motivates the attempts to find alternative techniques to automatically detect and correct ASR errors, which can be very effective and especially when the user does not have access to tune the features, the models or the decoder of the ASR system or when the transcription serves as input to downstream systems like machine translation, information retrieval, and question answering. In this paper, we present an ASR errors detection system targeted towards substitution and insertion errors. The proposed system is based on supervised learning techniques and uses input features deducted only from the ASR output words and hence should be usable with any ASR system. Applying this system on TV program transcription data leads to identify 40.30% of the recognition errors generated by the ASR system.
