Andrew McParland scite author profile

This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and "lightly supervised" alignment of BBC TV recordings. The challenge training data covered the whole range of seven weeks BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In addition several hundred million words of BBC subtitle text was provided for language modelling. A novel aspect of the evaluation was the exploration of speech recognition and speaker diarization in a longitudinal setting -i.e. recognition of several episodes of the same show, and speaker diarization across these episodes, linking speakers. The longitudinal tasks also offered the opportunity for systems to make use of supplied metadata including show title, genre tag, and date/time of transmission. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.

show abstract

BBC-Oxford British Sign Language Dataset

Albanie¹,

Varol²,

Momeni³

et al. 2021

Preprint

View full text Add to dashboard Cite

Personalization of broadcast programs using synchronized internet content

Matsumura

Evans²,

Shishikui

et al. 2010

View full text Add to dashboard Cite

Using the Past to Explain the Present: Interlinking Current Affairs with Archives via the Semantic Web

Raimond¹,

Smethurst²,

McParland³

et al. 2013

View full text Add to dashboard Cite

The BBC has a very large archive of programmes, covering a wide range of topics. This archive holds a significant part of the BBC's institutional memory and is an important part of the cultural history of the United Kingdom and the rest of the world. These programmes, or parts of them, can help provide valuable context and background for current news events. However the BBC's archive catalogue is not a complete record of everything that was ever broadcast. For example, it excludes the BBC World Service, which has been broadcasting since 1932. This makes the discovery of content within these parts of the archive very difficult. In this paper we describe a system based on Semantic Web technologies which helps us to quickly locate content related to current news events within those parts of the BBC's archive with little or no pre-existing metadata. This system is driven by automated interlinking of archive content with the Semantic Web, user validations of the resulting data and topic extraction from live BBC News subtitles. The resulting interlinks between live news subtitles and the BBC's archive are used in a dynamic visualisation enabling users to quickly locate relevant content. This content can then be used by journalists and editors to provide historical context, background information and supporting content around current affairs.

show abstract

Speech-to-Text for Broadcasters, From Research to Implementation

Haynes¹,

Norton²,

McParland³

et al. 2018

SMPTE Mot. Imag. J

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.