Proceedings of the 21st International Conference on World Wide Web 2012
DOI: 10.1145/2187980.2188060

Automated semantic tagging of speech audio

Abstract: The BBC is currently tagging programmes manually, using DBpedia as a source of tag identifiers, and a list of suggested tags extracted from the programme synopsis. These tags are then used to help navigation and topic-based search of programmes on the BBC website. However, given the very large number of programmes available in the archive, most of them having very little metadata attached to them, we need a way to automatically assign tags to programmes. We describe a framework to do so, using speech recogniti…

Cited by 7 publications (2 citation statements)
References 8 publications
“…General broadcast data is recorded in diverse environments, includes dramas with highly-emotional speech, and often has overlaid background music or sound effects: word error rates (WERs) on such data are several times higher than for broadcast news and very variable across different genres. Work in this area has included automatic transcription of podcasts and other web audio [1], automatic transcription of Youtube [2,3], the MediaEval speech retrieval evaluation which used blip.tv semi-professional user created content [4], the automatic tagging of a large radio archive [5], and automatic transcription of multi-genre media archive data [6]. Recently, systems were developed for the 2015 Multi-Genre Broadcast (MGB) challenge [7][8][9][10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Recent work which has focused on the automatic transcription or indexing of multi-genre broadcast data has included work on the automatic transcription of podcasts and other web audio [1], automatic transcription of YouTube [2,3], the MediaEval rich speech retrieval evaluation which used blip.tv semi-professional user created content [4], and the automatic tagging of a large radio archive [5]. This paper concerns the automatic transcription of multigenre content from the BBC archive.…”
Section: Introduction (mentioning)
confidence: 99%