2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 2007
DOI: 10.1109/icassp.2007.367171

Castsearch - Context Based Spoken Document Retrieval

Abstract: The paper describes our work on the development of a system for retrieval of relevant stories from broadcast news. The system utilizes a combination of audio processing and text mining. The audio processing consists of a segmentation step that partitions the audio into speech and music. The speech is further segmented into speaker segments and then transcribed using an automatic speech recognition system, to yield text input for clustering using non-negative matrix factorization (NMF). We find semantic topics …
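
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain term-by-document count matrix and the standard Lee-Seung multiplicative updates for the squared-error objective; the abstract does not say which NMF variant, term weighting, or number of topics the authors used, so the function names (`nmf`, `assign_topics`), the toy vocabulary, and the counts below are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, k, iterations=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix v (terms x docs) as w @ h.

    Uses Lee-Seung multiplicative updates (an assumption; the paper's
    exact NMF algorithm is not given in the abstract).
    """
    rng = random.Random(seed)
    n_terms, n_docs = len(v), len(v[0])
    # Strictly positive random initialisation keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_terms)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_docs)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_terms)]
    return w, h


def assign_topics(h):
    """Assign each document (a column of h) to its highest-weighted topic."""
    return [max(range(len(h)), key=lambda t: h[t][d]) for d in range(len(h[0]))]


# Hypothetical toy data: rows are terms, columns are transcribed stories.
terms = ["election", "vote", "goal", "match"]
v = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
w, h = nmf(v, k=2)
labels = assign_topics(h)
```

Here the columns of `w` play the role of topics (weights over terms) and the columns of `h` give each story's loading on those topics; taking the largest loading is one simple way to turn the factorization into a hard clustering.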

Cited by 8 publications (5 citation statements)
References 6 publications
“…Retrieval of broadcast news using ASR transcripts approaches the performance of text-based retrieval (Garofolo et al, 2000; Koumpis and Renals, 2005). Access to broadcast news content in the podosphere has attracted research attention (Mølgaard et al, 2007) and news and radio content remains an important application area of online audio search (e.g. www.audioclipping.de/).…”
Section: Spoken Content Retrieval (mentioning)
confidence: 99%
“…Due to the time consuming assessment process, carrying out spoken document retrieval experiments on sets containing significantly less data or with an alternative evaluation process that is system-specific or not yet widely used is also commonly used in the literature (cf. Mølgaard et al, 2007; Mizuno et al, 2008; Alberti et al, 2009). Our work is most closely related to investigations using audio collections containing spontaneous conversational speech that include both spoken content and metadata.…”
Section: Spoken Content Retrieval (mentioning)
confidence: 99%
“…This technique has been applied widely elsewhere to genetics [14] [32] [49], document retrieval [46], document clustering [68] and image classification [27] [39]. We apply it here to our multimodal data, including the demographic variables in our model.…”
Section: Introduction (mentioning)
confidence: 99%
“…A recent approach to query expansion using a parallel corpus is presented by [189]. This approach uses topics discovered by way of dimensionality reduction in order to enrich user queries.…”
Section: Expansion Techniques (mentioning)
confidence: 99%