Multi-Scale Spoken Document Retrieval for Cantonese Broadcast News

Lo, Wai Kit; Meng, Helen; Ching, P.C.

doi:10.1023/b:ijst.0000017020.53797.a0

Cited by 7 publications

(3 citation statements)

References 36 publications

“…phonemes, syllables and sub-phonetic segments) have shown robustness to speech recognition errors and OOV words in spoken document retrieval (SDR) [23] tasks. Especially for Chinese, retrieval based on character or syllable indexing is superior to words due to the special features of Chinese [5,22]. We believe that subwords should also be effective in story segmentation of erroneous broadcast news transcripts through partial matching.…”

Section: Related Work (mentioning)

confidence: 99%

On the effectiveness of subwords for lexical cohesion based story segmentation of Chinese broadcast news

Xie

Yang

Liu

2011

Information Sciences

“…In order to process Turkish language documents recognition and indexing units are used as sub-words units by Parlak et al to reduce both the OOV rate and the index alternative recognition hypothesis to handle ASR errors [26]. Some researcher such as Lo et al [4] concentrated on the application of a multi-scale paradigm for Chinese SDR to simply improve retrieval performance. BASRAH [18] system has been designed to detect story boundaries in multilingual (English and Malay) using Confidence Measures (CMs) of the ASR.…”

Section: Related Work (mentioning)

confidence: 99%

“…A spoken document retrieval (SDR) system uses automatic speech recognition and information retrieval technologies to analyze and process multimedia documents [1]- [4]. Automatic speech recognition (ASR) systems are used to convert spoken documents (speech) into text transcription.…”

Section: Introduction (mentioning)

confidence: 99%

MAHIR System: Unsupervised Segmentation for Malay Spoken Broadcast News Stories

Khalaf

2015

IJIEE

Current studies on spoken document retrieval (SDR) concentrate on building robust systems that reduce the impact of automatic speech recognition (ASR) errors on retrieval performance. We propose an SDR system whose main goal is to reduce the effect of ASR transcription errors on retrieval performance. An ASR system converts Malay spoken broadcast news to text. The performance of unsupervised learning is evaluated on the Malay broadcast news using the Apriori algorithm.

The BASRAH System: A Method for Spoken Broadcast News Story Clustering

Aleqili

2012

Networked Digital Technologies
