The THUEE system for the openKWS14 keyword search evaluation

Cai, Meng; Lv, Zhiqiang; Song, Beili; Shi, Yongzhe; Wu, Wei; Lü, Cheng; Zhang, Weiqiang; Liu, Jia

doi:10.1109/icassp.2015.7178869

Cited by 3 publications

(2 citation statements)

References 24 publications

“…The baseline system S1 uses convolutional maxout neural network acoustic model [14,15] Among them, S1-S10 are based on Kaldi, while S11 is based on HTK. The language model of S1-10 is a word trigram language model, while S11 utilizes a feed-forward neural network language model with variance regularizations [19].…”

Section: KWS Systems (mentioning)

confidence: 99%

Improved system fusion for keyword search

Cai, Lü, et al. 2015

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Self Cite

It has been demonstrated that system fusion can significantly improve the performance of keyword search. In this paper, we compare the performance of several widely-used arithmetic-based fusion methods using different normalization pipelines and try to find the best pipeline. A novel arithmetic-based fusion method is proposed in this work. The method supplies a more effective way to incorporate the number of systems which have non-zero scores for a detection. When tested on the development test dataset of the OpenKWS15 Evaluation, the proposed method achieves the highest maximum term-weighted value (MTWV) and actual term-weighted value (ATWV) among all arithmetic-based fusion methods. Usually, discriminative fusion methods employing classifiers can outperform arithmetic-based fusion methods. A DNN-based fusion method is explored in this work. After word-burst information is added, the DNN-based fusion method outperforms all other methods. In addition, it is notable that our arithmetic-based method achieves the same MTWV as the DNN-based method.

“…Best performance in the NIST Open KWS 2013 evaluation is ATWV=0.6248 [110] under the Full Language Pack (FullLP) condition, for which 20 h of word-transcribed scripted speech, 80 h of word-transcribed CTS, and a pronunciation lexicon were given to participants. In the works describing systems on the surprise language (i.e., Tamil) of the Open KWS 2014 evaluation [53,92,94,111-117], ATWV=0.5802 is the best performance obtained under the FullLP condition, for which 60 h of transcribed speech and a pronunciation lexicon were given to participants.…”

Section: Comparison To Other Evaluations (mentioning)

confidence: 99%

Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

Tejedor, Toledano, López-Otero, et al. 2015

J AUDIO SPEECH MUSIC PROC.

Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the STD ALBAYZIN 2014 evaluation, held as a part of the ALBAYZIN 2014 evaluation campaign within the context of the IberSPEECH 2014 conference. This is the first STD evaluation that deals with Spanish language. The evaluation consists of retrieving the speech files that contain the search terms, indicating their start and end times within the appropriate speech file, along with a score value that reflects the confidence given to the detection of the search term. The evaluation is conducted on a Spanish spontaneous speech database, which comprises a set of talks from workshops and amounts to about 7 h of speech. We present the database, the evaluation metrics, the systems submitted to the evaluation, the results, and a detailed discussion. Four different research groups took part in the evaluation. Evaluation results show reasonable performance for moderate out-of-vocabulary term rate. This paper compares the systems submitted to the evaluation and makes a deep analysis based on some search term properties (term length, in-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and in-language/foreign terms).

High-performance Swahili keyword search with very limited language pack: The THUEE system for the OpenKWS15 evaluation

Cai, Lü, et al. 2015

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
