2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.2004.1326104

Improved name recognition with meta-data dependent name networks

Abstract: A transcription system that requires accurate general name transcription is faced with the problem of covering the large number of names it may encounter. Without any prior knowledge, this requires a large increase in the size and complexity of the system due to the expansion of the lexicon. Furthermore, this increase will adversely affect system performance due to the increased confusability. Here we propose a method that uses meta-data available at runtime to ensure better name coverage without signific…

Cited by 6 publications (4 citation statements)
References 10 publications
“…Aleksic et al [13] extend class-based LMs [4,5] by creating a user-dependent small LM for contact name recognition on voice commands, which is compiled (footnote 1: CTC is the abbreviation for Connectionist Temporal Classification; footnote 2: GMM and HMM are short for Gaussian Mixture Model and Hidden Markov Model respectively)…”
Section: The Recognition of OOV Words in End-to-End ASR Models; citation type: mentioning
confidence: 99%
“…Since it takes substantial efforts to collect labeled OOV speech data for ASR model training, current approaches to tackle the OOV problem mainly involve a language model (LM) or post-processing, for instance, user-dependent language models [4,5], LM rescoring [6] and finite-state transducer lattice extension [7].…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…For example, names in user's contact list are usually out-of-vocabulary (OOV) and are likely to have very low language model score, thereby making it difficult to accurately predict. These contextual terms can be personal, such as names in the user's contact [3,4], current location [5,6], and songs in the playlist [7]; topic-specific, such as medical domain [8]; or trending terms [9]. In all these scenarios the contextual information is not static and therefore needs to be dynamically incorporated into the language model during the inference stage.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…In the case when meta-data are available, an entire class of terms can be biased [3,4,5,10,11]. The general idea is to replace every instance of phrases with its class-label to construct a class-based language model [12], and dynamically expand the decoding graph of the class-label into class instances provided in the context during inference.…”
Section: Introduction; citation type: mentioning
confidence: 99%
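
The class-based idea described in the last citation statement can be illustrated with a minimal sketch. It is not the method of the cited paper or of any citing work: it operates on plain strings rather than on a decoding graph, the class token `$NAME` and both function names are hypothetical, and it assumes each template contains a single class slot.

```python
import re

CLASS_LABEL = "$NAME"  # hypothetical class token, not taken from the source


def to_class_template(sentence, known_names):
    """Replace every known name in a training sentence with the class label."""
    # Try longer names first so "mary ann" is matched before "mary".
    for name in sorted(known_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        sentence = re.sub(pattern, CLASS_LABEL, sentence)
    return sentence


def expand_template(template, context_names):
    """Expand the class label into the names supplied by runtime meta-data."""
    if CLASS_LABEL not in template:
        return [template]
    # Assumes one class slot per template; every slot gets the same name.
    return [template.replace(CLASS_LABEL, name) for name in context_names]


template = to_class_template("call mary ann now", ["mary", "mary ann"])
candidates = expand_template(template, ["bob", "alice"])
```

In this sketch the template is built once from training text, while the list passed to `expand_template` stands in for meta-data, such as a contact list, that only becomes available at inference time.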