2016 IEEE Spoken Language Technology Workshop (SLT) 2016
DOI: 10.1109/slt.2016.7846273

Voice search language model adaptation using contextual information

Cited by 16 publications (25 citation statements)
References: 9 publications
“…There are many speech recognition applications where on-the-fly adjustment is indispensable. In the voice search task, we have previously shown quality improvements by introducing n-gram weight adjustments for salient n-grams from personal contexts and geographic information [2] as well as improved recognition of contact names using context [11]. Products such as the Google Assistant use context for all types of personal entities (e.g.…”
Section: Contextual Speech Recognition
Citation type: mentioning
confidence: 99%
“…Contextual signals can include: a user's location, the device being used, or personalization information such as a user's favorite songs and calendar events (Figure 1). Including this information has been shown to improve recognition results [2]. Our contextual ASR system was previously built on a conventional architecture, and in this paper we propose a design to bring similar improvements to E2E architectures.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Our approach is inspired by the empirical observations that (1) NEs often occur in common word contexts that are easily recognized by a first-pass decoder (e.g., "call X mobile"); (2) when an NE is misrecognized, the incorrect hypothesis still approximates the ground-truth phonemes spoken by the user. For example, the utterance "call Goudzwaard" may be misrecognized as "call god's word", but the presence of "call" indicates a possible NE, while the phonetics are sufficiently close to match the missing NE based on knowledge of user state (e.g., the name "Goudzwaard" is in the user's contact list).…”
Section: System Design
Citation type: mentioning
confidence: 99%
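
The two observations in the statement above (a carrier phrase such as "call X" signals a likely named entity, and the misrecognized span stays close to the intended name) can be illustrated with a short sketch. This is not the citing paper's system: the `CARRIER` pattern, the `correct_entity` function, the `min_score` threshold, and the contact list are all hypothetical, and character-level similarity from Python's `difflib` is used here only as a crude stand-in for a real phonetic distance.

```python
import re
from difflib import SequenceMatcher

# Hypothetical carrier phrase: "call X" with an optional trailing "mobile".
CARRIER = re.compile(r"^(call)\s+(.+?)(\s+mobile)?$", re.IGNORECASE)


def _key(text):
    """Lowercase and keep letters only, so "god's word" becomes "godsword"."""
    return re.sub(r"[^a-z]", "", text.lower())


def correct_entity(hypothesis, contacts, min_score=0.6):
    """Replace the slot of a "call X" hypothesis with the closest contact name.

    Character similarity is an assumed stand-in for phonetic distance.
    """
    match = CARRIER.match(hypothesis.strip())
    if match is None:
        return hypothesis  # no carrier phrase, so no likely named entity
    verb, span, suffix = match.group(1), match.group(2), match.group(3) or ""
    best_name, best_score = None, 0.0
    for name in contacts:
        score = SequenceMatcher(None, _key(span), _key(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < min_score:
        return hypothesis  # nothing in the user state is close enough
    return f"{verb} {best_name}{suffix}"
```

With a contact list containing "Goudzwaard", calling `correct_entity("call god's word", contacts)` returns "call Goudzwaard", while a hypothesis without the carrier phrase is returned unchanged.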
“…Many techniques have been developed to take advantage of contextual signals. For example, in language model (LM)-based ASR systems, rescoring methods are described that dynamically adjust LM weights; some reweigh n-grams appearing in the user's context on the fly [1,2,3], while others dynamically expand the LM via class grammars [4]. Contextual ASR (biasing) methods for end-to-end systems have also been proposed [5,6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Rescoring adjusts LM probabilities in real time based on contextual signals, and allows targeted adjustments without the need for training a context-specific LM. Work has been done on incorporating a variety of contextual signals into ASR: the device type being used, the history of the speaker's queries or actions, device state, location and dialog state, among others [1,2,3].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
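
As a rough illustration of the rescoring idea described in the statement above, the sketch below adds a log-domain bonus to each hypothesis in an n-best list for every contextual n-gram it contains, then re-sorts the list. It is a minimal sketch under assumptions, not the method of the cited paper: the function names, the n-best format of `(text, log_score)` pairs, and the bonus values are all hypothetical.

```python
def count_ngram(tokens, ngram):
    """Count occurrences of an n-gram (a tuple of words) in a token list."""
    n = len(ngram)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == ngram
    )


def rescore(nbest, context_bonus):
    """Re-rank hypotheses after boosting contextual n-grams.

    nbest: list of (text, log_score) pairs from a first pass (assumed format).
    context_bonus: dict mapping an n-gram tuple to a log-domain bonus (assumed).
    """
    rescored = []
    for text, log_score in nbest:
        tokens = text.lower().split()
        boost = sum(
            bonus * count_ngram(tokens, ngram)
            for ngram, bonus in context_bonus.items()
        )
        rescored.append((text, log_score + boost))
    # Highest adjusted score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

For example, with `nbest = [("directions to main street", -4.0), ("directions to maine street", -5.0)]` and a bonus of 2.0 on the bigram `("maine", "street")`, the second hypothesis is raised to -3.0 and moves to the top. No context-specific LM is trained; only the scores of matching n-grams change.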