2012 IEEE Spoken Language Technology Workshop (SLT) 2012
DOI: 10.1109/slt.2012.6424248

Recovery of acronyms, out-of-lattice words and pronunciations from parallel multilingual speech

Abstract: In this work we present a set of techniques which explore information from multiple, different language versions of the same speech, to improve Automatic Speech Recognition (ASR) performance. Using this redundant information we are able to recover acronyms, words that cannot be found in the multiple hypotheses produced by the ASR systems, and pronunciations absent from their pronunciation dictionaries. When used together, the three techniques yield a relative improvement of 5.0% over the WER of our baseline sy…

Cited by 3 publications (2 citation statements)
References 9 publications

“…Additionally, word lattice based approaches have also been pursued [2,3]. The transcription of multiple streams of interpreted speech has also been addressed with the aid of machine translation [16,17]. However, in all of these works the translation models are trained on substantial external written corpora such as European parliament proceedings or the Canadian Hansards.…”
Section: Related Work (mentioning)
confidence: 99%
“…This process is described in Section 3. In [6], we also recover words that are not in the lattices produced by the recognizer, acronyms and pronunciations, using the redundancy provided by multiple streams. The current paper extends these works by integrating a new type of stream, which consists of slides, rather than speech, in the existing framework.…”
Section: Relation To Previous Work (mentioning)
confidence: 99%