2003
DOI: 10.1007/978-3-540-24586-5_31
Decision Tree-Based Context Dependent Sublexical Units for Continuous Speech Recognition of Basque

Abstract: This paper presents a new methodology, based on classical decision trees, to obtain a suitable set of context-dependent sublexical units for Basque Continuous Speech Recognition (CSR). The original method proposed by Bahl [1] was applied as the benchmark. Two new features were then added: a data-massaging step to emphasise the data, and a fast and efficient growing-and-pruning algorithm for decision-tree (DT) construction. In addition, the use of the new context-dependent units to build word models was addressed. The ben…
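The abstract outlines the standard recipe behind such methods: grow a decision tree over phonetic contexts by repeatedly picking the binary question that yields the largest likelihood gain. The sketch below illustrates one grow step under strong simplifying assumptions (single-Gaussian 1-D acoustics, toy data, and invented phone-class questions); none of these names or values come from the paper itself.

```python
# Hypothetical minimal sketch of one grow step of Bahl-style decision-tree
# clustering of context-dependent units. Toy data and questions are assumptions.
import math

# Toy training "frames": (left_context, center_phone, right_context) -> 1-D acoustic values.
data = {
    ("b", "a", "t"): [1.0, 1.2, 0.9],
    ("p", "a", "k"): [1.1, 1.3],
    ("m", "a", "n"): [2.0, 2.2, 1.9],
    ("n", "a", "m"): [2.1, 1.8],
}

# Candidate binary questions about the left context (assumed phone classes).
QUESTIONS = {
    "L_is_plosive": lambda l, r: l in {"b", "p", "t", "k"},
    "L_is_nasal":   lambda l, r: l in {"m", "n"},
}

def log_likelihood(frames):
    """Single-Gaussian ML log-likelihood of a pooled set of frames."""
    n = len(frames)
    if n == 0:
        return 0.0
    mean = sum(frames) / n
    var = max(sum((x - mean) ** 2 for x in frames) / n, 1e-4)  # variance floor
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_split(contexts):
    """Grow step: pick the question with the largest likelihood gain."""
    base = log_likelihood([x for c in contexts for x in data[c]])
    best = None
    for name, q in QUESTIONS.items():
        yes = [c for c in contexts if q(c[0], c[2])]
        no = [c for c in contexts if not q(c[0], c[2])]
        if not yes or not no:
            continue  # question does not actually partition this node
        gain = (log_likelihood([x for c in yes for x in data[c]])
                + log_likelihood([x for c in no for x in data[c]]) - base)
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    return best

gain, question, yes_set, no_set = best_split(list(data))
print(f"best question: {question}, gain: {gain:.2f}")
print("yes leaf:", yes_set)
print("no leaf:", no_set)
```

A pruning pass (as in the paper's growing-and-pruning algorithm) would then merge back splits whose gain does not hold up on held-out data; that step is omitted here.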

Cited by 1 publication (1 citation statement, published 2014); references 8 publications (8 reference statements).
“…The decomposed words are post-processed to produce a cleaner set of sublexical units, and boundary markers are added so that full words can be regenerated after recognition. Very short units are avoided, as they are usually difficult to recognize and can harm the overall WER through additional insertion errors [30,31]. To generate N-gram backoff sub-lexical LMs, different hybrid vocabularies are selected, in which the top 5k full-word forms are preserved.…”
Section: Language Model (LM)
Confidence: 99%
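The regeneration step the citing authors describe can be made concrete with a short sketch. The "+" boundary-marker convention and the example hypothesis below are assumptions for illustration, not taken from the cited work.

```python
# Hypothetical sketch of full-word regeneration from boundary-marked
# sublexical units after recognition. Convention assumed here: a unit
# ending in "+" is a non-final fragment that glues to the next unit.

def regenerate_words(hypothesis: str) -> str:
    """Rejoin sublexical units in a recognizer hypothesis into full words."""
    words, current = [], ""
    for unit in hypothesis.split():
        if unit.endswith("+"):          # non-final fragment of a word
            current += unit[:-1]
        else:                           # word-final unit: flush the word
            words.append(current + unit)
            current = ""
    if current:                         # trailing fragment (recognition error)
        words.append(current)
    return " ".join(words)

# Example: recognizer output mixing full words and marked fragments.
print(regenerate_words("the re+ cog+ nizer works"))  # -> "the recognizer works"
```

With this convention the language model operates over the mixed hybrid vocabulary (top full-word forms plus sublexical units), while word error rate is still scored on the regenerated full-word sequence.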