6th International Conference on Spoken Language Processing (ICSLP 2000) 2000
DOI: 10.21437/icslp.2000-42

Particle-based language modelling

Abstract: This paper investigates the use of particle (sub-word) n-grams for language modelling. One linguistics-based and two data-driven algorithms are presented and evaluated in terms of perplexity for Russian and English. Interpolating word trigram and particle 6-gram models gives up to a 7.5% perplexity reduction over the baseline word trigram model for Russian. Lattice rescoring experiments are also performed on 1997 DARPA Hub4 evaluation lattices where the interpolated model gives a 0.4% absolute reduction in wor…
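
The abstract reports perplexity for an interpolation of a word trigram model and a particle 6-gram model. As a rough illustration only, the sketch below assumes simple linear interpolation with a single weight and assumes both models have already been reduced to per-word probabilities on the same test text. The function names, the weight `lam`, and the probability values are hypothetical and are not taken from the paper.

```python
import math

def interpolate(p_word, p_particle, lam):
    """Linearly mix two per-word probability streams.

    lam is the weight on the word model (an assumed, hand-picked value);
    1 - lam goes to the particle model.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_word, p_particle)]

def perplexity(probs):
    """Perplexity = exp of the negative mean log-probability per word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Made-up per-word probabilities for a four-word test text.
p_word = [0.10, 0.02, 0.30, 0.05]      # hypothetical word trigram model
p_particle = [0.08, 0.06, 0.20, 0.09]  # hypothetical particle 6-gram model

mixed = interpolate(p_word, p_particle, lam=0.6)

print("word model perplexity:  ", perplexity(p_word))
print("interpolated perplexity:", perplexity(mixed))
```

In practice the interpolation weight would be tuned on held-out data rather than fixed by hand, and a particle model's probability for a word would first be assembled from the probabilities of its constituent particles.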

Cited by 14 publications (1 citation statement)
References: 7 publications
“…We have also evaluated the word boundary modelling with mostly similar results. For Finnish, the dedicated word boundary symbol [47,15] has so far been the most effective approach, whereas for Estonian, using the redundant approach has sometimes resulted in a small improvement. For the experiments in this work, the dedicated word boundary symbol was used because it provided the best or equal results in all cases.…”
Section: Subword Language Models (citation type: mentioning)
confidence: 99%