2001
DOI: 10.1006/csla.2000.0156

Improved language modelling through better language model evaluation measures

citations
Cited by 11 publications
(13 citation statements)
references
References 5 publications
“…It is supposed that the "best" models get the "lowest" Word Error Rates (WER) in the CSR system, but there are many contra examples in literature (Rosenfeld, 2000). The ability of the test set perplexity to predict the real behavior of a smoothing technique when the smoothed LM is working into a CSR system could be questioned (Clarkson and Robinson, 1999) since it does not take into account the relationship with acoustic models. Several attempts have been made to devise metrics that are better correlated with Word Error Rates than perplexity (Clarkson and Robinson, 1999;Bimbot et al, 2001), but for now perplexity remains the main metric for practical language model construction (Rosenfeld, 2000).…”
Section: Introduction (mentioning)
confidence: 99%
“…The ability of the test set perplexity to predict the real behavior of a smoothing technique when the smoothed LM is working into a CSR system could be questioned (Clarkson and Robinson, 1999) since it does not take into account the relationship with acoustic models. Several attempts have been made to devise metrics that are better correlated with Word Error Rates than perplexity (Clarkson and Robinson, 1999;Bimbot et al, 2001), but for now perplexity remains the main metric for practical language model construction (Rosenfeld, 2000). In fact, the quality of the model must be ultimately measured by its effect on the specific task for which it was designed, namely by its effect on the system error rate.…”
Section: Introduction (mentioning)
confidence: 99%
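The statements above treat Word Error Rate as the final measure of a language model's quality. As a minimal sketch (not taken from any of the cited papers), WER can be computed as the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; real scoring tools
    usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, this value can exceed 1.0 when the hypothesis is much longer than the reference.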
“…The ability of the test set perplexity to predict the real behavior of a smoothing technique when working in a CSR system could be questioned because it does not take into account the relationship with acoustic models. Several attempts have been made to devise metrics that are better correlated with the application error rate than perplexity [4]. But for now perplexity remains the main metric for practical language model construction [3].…”
Section: Introduction (mentioning)
confidence: 99%
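For reference, test-set perplexity is the exponential of the average negative log-probability the language model assigns to each test word. The sketch below assumes the per-word probabilities have already been obtained from some model; the function name `perplexity` and its input format are hypothetical and chosen only for illustration. It shows why the measure ignores the acoustic model: only language model probabilities enter the computation.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity of a test set given the LM probability of each word.

    word_probs[i] is assumed to be P(w_i | history) under the language
    model, so every value must lie in (0, 1].
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    if any(p <= 0.0 or p > 1.0 for p in word_probs):
        raise ValueError("probabilities must lie in (0, 1]")

    # Average negative log-probability per word (cross-entropy in nats).
    avg_neg_log_prob = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log_prob)
```

A model that assigns probability 0.25 to every test word has perplexity 4, the same as a uniform choice among four words.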
“…The correlation between PPL and WER/BLEU is a topic that has been investigated by many authors [Iyer et al, 1997; Chen et al, 1998; Clarkson & Robinson, 1999; Ito et al, 1999;…”
Section: Correlation between PPL and WER (unclassified)
“…The correlation between PPL and the final system error (measured as WER, BLEU, or TER) is very task dependent and is not totally clear. Some authors made studies of the relationship between PPL and WER [Iyer et al, 1997;Chen et al, 1998;Clarkson & Robinson, 1999;Ito et al, 1999;Printz & Olsen, 2002;Klakow & Peters, 2002], concluding that the correlation is hidden by the test set characteristics, nevertheless the correlation exists. As an illustrative example, Figure 2.1 on page 30 shows the evolution of WER vs PPL on the IAM task [Marti & Bunke, 1999] of HTR and the French Media task [Bonneau-Maynard et al, 2005] of ASR.…”
Section: D21 Statistical Language Models (mentioning)
confidence: 99%