2004
DOI: 10.1016/j.csl.2003.09.001

Contemporaneous text as side-information in statistical language modeling

Cited by 4 publications (1 citation statement). References 18 publications (13 reference statements).
“…Informally, the insight here is that the initial decodings of p and q, particularly in portions of high confidence, carry useful information about (1) the genres of p and q (e.g., English email), (2) the particular topics covered in p and q (e.g., oil futures), and (3) the particular n-grams that tend to recur in p and q specifically. For example, for (2), one could use a search-engine query to retrieve a small corpus of documents that appear similar to the first-pass decodings of p and q, and use them to help build "story-specific" language models Pr1 and Pr2 [10] that better predict the n-grams of documents on these topics and hence can retrieve more accurate versions of p and q on a second pass.…”
Section: Smoothed N-gram Language Models (mentioning)
confidence: 99%
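The quoted statement describes a two-pass adaptation scheme: use a first-pass decoding to retrieve similar documents, build a "story-specific" language model from them, and rescore candidates on a second pass. Below is a minimal, illustrative Python sketch of that idea, assuming an add-one-smoothed bigram model interpolated with a background model; the names InterpolatedBigramLM and rescore, the interpolation weight, and the toy data are assumptions for illustration, not the method of the cited paper.

```python
import math
from collections import Counter

def bigram_counts(sentences):
    """Collect unigram and bigram counts from tokenized sentences."""
    uni, bi = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        uni.update(tokens)
        bi.update(zip(tokens, tokens[1:]))
    return uni, bi

class InterpolatedBigramLM:
    """Bigram model interpolating a small story-specific corpus with a large
    background corpus: Pr(w|h) = lam * Pr_story(w|h) + (1 - lam) * Pr_bg(w|h)."""
    def __init__(self, story_sents, background_sents, lam=0.6, vocab_size=50000):
        self.s_uni, self.s_bi = bigram_counts(story_sents)
        self.b_uni, self.b_bi = bigram_counts(background_sents)
        self.lam = lam
        self.V = vocab_size

    def _cond(self, uni, bi, h, w):
        # Add-one smoothed conditional probability Pr(w | h).
        return (bi[(h, w)] + 1.0) / (uni[h] + self.V)

    def prob(self, h, w):
        return (self.lam * self._cond(self.s_uni, self.s_bi, h, w)
                + (1.0 - self.lam) * self._cond(self.b_uni, self.b_bi, h, w))

    def sentence_logprob(self, sent):
        tokens = ["<s>"] + list(sent) + ["</s>"]
        return sum(math.log(self.prob(h, w)) for h, w in zip(tokens, tokens[1:]))

def rescore(candidates, lm):
    """Second pass: pick the candidate decoding the adapted model prefers."""
    return max(candidates, key=lm.sentence_logprob)

# Illustrative usage: 'retrieved' stands in for documents returned by a
# search-engine query against the first-pass decoding (hypothetical toy data).
retrieved = [["oil", "futures", "rose", "sharply"], ["futures", "markets", "opened"]]
background = [["the", "cat", "sat"], ["markets", "were", "quiet"]]
lm = InterpolatedBigramLM(retrieved, background)
candidates = [["oil", "futures", "rose"], ["boil", "fixtures", "rows"]]
print(rescore(candidates, lm))
```

The interpolation is what keeps the small retrieved corpus from dominating: where the story-specific counts are sparse, the estimate falls back toward the background model, which matches the spirit of using contemporaneous text as side-information rather than as a replacement corpus.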