Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05 2005
DOI: 10.3115/1220575.1220695

Learning a spelling error model from search query logs

Abstract: Applying the noisy channel model to search query spelling correction requires an error model and a language model. Typically, the error model relies on a weighted string edit distance measure. The weights can be learned from pairs of misspelled words and their corrections. This paper investigates using the Expectation Maximization algorithm to learn edit distance weights directly from search query logs, without relying on a corpus of paired words.
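To make the abstract's setup concrete, here is a minimal sketch of a noisy channel corrector built on a weighted string edit distance. It is not the paper's implementation. The function names, the `sub_cost` table, the default unit costs, and the toy unigram probabilities are all assumptions for illustration. The sketch also treats the negative edit distance as a stand-in for the log error-model probability, and it takes the weights as given rather than learning them with Expectation Maximization.

```python
import math


def weighted_edit_distance(source, target, sub_cost=None, ins_cost=1.0, del_cost=1.0):
    """Edit distance from source to target with per-pair substitution costs.

    sub_cost maps (source_char, target_char) to a cost; pairs that are
    missing from the table fall back to a cost of 1.0 (an assumption).
    """
    sub_cost = sub_cost or {}
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            substitution = 0.0 if a == b else sub_cost.get((a, b), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,          # delete a from source
                dist[i][j - 1] + ins_cost,          # insert b into source
                dist[i - 1][j - 1] + substitution,  # match or substitute
            )
    return dist[-1][-1]


def best_correction(query, candidates, unigram_prob, sub_cost=None):
    """Pick the candidate c maximizing log P(c) + log P(query | c).

    log P(query | c) is approximated by the negative weighted edit
    distance from c to query (a simplifying assumption).
    """
    def score(candidate):
        language_model = math.log(unigram_prob[candidate])
        error_model = -weighted_edit_distance(candidate, query, sub_cost)
        return language_model + error_model

    return max(candidates, key=score)
```

For example, `best_correction("speling", ["spelling", "spewing"], {"spelling": 0.9, "spewing": 0.1})` prefers `"spelling"`: both candidates are one edit away from the query, so the language model breaks the tie.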

Cited by 77 publications (40 citation statements)
References 10 publications
“…However, while they investigates specifically the case of pluraliza- In addition to query reformulation, query logs have been used for a variety of tasks. They have been used to learn retrieval functions [12,1], spelling correction [2,9], query segmentation [5] and disambiguating abbreviations [24]. In this paper, we studied reformulation tasks that can make use of the limited data in an anchor log, but some of these other log-based tasks may also be able to be tackled using an anchor log.…”
Section: Related Work (mentioning)
confidence: 99%
“…Explanations on the basic techniques in information retrieval can be found in the text books on IR [131,52,6]. 3 A tutorial on sparse methods by Bach can be found at www.di.ens.fr/~fbach/. 4 Tutorials on deep learning can be found at www.deeplearning.net/tutorial/.…”
Section: About This Survey (mentioning)
confidence: 99%
“…In the end, the ultimate goal is to use this knowledge to improve the search experience for the user. Search log analysis can be used to improve, for example, retrieval functions [10], spelling corrections [2], and query segmentation [13,8]. According to the large-scale Altavista search log study (which concerns about 1 billion entries), users tend to formulate short queries (2.3 words per query on average), and sessions are relatively short (2 queries on average) [15].…”
Section: Related Work (mentioning)
confidence: 99%