Proceedings of the 22nd International Conference on World Wide Web 2013
DOI: 10.1145/2487788.2487834

A non-learning approach to spelling correction in web queries

Cited by 3 publications (5 citation statements)
References 4 publications
“…This suggests that 2 may be the best candidate vector size for most applications. This is supported by previous research ( [24], [19]) which show that a balance of context-free and context-dependent candidates perform best. By using a fusion method with a candidate vector size of 2, we select 2 candidates based on context (selected using bigrams) and 2 candidates not based on context (selected using Segments' substrings rules).…”
Section: A Methods (supporting)
confidence: 86%
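The fusion described in the statement above (a few candidates chosen with context, a few chosen without) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the cited authors' implementation: `difflib.SequenceMatcher` stands in for the Segments substring rules, which are not given here, and the function names, the toy lexicon, and the bigram counts are all hypothetical.

```python
from difflib import SequenceMatcher


def context_free_candidates(word, lexicon, k):
    """Rank lexicon entries by string similarity alone.

    Assumption: SequenceMatcher is a stand-in for the Segments
    substring rules, which this page does not specify.
    """
    ranked = sorted(
        lexicon,
        key=lambda w: (-SequenceMatcher(None, word, w).ratio(), w),
    )
    return ranked[:k]


def context_candidates(prev_word, lexicon, bigram_counts, k):
    """Rank lexicon entries by how often they follow prev_word."""
    scored = [(bigram_counts.get((prev_word, w), 0), w) for w in lexicon]
    scored = [(count, w) for count, w in scored if count > 0]
    scored.sort(key=lambda cw: (-cw[0], cw[1]))
    return [w for _, w in scored[:k]]


def fuse(word, prev_word, lexicon, bigram_counts, k=2):
    """Take k context-dependent, then k context-free candidates.

    Duplicates are dropped, so the result holds at most 2 * k entries.
    """
    fused = []
    pool = context_candidates(prev_word, lexicon, bigram_counts, k)
    pool += context_free_candidates(word, lexicon, k)
    for candidate in pool:
        if candidate not in fused:
            fused.append(candidate)
    return fused


# Hypothetical toy data, for illustration only.
lexicon = ["weather", "whether", "wether", "leather"]
bigram_counts = {("the", "weather"): 5, ("the", "leather"): 1}
fused = fuse("wheather", "the", lexicon, bigram_counts, k=2)
```

With a candidate vector size of 2, `fuse` returns at most four distinct candidates, with the context-dependent ones listed first.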
“…However, Segments's ability to rank is limited by its lack of context knowledge. When context is important, Segments performs poorly [20]. Once k = 5, the ranking becomes less important (which Segments is poor at), and selection becomes more important (which Segments is good at).…”
Section: Methods (mentioning)
confidence: 99%
“…For completeness, Segments is a system that takes an input string, and using 6 substring rules, returns a list of possible correction candidates derived from a lexicon, ranked by similarity. A detailed Segments description is found in [21,22,20,23]. Recent research has reaffirmed the potential of segmenting strings by using said segments to perform authorship attribution [19].…”
Section: Segments (mentioning)
confidence: 99%
“…This collection supports comparison with their work and evaluates in context‐aware situations. Our selection processes has been outlined further by Soo (). USHMM Names, selected names from the Yizkor book collections and victim records (Amir, ). It is a collection of 13 different languages, allowing evaluation of the proposed approach on different and diverse languages. USHMM Spoken Names Query Log, foreign name user query logs.…”
Section: Discussion (mentioning)
confidence: 99%
“…In general, supervised algorithms outperform unsupervised algorithms, particularly in cases in which context is important in correcting a word (Lim, ); however, they cannot be used in the absence of training data. We describe an unsupervised approach that has no dependence on domain, language structure, or sequential windows (Soo, ; Soo & Frieder, ). The proposed solution outperforms prior unsupervised solutions and is comparable with a leading supervised approach.…”
Section: Introduction (mentioning)
confidence: 99%