2010
DOI: 10.1007/978-3-642-12275-0_42

On Foreign Name Search

Cited by 7 publications (4 citation statements)
References 8 publications
“…For completeness, Segments is a system that takes an input string, and using 6 substring rules, returns a list of possible correction candidates derived from a lexicon, ranked by similarity. A detailed Segments description is found in [21,22,20,23]. Recent research has reaffirmed the potential of segmenting strings by using said segments to perform authorship attribution [19].…”
Section: Segments (mentioning)
confidence: 99%
“…UNLV [25], IMPACT [26]), but these datasets are not applicable to our work, as these datasets do not provide means to accurately evaluate our system; namely, they are lacking query relevance (qrel) judgments. Without those, we would only be measuring the correction accuracy of Segments, which has already been exhaustively studied in prior papers using heterogeneous datasets [27], [17], [18]. Therefore, despite the age of the TREC collection, it remains the only collection that provides ground truth, corrupted text, and 3rd party qrel judgments, in a publicly available package.…”
Section: B Limitations (mentioning)
confidence: 99%
“…Over the past years, we evaluated methods for reliably correcting phase one errors via post-processing using our method called Segments [17], [18], [19]. Segments differs from previous research in that it is an unsupervised approach, which makes minimal assumptions about resource availability, and has no dependence on language within the algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
“…In general, supervised algorithms outperform unsupervised algorithms, particularly in cases in which context is important in correcting a word (Lim, ); however, they cannot be used in the absence of training data. We describe an unsupervised approach that has no dependence on domain, language structure, or sequential windows (Soo, ; Soo & Frieder, ). The proposed solution outperforms prior unsupervised solutions and is comparable with a leading supervised approach.…”
Section: Introduction (mentioning)
confidence: 99%