Sequences in Language and Text 2015
DOI: 10.1515/9783110362879-012

Comparative Evaluation of String Similarity Measures for Automatic Language Classification

Abstract: Historical linguistics, the oldest branch of modern linguistics, deals with language-relatedness and language change across space and time. Historical linguists apply the widely-tested comparative method [Durie and Ross, 1996] to establish relationships between languages to posit a language family and to reconstruct the proto-language for a language family. Although historical linguistics has parallel origins with biology [Atkinson and Gray, 2005], unlike the biologists, mainstream historical linguists have …

Citations: cited by 20 publications (25 citation statements)
References: 34 publications (29 reference statements)
“…The comparison of possible cognates initiated by Gleason and Kay has continued in various ways to the present, and the comparison of languages by automated 'cognate' recognition remains a growing area of research, with multiple methods in use, including the comparison of entire dictionaries (Rama & Borin 2015; List et al 2017; Rama et al 2017). Many of these studies, driven by the needs of machine translation, count as 'cognates' pairs like English nature : French nature as well as Indo-European cognates such as English brother : French frère, which have proved methodologically challenging (Inkpen et al 2005).…”
Section: Automated Cognate Identification (mentioning)
confidence: 99%
“…Nevertheless, the current study also includes the Swadesh wordlists since these have been widely used in the literature on quantitative comparison of the major Romance and other Indo-European languages (e.g. Forster, Toth, & Bandelt, 1998;McMahon & McMahon, 2003;Rama et al, 2015).…”
Section: Wordlist Comparison (mentioning)
confidence: 99%
“…Although the use of an edit-distance measure in historical linguistics seems to have developed independently (Wichmann et al 2010), there is now an active exchange of ideas on the best refinements supporting historical inference (List 2012, Jäger 2014, Rama & Borin 2015). All of these papers explore enhancements of edit distance in which the cost of replacing a sound with another is systematically lower when the two sounds are phonetically similar. [Sidebar: Multidimensional scaling (MDS): a dimension-reduction technique for distance matrices.]…”
Section: Related Work (mentioning)
confidence: 99%
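
The idea in the statement above, an edit distance in which replacing a sound with a phonetically similar one costs less, can be illustrated with a minimal sketch. This is not the method of any of the cited papers: the sound-class table `SOUND_CLASS`, the function names, and the reduced cost of 0.5 are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Hypothetical, highly simplified sound classes (an assumption for this sketch;
# the cited papers use their own, more carefully motivated schemes).
SOUND_CLASS: Dict[str, str] = {
    "p": "labial", "b": "labial", "f": "labial", "v": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "s": "coronal", "z": "coronal", "n": "coronal",
    "k": "dorsal", "g": "dorsal", "x": "dorsal",
    "a": "vowel", "e": "vowel", "i": "vowel", "o": "vowel", "u": "vowel",
}


def substitution_cost(a: str, b: str, similar_cost: float = 0.5) -> float:
    """Return 0 for identical symbols, a reduced cost for symbols in the
    same (hypothetical) sound class, and 1 otherwise."""
    if a == b:
        return 0.0
    class_a = SOUND_CLASS.get(a)
    class_b = SOUND_CLASS.get(b)
    if class_a is not None and class_a == class_b:
        return similar_cost
    return 1.0


def weighted_edit_distance(s: str, t: str, similar_cost: float = 0.5) -> float:
    """Edit distance with unit insertion/deletion costs and a
    class-sensitive substitution cost, computed row by row."""
    prev: List[float] = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr: List[float] = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,                                      # delete a
                curr[j - 1] + 1.0,                                  # insert b
                prev[j - 1] + substitution_cost(a, b, similar_cost),  # substitute
            ))
        prev = curr
    return prev[-1]
```

Under these assumptions, `weighted_edit_distance("pat", "bat")` is lower than `weighted_edit_distance("pat", "kat")`, because p and b share a class while p and k do not. Setting `similar_cost=1.0` reduces the function to the plain Levenshtein distance.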