2021
DOI: 10.1007/978-3-030-83527-9_8

A Database and Visualization of the Similarity of Contemporary Lexicons

Abstract: Lexical similarity data, quantifying the "proximity" of languages based on the similarity of their lexicons, has been increasingly used to estimate the cross-lingual reusability of language resources, for tasks such as bilingual lexicon induction or cross-lingual transfer. Existing similarity data, however, originates from the field of comparative linguistics, computed from very small expert-curated vocabularies that are not supposed to be representative of modern lexicons. We explore a different, fully automa…

Cited by 8 publications (5 citation statements)
References 11 publications
“…( Notes: ∆ Imm Lang Same is the change in the share of immigrants speaking Spanish (All of America, except Brazil). ∆ Imm Lang Med and ∆ Imm Lang Far is the change in the share of immigrants speaking languages relatively close or far from Spanish based on a lexical similarity score (Bella et al 2021). Table A15 shows the list of countries and similarity scores in each group.…”
Section: Discussion (mentioning)
confidence: 99%
“…Language is one of many dimensions of culture, and research suggests that cultural distance between different groups in a society may negatively affect their willingness to redistribute or provide public goods (Luttmer and Singhal 2011, Desmet et al 2009). I test for this channel by borrowing a measure of language similarity between Spanish and immigrant's countries of origin main (most spoken) language from Bella et al (2021). Based on that measure, I divide immigrants between those speaking the same language (All America except Brazil), a relatively similar language, or a relatively distant language.…”
Section: Mechanisms and Effects On Anti-immigration Platforms (mentioning)
confidence: 99%
“…To explain where the differences across languages come from, we compute how the differences correlate with the geographical distance of the countries where the languages are spoken, the GDP of the countries, and the lexical similarity of the languages (Bella et al, 2021).…”
Section: Discussion (mentioning)
confidence: 99%
“…The Universal Knowledge Core (UKC) [24,25] is a large-scale MLDB that contains about 2 million words in over 2,000 languages [8]. It integrates a variety of resources such as individual wordnets such as [23,9], Wiktionary, as well as original multilingual content on phenomena related to linguistic diversity [24], such as cognacy [5], metonymy [30], lexical gaps [29], morphology [4], lexical similarity [6]. The UKC has a two-layered architecture, with a language layer that contains a separate wordnet-like graph (with words, senses, and synsets) for each language, as well as a supra-lingual layer of interlingual concepts [25] (Figure 4).…”
Section: The Universal Knowledge Core (mentioning)
confidence: 99%