2022
DOI: 10.48550/arxiv.2204.05049
Preprint

Using Linguistic Typology to Enrich Multilingual Lexicons: the Case of Lexical Gaps in Kinship

Abstract: This paper describes a method to enrich lexical resources with content relating to linguistic diversity, based on knowledge from the field of lexical typology. We capture the phenomenon of diversity through the notions of lexical gap and language-specific word and use a systematic method to infer gaps semi-automatically on a large scale. As a first result obtained for the domain of kinship terminology, known to be very diverse throughout the world, we publish a lexico-semantic resource consisting of 198 domain…

Cited by 2 publications (5 citation statements)
References 27 publications
“…2, taken from a mainstream machine translator, show examples of erroneous translations due to untranslatability. As reported by (Khishigsuren et al, 2022), when translating the English sentence My brother is three years younger than me to Hungarian, Mongolian, Korean, or Japanese, syntactically correct yet semantically absurd results are obtained:…”
Section: Machine Translation (mentioning)
confidence: 81%
“…Giunchiglia et al (2017) used a quantified measure of the diversity of sets of languages for the prediction of the universality or specificity of linguistic phenomena. Khishigsuren et al (2022) used results from in-depth, local field studies to better understand the meaning of family relations in order to produce accurate kinship terminologies in no less than 600 languages. In Bella et al (2020), an about 10-thousand-word formal lexicon of Scottish Gaelic was co-created by local language experts, including locally specific terms not directly translatable to English or most other languages.…”
Section: Linguistic Diversity (mentioning)
confidence: 99%
“…The Universal Knowledge Core (UKC) [24,25] is a large-scale MLDB that contains about 2 million words in over 2,000 languages [8]. It integrates a variety of resources such as individual wordnets such as [23,9], Wiktionary, as well as original multilingual content on phenomena related to linguistic diversity [24], such as cognacy [5], metonymy [30], lexical gaps [29], morphology [4], lexical similarity [6]. The UKC has a two-layered architecture, with a language layer that contains a separate wordnet-like graph (with words, senses, and synsets) for each language, as well as a supra-lingual layer of interlingual concepts [25] (Figure 4).…”
Section: The Universal Knowledge Core (mentioning)
confidence: 99%
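
The two-layered architecture described in the statement above can be illustrated with a minimal sketch. This is not the UKC's actual schema or API: the class names, the method names (`add_words`, `add_gap`, `lookup`), the concept identifiers, and the toy data are all assumptions made for illustration, and the sense level is omitted for brevity. The sketch only shows how a per-language layer can record either words or an explicit lexical gap against a shared layer of interlingual concepts.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LanguageLexicon:
    """Language layer (hypothetical): one toy lexicon per language."""
    language: str
    # concept id -> words that lexicalize the concept in this language
    synsets: dict[str, list[str]] = field(default_factory=dict)
    # concept ids explicitly recorded as having no single word here
    gaps: set[str] = field(default_factory=set)


@dataclass
class MultilingualLexicon:
    """Supra-lingual layer (hypothetical): shared concepts plus lexicons."""
    concepts: dict[str, str] = field(default_factory=dict)  # id -> gloss
    lexicons: dict[str, LanguageLexicon] = field(default_factory=dict)

    def _lexicon(self, language: str, concept_id: str) -> LanguageLexicon:
        # Every language-level entry must point at a known concept.
        if concept_id not in self.concepts:
            raise KeyError(f"unknown concept: {concept_id}")
        return self.lexicons.setdefault(language, LanguageLexicon(language))

    def add_words(self, language: str, concept_id: str, *words: str) -> None:
        lexicon = self._lexicon(language, concept_id)
        lexicon.synsets.setdefault(concept_id, []).extend(words)

    def add_gap(self, language: str, concept_id: str) -> None:
        self._lexicon(language, concept_id).gaps.add(concept_id)

    def lookup(self, language: str, concept_id: str) -> tuple[str, list[str]]:
        """Return ("word", words), ("gap", []), or ("unknown", [])."""
        lexicon = self.lexicons.get(language)
        if lexicon is None:
            return ("unknown", [])
        if concept_id in lexicon.synsets:
            return ("word", list(lexicon.synsets[concept_id]))
        if concept_id in lexicon.gaps:
            return ("gap", [])
        return ("unknown", [])


def build_toy_lexicon() -> MultilingualLexicon:
    """Toy kinship data for illustration only, not taken from the UKC."""
    lexicon = MultilingualLexicon(concepts={
        "brother": "male sibling",
        "elder_brother": "male sibling older than ego",
        "younger_brother": "male sibling younger than ego",
    })
    lexicon.add_words("en", "brother", "brother")
    lexicon.add_gap("en", "elder_brother")    # English needs a phrase here
    lexicon.add_gap("en", "younger_brother")
    lexicon.add_words("hu", "elder_brother", "báty")
    lexicon.add_words("hu", "younger_brother", "öcs")
    return lexicon


if __name__ == "__main__":
    toy = build_toy_lexicon()
    print(toy.lookup("en", "elder_brother"))
    print(toy.lookup("hu", "elder_brother"))
    print(toy.lookup("hu", "brother"))
```

Keeping "gap" distinct from "unknown" is the design point of the sketch: an explicit gap says that the language has no single word for the concept, whereas a missing entry only says that nothing has been recorded yet.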
“…Finally, Section 5 provides the conclusion. Throughout the paper we will use the example of family relationships, well known to be expressed in diverse manners across languages [29], and in particular the notion of cousin, in nine languages: English, French, Italian, Chinese, Hindi, Tamil, Malayalam, Hungarian, and Mongolian.
2 Cross-Lingual Lexical Mappings
Lexical equivalence is understood by linguists as a complex and multidimensional problem, ranging from multiple coexisting forms of meaning equivalence [1] to untranslatability [16] (see Table 1 for examples).…”
Section: Introduction (mentioning)
confidence: 99%