Proceedings of the Second Workshop on Computational Research in Linguistic Typology 2020
DOI: 10.18653/v1/2020.sigtyp-1.1

SIGTYP 2020 Shared Task: Prediction of Typological Features

Abstract: Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 2013) contain information about linguistic properties of the world's languages. They have been shown to be useful for downstream applications, including cross-lingual transfer learning and linguistic probing. A major drawback hampering broader adoption of typological KBs is that they are sparsely populated, in the sense that most languages only have annotations for some features, and skewed, in that few features have wide coverage. As typolo…

Cited by 13 publications (14 citation statements)
References 30 publications
“…The SIGTYP 2020 shared task (Bjerva et al., 2020) splits the WALS data into training, development and test portions. In the blind test data, some feature values are masked and the participating system is supposed to predict them based on the remaining features that are left visible.…”
Section: Task and Data (mentioning)
confidence: 99%
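
The masking setup described in this statement can be illustrated with a minimal sketch. It is not the shared task's official baseline or any participant's system: the `MASK` placeholder, the feature names (`word_order`, `adposition`), and the toy languages are all hypothetical, and the predictor simply fills each masked feature with its most frequent value in the training data.

```python
from collections import Counter, defaultdict

MASK = "?"  # assumption: placeholder marking a masked feature value


def fit_majority(train):
    """Return the most frequent value of each feature across training languages."""
    counts = defaultdict(Counter)
    for features in train.values():
        for name, value in features.items():
            counts[name][value] += 1
    return {name: c.most_common(1)[0][0] for name, c in counts.items()}


def fill_masked(features, majority):
    """Replace masked values with the majority value; leave visible values untouched."""
    return {
        name: majority.get(name, MASK) if value == MASK else value
        for name, value in features.items()
    }


# Hypothetical toy data: language -> {feature: value}
train = {
    "lang_a": {"word_order": "SOV", "adposition": "postpositions"},
    "lang_b": {"word_order": "SOV", "adposition": "postpositions"},
    "lang_c": {"word_order": "SVO", "adposition": "prepositions"},
}
blind = {"word_order": "SVO", "adposition": MASK}

majority = fit_majority(train)
predicted = fill_masked(blind, majority)
print(predicted)
```

A majority-value predictor ignores the visible features entirely, so it is only a reference point against which feature-aware models can be compared.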
“…The dataset used for this experiment was extracted from World Atlas of Language Structures (WALS) (Dryer and Haspelmath, 2013). It covered the typological features of close to 2,000 languages (Bjerva et al., 2020). These typological features were organised in 8 columns (including Language ID, Language name, Latitude, Longitude, Genus, Family, Country Codes, and feature-value).…”
Section: Dataset (mentioning)
confidence: 99%
“…Developing methodologies for accurately predicting missing typological features on the basis of existing knowledge is therefore crucial for a wider adoption of typological resources in NLP tasks and beyond (Evans and Levinson, 2009). This paper presents the work done by the "NEMO Team" (Google London and Tokyo) on the constrained subtask for the SIGTYP 2020 Shared Task (Bjerva et al., 2020). We experimented with a variety of machine learning models using only the features provided in the training, development and test sets.…”
Section: Introduction (mentioning)
confidence: 99%
“…This paper describes the NEMO submission to SIGTYP 2020 shared task (Bjerva et al., 2020) which deals with prediction of linguistic typological features for multiple languages using the data derived from World Atlas of Language Structures (WALS). We employ frequentist inference to represent correlations between typological features and use this representation to train simple multiclass estimators that predict individual features.…”
mentioning
confidence: 99%
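
One simple way to picture frequency-based correlations between features is sketched below. This is an illustrative assumption, not the NEMO team's actual representation or estimator: it counts how often each visible (feature, value) pair co-occurs with each value of a target feature, then scores candidate target values by summing the resulting relative frequencies. All feature names and data are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(train, target):
    """Count target values seen alongside each (feature, value) pair."""
    counts = defaultdict(Counter)
    for features in train:
        if target not in features:
            continue  # languages without the target give no evidence
        for name, value in features.items():
            if name != target:
                counts[(name, value)][features[target]] += 1
    return counts


def predict(visible, counts):
    """Sum relative frequencies of target values over all visible pairs."""
    scores = Counter()
    for pair in visible.items():
        seen = counts.get(pair)
        if not seen:
            continue  # this (feature, value) pair never occurred in training
        total = sum(seen.values())
        for target_value, n in seen.items():
            scores[target_value] += n / total
    return scores.most_common(1)[0][0] if scores else None


# Hypothetical toy data: one dict of feature values per language
train = [
    {"word_order": "SOV", "adposition": "postpositions"},
    {"word_order": "SOV", "adposition": "postpositions"},
    {"word_order": "SOV", "adposition": "prepositions"},
    {"word_order": "SVO", "adposition": "prepositions"},
    {"word_order": "SVO", "adposition": "prepositions"},
]
counts = cooccurrence_counts(train, target="adposition")
print(predict({"word_order": "SOV"}, counts))
print(predict({"word_order": "SVO"}, counts))
```

The function returns `None` when none of the visible pairs were observed in training; a fuller system would fall back to a prior such as the overall majority value.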