Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.347

Minimally-Supervised Morphological Segmentation using Adaptor Grammars with Linguistic Priors

Abstract: With the increasing interest in low-resource languages, unsupervised morphological segmentation has become an active area of research, where approaches based on Adaptor Grammars achieve state-of-the-art results. We demonstrate the power of harnessing linguistic knowledge as priors within Adaptor Grammars in a minimally-supervised learning fashion. We introduce two types of priors: 1) grammar definition, where we design language-specific grammars; and 2) linguist-provided affixes, collected by an expert in the l…
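
The abstract's second prior, linguist-provided affixes, is commonly injected by seeding the grammar with one character-level rule per known affix. The sketch below is a minimal illustration under that assumption; the rule format imitates common Adaptor Grammar grammar-file conventions, and the affix lists and nonterminal names are hypothetical stand-ins, not the paper's actual seeds.

```python
# Minimal sketch: turn a linguist-provided affix list into character-level
# seed rules for an Adaptor Grammar. The "Suffix --> i n g" rule style
# (one terminal per character) imitates common Adaptor Grammar input
# formats; the affixes and nonterminal names here are hypothetical.
def affix_seed_rules(nonterminal, affixes):
    """Emit one character-level grammar rule per known affix."""
    return [f"{nonterminal} --> {' '.join(affix)}" for affix in affixes]

prefixes = ["un", "re", "dis"]  # stand-ins for expert-provided prefixes
suffixes = ["ing", "ed", "s"]   # stand-ins for expert-provided suffixes

for rule in affix_seed_rules("Prefix", prefixes) + affix_seed_rules("Suffix", suffixes):
    print(rule)  # e.g. "Prefix --> u n", "Suffix --> i n g"
```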

Cited by 3 publications (2 citation statements)
References 13 publications
“…MorphAGram We also include in this study the unsupervised morphology segmenter MorphAGram (Eskander et al., 2020), which is based on Adaptor Grammars. We use the PrStSu+SM grammar, which represents a word as a sequence of prefixes followed by a stem and then a sequence of suffixes, in the unsupervised Standard learning setting to train the segmenters.…”
Section: Segmentation Systems (citation type: mentioning)
Confidence: 99%
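
To make the PrStSu structure concrete, here is a minimal sketch of the context-free skeleton it implies: a word is zero or more prefixes, then a stem, then zero or more suffixes. The rule names are illustrative; MorphAGram's actual PrStSu+SM grammar additionally marks which nonterminals are adapted (cached) and expands each morph down to characters, which is omitted here.

```python
# Illustrative PrStSu-style grammar skeleton (rule names are hypothetical;
# the real MorphAGram grammar also adapts nonterminals and models morphs
# at the character level).
PRSTSU_SKELETON = """\
Word     --> Prefixes Stem Suffixes
Prefixes --> Prefix Prefixes
Prefixes -->
Suffixes --> Suffix Suffixes
Suffixes -->
"""

def render_analysis(prefixes, stem, suffixes, sep="+"):
    """Flatten a PrStSu analysis (prefix*, stem, suffix*) into a string."""
    return sep.join([*prefixes, stem, *suffixes])

print(render_analysis(["un"], "believ", ["able"]))  # un+believ+able
```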
“…We configured several baselines: (1) based on a simple weighted Finite-State Transducer (FST) to maximise the morpheme frequency (Richardson and Tyers, 2021), (2) based on Morfessor version 2.0 (Virpioja et al., 2013) to learn the morpheme boundaries using minimum description length optimization, and (3) based on the Adaptor Grammar approach. We used the MorphAGram toolkit (Eskander et al., 2020) with two settings: the standard setting (AdaGra-Std) and the scholar-seeded setting (AdaGra-SS). We adopted the best learning settings for Innu-Aimun: the best standard PrefixStemSuffix+SuffixMorph grammar and the best scholar-seeded grammar, as explained in Eskander et al. (2019).…”
Section: Training Settings (citation type: mentioning)
Confidence: 99%
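
For the second baseline, Morfessor 2.0 exposes a small Python API. Below is a minimal sketch of unsupervised training and segmentation, assuming `pip install morfessor` and a newline-delimited word list; the file path and the test word are illustrative.

```python
import morfessor

# Load a training corpus of words (the file path is illustrative).
io = morfessor.MorfessorIO()
train_data = list(io.read_corpus_file("training_words.txt"))

# Fit the Morfessor Baseline model, which searches for the morph lexicon
# and segmentations that minimize an MDL-style two-part cost.
model = morfessor.BaselineModel()
model.load_data(train_data)
model.train_batch()

# Viterbi-segment a word into morphs under the learned model.
morphs, cost = model.viterbi_segment("segmentation")
print(morphs)
```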