Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference 2021
DOI: 10.18653/v1/2021.crac-1.9
Data Augmentation Methods for Anaphoric Zero Pronouns

Abstract: In pro-drop languages such as Arabic, Chinese, Italian, Japanese, Spanish, and many others, unrealized (null) arguments in certain syntactic positions can refer to a previously introduced entity and are thus called anaphoric zero pronouns. However, the existing resources for studying anaphoric zero pronoun interpretation are still limited. In this paper, we use five data augmentation methods to generate and detect anaphoric zero pronouns automatically. We use the augmented data as additional training materials for…
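The abstract above is truncated, so the paper's five augmentation methods are not listed here. As a hedged illustration of the general idea only, the sketch below implements one plausible strategy, pronoun dropping: delete an overt pronoun and record its position as a synthetic anaphoric-zero-pronoun (AZP) site. The `PRONOUNS` list and the `make_azp_examples` helper are hypothetical, not taken from the paper.

```python
# Minimal sketch of one plausible AZP augmentation strategy (an assumption,
# not necessarily one of the paper's five methods): drop an overt pronoun
# and record its position as a synthetic zero-pronoun site.

PRONOUNS = {"he", "she", "it", "they"}  # toy list; real systems would use a parser


def make_azp_examples(tokens):
    """Yield (augmented_tokens, azp_index) pairs, one per dropped pronoun."""
    for i, tok in enumerate(tokens):
        if tok.lower() in PRONOUNS:
            # The gap at position i becomes the synthetic AZP site.
            yield tokens[:i] + tokens[i + 1:], i


sent = "Maria bought a book and she read it overnight".split()
for aug, idx in make_azp_examples(sent):
    print(idx, " ".join(aug))
```

Each yielded pair provides both a positive training instance for AZP detection and a recoverable antecedent (the dropped pronoun), which is what makes this family of strategies attractive for low-resource pro-drop languages.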

Cited by 6 publications (8 citation statements). References 41 publications.

Citation statements, ordered by relevance:
“…Due to page limitation, some examples are mainly discussed in Chinese and/or English. However, most results and findings can be applied to other pro-drop languages, which is further supported by other works (Ri et al., 2021; Aloraini and Poesio, 2020; Vincent et al., 2022). In Appendix §A.1, we add details on the phenomenon in various pro-drop languages such as Arabic, Swahili, Portuguese, Hindi, and Japanese.…”
Section: Limitations (supporting, confidence: 79%)
“…ZPT is a hard task to tackle alone, so researchers are investigating how to leverage other related NLP tasks to improve ZPT by training models to perform multiple tasks simultaneously (Wang et al., 2018a). Since ZPT is a cross-lingual problem, researchers are also exploring techniques for training models that work across multiple languages, rather than being limited to a single language (Aloraini and Poesio, 2020).…”
Section: Data-level Methods Do Not Change Model (mentioning, confidence: 99%)
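To make the multi-task idea in the statement above concrete, here is a minimal sketch assuming a shared encoder with a zero-pronoun tagging head and an auxiliary token-prediction head whose losses are summed. All module names, sizes, and the equal loss weighting are toy assumptions, not the cited systems' design.

```python
# Toy multi-task setup: one shared encoder, two task heads, summed losses.
import torch
import torch.nn as nn


class MultiTaskZP(nn.Module):
    """Shared encoder with an AZP tagging head and an auxiliary head."""

    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)  # shared representation
        self.zp_head = nn.Linear(dim, 2)       # is this token position an AZP site?
        self.aux_head = nn.Linear(dim, vocab)  # auxiliary token prediction

    def forward(self, ids):
        h, _ = self.encoder(self.embed(ids))
        return self.zp_head(h), self.aux_head(h)


model = MultiTaskZP()
ids = torch.randint(0, 1000, (2, 7))  # toy batch: 2 sentences, 7 tokens each
zp_logits, aux_logits = model(ids)
zp_labels = torch.randint(0, 2, (2, 7))
aux_labels = torch.randint(0, 1000, (2, 7))
loss = (nn.functional.cross_entropy(zp_logits.reshape(-1, 2), zp_labels.reshape(-1))
        + nn.functional.cross_entropy(aux_logits.reshape(-1, 1000), aux_labels.reshape(-1)))
loss.backward()  # gradients flow into both heads and the shared encoder
```

The point of the shared encoder is that supervision from the auxiliary task shapes the representations the zero-pronoun head sees, which is the mechanism the citing work appeals to.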
“…Another mention-pair model that was developed for the Persian language extracted hand-crafted, embedding-based, and rich semantic features of mentions and used them as input to a fully connected neural network for coreference resolution (Sahlani et al., 2020). The adaptation of an English mention-ranking model (Lee et al., 2008) to Arabic was enhanced with performance-related improvements such as the heuristic-based preprocessing of words and the use of a separately trained mention detection approach (Aloraini et al., 2020). A Siamese network architecture and an extended feature set of mentions were used for Polish coreference resolution (Niton et al., 2018).…”
Section: Introduction (mentioning, confidence: 99%)
“…In the first model, a set of well-studied features from the existing literature (Bengtson and Roth, 2008; Durrett and Klein, 2013; Wiseman et al., 2015) is extracted for a mention and its candidate antecedents and then fed to a single-layer feed-forward neural network as input. Our second model closely follows the mention-ranking approach of the end-to-end coreference solution proposed by Lee et al. (2007), which was successfully applied to other languages including Arabic (Aloraini et al., 2020) and Slovenian (Klemen and Žitnik, 2022). The contextual representations of a mention and its candidate antecedent mentions are learned from pre-trained language models, and a probability distribution is obtained over all possible pairings of the mention with candidate antecedents using a two-layer feed-forward network.…”
Section: Introduction (mentioning, confidence: 99%)
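The scoring step described in this last statement can be sketched as follows: concatenate a mention's contextual vector with each candidate antecedent's vector (plus their elementwise product, a common feature choice), score each pair with a two-layer feed-forward network, and normalize over the candidates together with a dummy "no antecedent" option. The dimensions, the pair features, and the zero-scored dummy are illustrative assumptions, not the cited implementations.

```python
# Hedged sketch of a mention-ranking scorer: a two-layer FFNN produces a
# distribution over candidate antecedents plus a dummy "no antecedent".
import torch
import torch.nn as nn


class PairScorer(nn.Module):
    def __init__(self, dim=768, hidden=150):
        super().__init__()
        # Input per pair: [mention; antecedent; elementwise product]
        self.ffnn = nn.Sequential(
            nn.Linear(3 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, mention, antecedents):
        # mention: (dim,)   antecedents: (n_candidates, dim)
        m = mention.expand_as(antecedents)
        pairs = torch.cat([m, antecedents, m * antecedents], dim=-1)
        scores = self.ffnn(pairs).squeeze(-1)  # (n_candidates,)
        dummy = torch.zeros(1)                 # fixed score 0 for "no antecedent"
        return torch.log_softmax(torch.cat([dummy, scores]), dim=0)


scorer = PairScorer()
mention = torch.randn(768)        # e.g. a pooled span vector from a pre-trained LM
candidates = torch.randn(5, 768)  # five earlier mentions in the document
print(scorer(mention, candidates))  # log-distribution over dummy + 5 antecedents
```

Training then maximizes the probability mass assigned to the gold antecedents (or to the dummy for non-anaphoric mentions), which is what "a probability distribution over all possible pairings" amounts to in practice.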