In pro-drop languages like Arabic, Chinese, Italian, Japanese, Spanish, and many others, unrealized (null) arguments in certain syntactic positions can refer to a previously introduced entity, and are thus called anaphoric zero pronouns. The existing resources for studying anaphoric zero pronoun interpretation are, however, still limited. In this paper, we use five data augmentation methods to generate and detect anaphoric zero pronouns automatically. We use the augmented data as additional training material for two anaphoric zero pronoun systems for Arabic. Our experimental results show that data augmentation improves the performance of the two systems, surpassing the state-of-the-art results.
Interpreting anaphoric references is a fundamental aspect of our language competence that has long attracted the attention of computational linguists. The appearance of ever-larger anaphorically annotated data sets covering more and more anaphoric phenomena in ever-greater detail has spurred the development of increasingly sophisticated computational models; as a result, the most recent state-of-the-art neural models are able to achieve impressive performance by leveraging linguistic, lexical, discourse, and encyclopedic information. This article provides a thorough survey of anaphora resolution (coreference) throughout this development, reviewing the available data sets and covering both the preneural history of the field and, in more detail, current neural models, including research on less-studied aspects of anaphoric interpretation such as bridging reference resolution and discourse deixis interpretation. Expected final online publication date for the Annual Review of Linguistics, Volume 9 is January 2023. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.