Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.681

Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation

Abstract: We tackle the task of building supervised event trigger identification models which can generalize better across domains. Our work leverages the adversarial domain adaptation (ADA) framework to introduce domain-invariance. ADA uses adversarial training to construct representations that are predictive for trigger identification, but not predictive of the example's domain. It requires no labeled data from the target domain, making it completely unsupervised. Experiments with two domains (English literature and n…
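For readers unfamiliar with ADA, the standard formulation (Ganin et al., 2016) trains a shared encoder with two heads: a task head trained normally, and a domain classifier whose gradients are reversed before reaching the encoder, so that minimizing the domain loss makes the encoder's features domain-invariant. The PyTorch sketch below illustrates this setup; the class and attribute names (ADATagger, trigger_head, domain_head) are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales by lambd) the
    gradient on the backward pass, so minimizing the domain loss trains
    the encoder to *confuse* the domain classifier."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class ADATagger(nn.Module):
    """Illustrative ADA model: shared encoder, trigger head, domain adversary."""
    def __init__(self, encoder, hidden_dim, num_tags, num_domains=2, lambd=1.0):
        super().__init__()
        self.encoder = encoder                    # e.g. a BERT token encoder
        self.trigger_head = nn.Linear(hidden_dim, num_tags)
        self.domain_head = nn.Linear(hidden_dim, num_domains)
        self.lambd = lambd

    def forward(self, inputs):
        h = self.encoder(inputs)                  # token representations
        trigger_logits = self.trigger_head(h)     # supervised trigger objective
        # The adversary sees gradient-reversed features: its loss is minimized
        # w.r.t. its own weights, but *maximized* w.r.t. the encoder.
        domain_logits = self.domain_head(GradientReversal.apply(h, self.lambd))
        return trigger_logits, domain_logits
```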

Cited by 21 publications (17 citation statements).
References 21 publications.
“…The MLP layer is 100-dimensional. These values are consistent with the setup in Naik and Rosé (2020). BERT-ADA: The domain predictor (adversary) is a 3-layer MLP with each layer having a dimensionality of 100 and ReLU activations between layers.…”
Section: Appendix (supporting citation)
confidence: 74%
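As a reading aid, the adversary described in this excerpt (a 3-layer MLP with 100-dimensional layers and ReLU activations between layers) could be written as follows. The 768-dimensional input (BERT-base hidden size) and the two-domain output are assumptions on my part, not stated in the excerpt.

```python
import torch.nn as nn

def make_domain_adversary(input_dim=768, hidden_dim=100, num_domains=2):
    """3-layer MLP domain predictor as described in the excerpt.
    input_dim=768 (BERT-base) and num_domains=2 are assumed values."""
    return nn.Sequential(
        nn.Linear(input_dim, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, num_domains),
    )
```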
“…We propose a new method (LIW) which relies on instance weighting via language model likelihood, and contrast it with adversarial domain adaptation (ADA) and domain adaptive fine-tuning (DAFT). These two techniques have shown promise on sequence labeling tasks (Gui et al., 2017; Han and Eisenstein, 2019; Naik and Rosé, 2020), and offer an interesting contrast between approaches that jointly perform alignment and task training (ADA) and approaches that perform these steps sequentially (DAFT). Comparing all three techniques also provides us the opportunity to study which methods adapt better to different kinds of shifts between source and target domains (e.g., shifts in vocabulary, syntax, etc.…”
Section: Unsupervised Domain Adaptation (mentioning citation)
confidence: 99%
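The LIW idea mentioned in this excerpt (weighting source instances by their likelihood under a target-domain language model) can be sketched as below. The choice of GPT-2 as a stand-in for a target-domain LM and the exp(-NLL) weighting are my assumptions, not the cited paper's exact recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in for a language model adapted to the target domain (assumption:
# the cited work's actual LM and weighting scheme may differ).
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

def target_likelihood_weight(sentence: str) -> float:
    """Per-token likelihood of the sentence under the target-domain LM,
    usable as a multiplicative weight on that instance's task loss."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids yields the mean token-level cross-entropy (NLL)
        nll = lm(input_ids=ids, labels=ids).loss
    return torch.exp(-nll).item()  # in (0, 1]; higher = more target-like

# Training would then rescale each source example's loss:
#   loss_i = target_likelihood_weight(x_i) * trigger_loss_i
```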
“…DANNs have been applied in many NLP tasks in the last few years, mainly to sentiment classification (e.g., Ganin et al. (2016), Li et al. (2018a), Shen et al. (2018), Rocha and Lopes Cardoso (2019), Ghoshal et al. (2020), to name a few), but recently to many other tasks as well: language identification (Li et al., 2018a), natural language inference (Rocha and Lopes Cardoso, 2019), POS tagging (Yasunaga et al., 2018), parsing (Sato et al., 2017), trigger identification (Naik and Rose, 2020), relation extraction (Fu et al., 2017; Rios et al., 2018), and other (binary) text classification tasks like relevancy identification (Alam et al., 2018a), machine reading comprehension, stance detection (Xu et al., 2019), and duplicate question detection (Shah et al., 2018). This makes DANNs the most widely used UDA approach in NLP, as illustrated in Table 1.…”
Section: Domain Adversaries (mentioning citation)
confidence: 99%
“…Compared to such prior work, this paper presents two novel approaches to improve the language generalization of representation vectors based on multi-view alignment and OT. Finally, our work involves LANN that bears some similarity with DANN models in domain adaptation research of machine learning (Ganin et al., 2016; Bousmalis et al., 2016; Fu et al., 2017; Naik and Rose, 2020). Compared to such work, our work explores a new dimension of adversarial networks for language-invariant representation learning for texts in ECR.…”
Section: Related Work (mentioning citation)
confidence: 99%