Huda Khayrallah scite author profile

Continued training is an effective method for domain adaptation in neural machine translation. However, in-domain gains from adaptation come at the expense of general-domain performance. In this work, we interpret the drop in general-domain performance as catastrophic forgetting of general-domain knowledge. To mitigate it, we adapt Elastic Weight Consolidation (EWC), a machine learning method for learning a new task without forgetting previous tasks. Our method retains the majority of general-domain performance lost in continued training without degrading in-domain performance, outperforming the previous state-of-the-art. We also explore the full range of general-domain performance available when some in-domain degradation is acceptable.
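
As a rough illustration of the idea behind EWC mentioned in this abstract, the sketch below shows the quadratic penalty used in the standard EWC formulation: each parameter is pulled back toward its general-domain value, weighted by a diagonal Fisher information estimate. This is a minimal sketch, not the paper's implementation; the names `ewc_penalty`, `ewc_loss`, `theta_general`, `fisher_diag`, and `lam` are hypothetical, and the numbers are made up for the example.

```python
import numpy as np

def ewc_penalty(theta, theta_general, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_general_i)^2.

    theta          -- current parameters during in-domain training
    theta_general  -- parameters after general-domain training (held fixed)
    fisher_diag    -- diagonal Fisher information estimate, one value per parameter
    lam            -- regularization strength (hypothetical hyperparameter name)
    """
    diff = theta - theta_general
    return 0.5 * lam * float(np.sum(fisher_diag * diff ** 2))

def ewc_loss(in_domain_loss, theta, theta_general, fisher_diag, lam):
    """Total objective: the in-domain loss plus the EWC penalty."""
    return in_domain_loss + ewc_penalty(theta, theta_general, fisher_diag, lam)

# Toy values (assumptions for illustration only).
theta_general = np.array([1.0, -2.0, 0.5])
fisher_diag = np.array([10.0, 0.1, 1.0])   # first parameter matters most
theta = np.array([1.5, -1.0, 0.5])         # parameters after some adaptation

penalty = ewc_penalty(theta, theta_general, fisher_diag, lam=2.0)
total = ewc_loss(3.0, theta, theta_general, fisher_diag, lam=2.0)
```

In this toy setup, moving the first parameter by 0.5 costs far more than moving the second by 1.0, because its Fisher weight is larger. Varying `lam` trades in-domain gains against retained general-domain performance.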

Deep Generalized Canonical Correlation Analysis

Benton, Khayrallah, Gujral, et al. 2019

We present Deep Generalized Canonical Correlation Analysis (DGCCA), a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other. While methods for nonlinear two-view representation learning (Deep CCA; Andrew et al., 2013) and linear many-view representation learning (Generalized CCA; Horst, 1961) exist, DGCCA is the first CCA-style multiview representation learning technique that combines the flexibility of nonlinear (deep) representation learning with the statistical power of incorporating information from many independent sources, or views. We present the DGCCA formulation as well as an efficient stochastic optimization algorithm for solving it. We learn DGCCA representations on two distinct datasets for three downstream tasks: phonetic transcription from acoustic and articulatory measurements, and recommending hashtags and friends on a dataset of Twitter users. We find that DGCCA representations soundly beat existing methods at phonetic transcription and hashtag recommendation, and in general perform no worse than standard linear many-view techniques.

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

Hu, Khayrallah, Culkin, et al. 2019

Lexically constrained sequence decoding allows explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting. We describe vectorized dynamic beam allocation, which extends work in lexically constrained decoding to work with batching, leading to a five-fold improvement in throughput when working with positive constraints. Faster decoding enables faster exploration of constraint strategies: we illustrate this via data augmentation experiments with a monolingual rewriter applied to the tasks of natural language inference, question answering, and machine translation, showing improvements in all three.

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

Koehn, Khayrallah, Heafield, et al. 2018

Huda Khayrallah

On the Impact of Various Types of Noise on Neural Machine Translation

Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation
