Asa Cooper Stickland scite author profile

Asa Cooper Stickland

5 Publications

68 Citation Statements Received

62 Citation Statements Given

Affiliations

China Datang Corporation (China), University of Edinburgh

Publications

Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation

Stickland, Ghazvininejad. 2021.

There has been recent success in pre-training on monolingual data and fine-tuning on Machine Translation (MT), but it remains unclear how to best leverage a pre-trained model for a given MT task. This paper investigates the benefits and drawbacks of freezing parameters, and adding new ones, when fine-tuning a pre-trained model on MT. We focus on (1) fine-tuning BART, a model trained only on English monolingual data, and (2) fine-tuning mBART, a model trained on monolingual data from 25 languages. For BART we get the best performance by freezing most of the model parameters and adding extra positional embeddings. For mBART we match or outperform the performance of naive fine-tuning for most language pairs with the encoder, and most of the decoder, frozen. The encoder-decoder attention parameters are the most important to fine-tune. When constraining ourselves to an out-of-domain training set for Vietnamese to English, we see the largest improvements over the fine-tuning baseline.

BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

Stickland, Murray. 2019. Preprint.

Multi-task learning shares information between related tasks, sometimes reducing the number of parameters required. State-of-the-art results across multiple natural language understanding tasks in the GLUE benchmark have previously used transfer from a single large task: unsupervised pre-training with BERT, where a separate BERT model was fine-tuned for each task. We explore multi-task approaches that share a single BERT model with a small number of additional task-specific parameters. Using new adaptation modules, PALs or 'projected attention layers', we match the performance of separately fine-tuned models on the GLUE benchmark with ≈7 times fewer parameters, and obtain state-of-the-art results on the Recognizing Textual Entailment dataset.

Diverse Ensembles Improve Calibration

Stickland, Murray. 2020. Preprint.

Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched. Training an ensemble of models and averaging their predictions can help alleviate these issues. We propose a simple technique to improve calibration, using a different data augmentation for each ensemble member. We additionally use the idea of 'mixing' un-augmented and augmented inputs to improve calibration when test and training distributions are the same. These simple techniques improve calibration and accuracy over strong baselines on the CIFAR10 and CIFAR100 benchmarks, and out-of-domain data from their corrupted versions.
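The prediction-averaging step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each ensemble member outputs a class-probability vector for the same input; the function name `average_predictions` and the example numbers are hypothetical, and the per-member data augmentation and input 'mixing' described in the abstract are not shown.

```python
from typing import List


def average_predictions(member_probs: List[List[float]]) -> List[float]:
    """Average class-probability vectors from several ensemble members.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_classes = len(member_probs[0])
    if any(len(p) != n_classes for p in member_probs):
        raise ValueError("all members must predict the same number of classes")
    n_members = len(member_probs)
    # Element-wise mean over members, one value per class.
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


# Made-up outputs of three members for a single two-class input.
members = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
ensemble = average_predictions(members)
print(ensemble)  # roughly [0.6, 0.4], less extreme than the most confident member
```

Averaging pulls the ensemble's confidence toward the members' consensus, which is one intuition for why ensembles tend to be better calibrated than a single overconfident model.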

Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters

Stickland, Bérard, Nikoulina. 2021. Preprint.

Robustification of Multilingual Language Models to Real-world Noise in Crosslingual Zero-shot Settings with Robust Contrastive Pretraining

Stickland, Sengupta, Krone, et al. 2022. Preprint.
