Kevin Duh scite author profile

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval

Methods of deep neural networks (DNNs) have recently demonstrated superior performance on a number of natural language processing tasks. However, in most previous work, the models are learned based on either unsupervised objectives, which do not directly optimize the desired task, or single-task supervised objectives, which often suffer from insufficient training data. We develop a multi-task DNN for learning representations across multiple tasks, not only leveraging large amounts of cross-task data, but also benefiting from a regularization effect that leads to more general representations to help tasks in new domains. Our multi-task DNN approach combines tasks of multiple-domain classification (for query classification) and information retrieval (ranking for web search), and demonstrates significant gains over strong baselines in a comprehensive set of domain adaptation.
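The idea of one shared representation feeding several task-specific outputs can be illustrated with a small forward pass. This is only a sketch under stated assumptions: the class name `MultiTaskNet`, the layer sizes, the weight values, and the use of cosine similarity for the ranking head are all hypothetical choices for illustration, not details given in the abstract, and training is not shown.

```python
import math

def linear(x, weights, bias):
    """Return W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

class MultiTaskNet:
    """Toy forward pass: one shared layer, two task-specific heads."""

    def __init__(self, shared, cls_head, rank_head):
        # Each argument is a (weights, bias) pair; all values are illustrative.
        self.shared, self.cls_head, self.rank_head = shared, cls_head, rank_head

    def represent(self, x):
        # Shared layer: both heads read this same representation.
        return [math.tanh(h) for h in linear(x, *self.shared)]

    def classify(self, query):
        # Classification head: class probabilities for a query.
        return softmax(linear(self.represent(query), *self.cls_head))

    def rank_score(self, query, doc):
        # Ranking head (assumed form): cosine of projected query and document.
        q = linear(self.represent(query), *self.rank_head)
        d = linear(self.represent(doc), *self.rank_head)
        return cosine(q, d)

net = MultiTaskNet(
    shared=([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    cls_head=([[1.0, -1.0], [-0.5, 0.7]], [0.0, 0.0]),
    rank_head=([[0.6, 0.4], [-0.3, 0.9]], [0.0, 0.0]),
)
probs = net.classify([1.0, 0.0, 2.0])
score = net.rank_score([1.0, 0.0, 2.0], [1.0, 0.0, 2.0])
```

Because both heads call `represent`, any update to the shared layer would affect both tasks, which is the mechanism behind the cross-task data sharing and regularization effect the abstract describes.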

AMR Parsing as Sequence-to-Graph Transduction

Zhang, Ma, Duh et al. 2019

108

134

We propose an attention-based model that treats AMR parsing as sequence-to-graph transduction. Unlike most AMR parsers that rely on pre-trained aligners, external semantic resources, or data augmentation, our proposed parser is aligner-free, and it can be effectively trained with limited amounts of labeled AMR data. Our experimental results outperform all previously reported SMATCH scores, on both AMR 2.0 (76.3% F1 on LDC2017T10) and AMR 1.0 (70.2% F1 on LDC2014T12).

Another View of Reentrancy

AMR is a rooted, directed, and usually acyclic graph where nodes represent concepts, and labeled directed edges represent the relationships between them (see Figure 1 for an AMR example). The reason for AMR being a graph instead of a tree is that it allows reentrant semantic relations. For instance, in Figure 1(a) "victim" is both ARG0 and

arXiv:1905.08704v2 [cs.CL]

ESPnet-ST: All-in-One Speech Translation Toolkit

Inaguma, Kiyono, Duh et al. 2020

117

112

We present ESPnet-ST, which is designed for the quick development of speech-to-speech translation systems in a single framework. ESPnet-ST is a new project inside the end-to-end speech processing toolkit ESPnet, which integrates or newly implements automatic speech recognition, machine translation, and text-to-speech functions for speech translation. We provide all-in-one recipes including data pre-processing, feature extraction, training, and decoding pipelines for a wide range of benchmark datasets. Our reproducible results can match or even outperform the current state-of-the-art performances; these pretrained models are downloadable. The toolkit is publicly available at https://github.com/espnet/espnet.

Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning

Gordon, Duh, Andrews 2020

184

107

Pre-trained feature extractors, such as BERT for natural language processing and VGG for computer vision, have become effective methods for improving deep learning models without requiring more labeled data. While effective, these feature extractors may be prohibitively large for some deployment scenarios. We explore weight pruning for BERT and ask: how does compression during pre-training affect transfer learning? We find that pruning affects transfer learning in three broad regimes. Low levels of pruning (30-40%) do not affect pre-training loss or transfer to downstream tasks at all. Medium levels of pruning increase the pre-training loss and prevent useful pre-training information from being transferred to downstream tasks. High levels of pruning additionally prevent models from fitting downstream datasets, leading to further degradation. Finally, we observe that fine-tuning BERT on a specific task does not improve its prunability. We conclude that BERT can be pruned once during pre-training rather than separately for each task without affecting performance.
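To make the pruning levels concrete, the sketch below zeroes out a chosen fraction of a weight matrix. It assumes magnitude pruning (removing the entries with the smallest absolute values), which is one common criterion; the abstract does not say which criterion is used. The function names and the small matrix are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    weights: list of rows of floats (a hypothetical weight matrix).
    sparsity: fraction in [0, 1] of entries to set to zero.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Rank every entry by absolute value, remembering its position.
    ranked = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    k = int(len(ranked) * sparsity)  # number of entries to prune
    pruned = [list(row) for row in weights]  # leave the input untouched
    for _, i, j in ranked[:k]:
        pruned[i][j] = 0.0
    return pruned

def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total

W = [[0.9, -0.05, 0.3], [-0.7, 0.01, 0.2]]
W_pruned = magnitude_prune(W, 0.5)
```

Here `magnitude_prune(W, 0.5)` removes the three smallest-magnitude entries (0.01, -0.05, 0.2) and keeps the rest unchanged; a 30-40% setting on a real model would correspond to the low-pruning regime mentioned in the abstract.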

Ordinal Common-sense Inference

Zhang, Rudinger, Duh et al. 2017

TACL

104

Humans have the capacity to draw commonsense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval
