Andrew Drozdov scite author profile

We introduce deep inside-outside recursive autoencoders (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree. Our approach predicts each word in an input sentence conditioned on the rest of the sentence and uses inside-outside dynamic programming to consider all possible binary trees over the sentence. At test time the CKY algorithm extracts the highest scoring parse. DIORA achieves a new state-of-the-art F1 in unsupervised binary constituency parsing (unlabeled) in two benchmark datasets, WSJ and MultiNLI.

show abstract

Do latent tree learning models identify meaningful structure in sentences?

Williams

Drozdov

Bowman

2018

TACL

124

View full text Add to dashboard Cite

Recent work on the problem of latent tree learning has made it possible to train neural networks that learn to both parse a sentence and use the resulting parse to interpret the sentence, all without exposure to groundtruth parse trees at training time. Surprisingly, these models often perform better at sentence understanding tasks than models that use parse trees from conventional parsers. This paper aims to investigate what these latent tree learning models learn. We replicate two such models in a shared codebase and find that (i) only one of these models outperforms conventional tree-structured models on sentence classification, (ii) its parsing strategies are not especially consistent across random restarts, (iii) the parses it produces tend to be shallower than standard Penn Treebank (PTB) parses, and (iv) they do not resemble those of PTB or any other semantic or syntactic formalism that the authors are aware of.

show abstract

Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

Drozdov¹,

Rongali²,

Chen³

et al. 2020

View full text Add to dashboard Cite

The deep inside-outside recursive autoencoder (DIORA;Drozdov et al. 2019a) is a selfsupervised neural model that learns to induce syntactic tree structures for input sentences without access to labeled training data. In this paper, we discover that while DIORA exhaustively encodes all possible binary trees of a sentence with a soft dynamic program, its vector averaging approach is locally greedy and cannot recover from errors when computing the highest scoring parse tree in bottom-up chart parsing. To fix this issue, we introduce S-DIORA, an improved variant of DIORA that encodes a single tree rather than a softlyweighted mixture of trees by employing a hard argmax operation and a beam at each cell in the chart. Our experiments show that through fine-tuning a pre-trained DIORA with our new algorithm, we improve the state of the art in unsupervised constituency parsing on the English WSJ Penn Treebank by 2.2 6% F1, depending on the data used for fine-tuning.

show abstract

Unsupervised Labeled Parsing with Deep Inside-Outside Recursive Autoencoders

Drozdov

Verga

Chen³

et al. 2019

View full text Add to dashboard Cite

Understanding text often requires identifying meaningful constituent spans such as noun phrases and verb phrases. In this work, we show that we can effectively recover these types of labels using the learned phrase vectors from deep inside-outside recursive autoencoders (DIORA). Specifically, we cluster span representations to induce span labels. Additionally, we improve the model's labeling accuracy by integrating latent code learning into the training procedure. We evaluate this approach empirically through unsupervised labeled constituency parsing. Our method outperforms ELMo and BERT on two versions of the Wall Street Journal (WSJ) dataset and is competitive to prior work that requires additional human annotations, improving over a previous state-of-the-art system that depends on ground-truth part-of-speech tags by 5 absolute F1 points (19% relative error reduction).

show abstract

Emergent Communication in a Multi-Modal, Multi-Step Referential Game

Evtimova

Drozdov

Kiela

et al. 2017

Preprint

View full text Add to dashboard Cite

Inspired by previous work on emergent communication in referential games, we propose a novel multi-modal, multi-step referential game, where the sender and receiver have access to distinct modalities of an object, and their information exchange is bidirectional and of arbitrary duration. The multi-modal multi-step setting allows agents to develop an internal communication significantly closer to natural language, in that they share a single set of messages, and that the length of the conversation may vary according to the difficulty of the task. We examine these properties empirically using a dataset consisting of images and textual descriptions of mammals, where the agents are tasked with identifying the correct object. Our experiments indicate that a robust and efficient communication protocol emerges, where gradual information exchange informs better predictions and higher communication bandwidth improves generalization. INTRODUCTIONRecently, there has been a surge of work on neural network-based multi-agent systems that are capable of communicating with each other in order to solve a problem. Two distinct lines of research can be discerned. In the first one, communication is used as an essential tool for sharing information among multiple active agents in a reinforcement learning scenario (

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Andrew Drozdov

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders

Do latent tree learning models identify meaningful structure in sentences?

Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

Unsupervised Labeled Parsing with Deep Inside-Outside Recursive Autoencoders

Emergent Communication in a Multi-Modal, Multi-Step Referential Game

Contact Info

Product

Resources

About