Yangfeng Ji scite author profile

We present a novel response generation system that can be trained end to end on large quantities of unstructured Twitter conversations. A neural network architecture is used to address sparsity issues that arise when integrating contextual information into classic statistical models, allowing the system to take into account previous dialog utterances. Our dynamic-context generative models show consistent gains over both context-sensitive and non-context-sensitive Machine Translation and Information Retrieval baselines.


Representation Learning for Text-level Discourse Parsing

Ji, Eisenstein, 2014

214

255


Text-level discourse parsing is notoriously difficult, as distinctions between discourse relations require subtle semantic judgments that are not easily captured using standard features. In this paper, we present a representation learning approach, in which we transform surface features into a latent space that facilitates RST discourse parsing. By combining the machinery of large-margin transition-based structured prediction with representation learning, our method jointly learns to parse discourse while at the same time learning a discourse-driven projection of surface features. The resulting shift-reduce discourse parser obtains substantial improvements over the previous state-of-the-art in predicting relations and nuclearity on the RST Treebank.


A Neural Network Approach to Context-Sensitive Generation of Conversational Responses

Sordoni, Galley, Auli, et al. 2015

Preprint

127


Neural Discourse Structure for Text Categorization

Ji, Smith, 2017

122


We show that discourse structure, as defined by Rhetorical Structure Theory and provided by an existing discourse parser, benefits text categorization. Our approach uses a recursive neural network and a newly proposed attention mechanism to compute a representation of the text that focuses on salient content, from the perspective of both RST and the task. Experiments consider variants of the approach and illustrate its strengths and weaknesses.


Better Document-level Sentiment Analysis from RST Discourse Parsing

Bhatia, Ji, Eisenstein, 2015

126

123


Discourse parsing is an integral part of understanding information flow and argumentative structure in documents. Most previous research has focused on inducing and evaluating models from the English RST Discourse Treebank. However, discourse treebanks for other languages exist, including Spanish, German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same underlying linguistic theory, but differ slightly in the way documents are annotated. In this paper, we present (a) a new discourse parser which is simpler, yet competitive with (and significantly better on 2/3 metrics than) the state of the art for English, (b) a harmonization of discourse treebanks across languages, enabling us to present (c) what to the best of our knowledge are the first experiments on cross-lingual discourse parsing.


A Latent Variable Recurrent Neural Network for Discourse-Driven Language Models

Ji, Haffari, Eisenstein, 2016

107

117


This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a discourse-informed language model, which improves over a strong LSTM baseline.


One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations

Eisenstein

2015

TACL

113


Discourse relations bind smaller linguistic units into coherent texts. However, automatically identifying discourse relations is difficult, because it requires understanding the semantics of the linked arguments. A more subtle challenge is that it is not enough to represent the meaning of each argument of a discourse relation, because the relation may depend on links between lower-level components, such as entity mentions. Our solution computes distributional meaning representations by composition up the syntactic parse tree. A key difference from previous work on compositional distributional semantics is that we also compute representations for entity mentions, using a novel downward compositional pass. Discourse relations are predicted from the distributional representations of the arguments, and also of their coreferent entity mentions. The resulting system obtains substantial improvements over the previous state-of-the-art in predicting implicit discourse relations in the Penn Discourse Treebank.


Dynamic Entity Representations in Neural Language Models

Tan, Martschat, et al. 2017

105


Understanding a long document requires tracking how entities are introduced and evolve over time. We present a new type of language model, ENTITYNLM, that can explicitly model entities, dynamically update their representations, and contextually generate their mentions. Our model is generative and flexible; it can model an arbitrary number of entities in context while generating each entity mention at an arbitrary length. In addition, it can be used for several different tasks such as language modeling, coreference resolution, and entity prediction. Experimental results with all these tasks demonstrate that our model consistently outperforms strong baselines and prior work.


