Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/p17-1092

Neural Discourse Structure for Text Categorization

Abstract: We show that discourse structure, as defined by Rhetorical Structure Theory and provided by an existing discourse parser, benefits text categorization. Our approach uses a recursive neural network and a newly proposed attention mechanism to compute a representation of the text that focuses on salient content, from the perspective of both RST and the task. Experiments consider variants of the approach and illustrate its strengths and weaknesses.
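The abstract describes computing a document representation with a recursive neural network over the RST discourse tree plus an attention mechanism. As a rough illustration of that idea (not the authors' actual architecture: the composition matrix W, scoring vector v, and random EDU encodings below are all hypothetical), a binary discourse tree can be composed bottom-up and attention applied over the resulting node vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
# Hypothetical parameters: a composition matrix and an attention scoring vector.
W = rng.standard_normal((DIM, 2 * DIM)) * 0.1
v = rng.standard_normal(DIM) * 0.1

def compose(tree, edu_vecs, nodes):
    """Bottom-up composition over a binary RST tree.

    tree: an int (index of an EDU leaf) or a (left, right) tuple.
    Appends every node's vector to `nodes` and returns the root vector.
    """
    if isinstance(tree, int):
        h = edu_vecs[tree]
    else:
        left = compose(tree[0], edu_vecs, nodes)
        right = compose(tree[1], edu_vecs, nodes)
        h = np.tanh(W @ np.concatenate([left, right]))
    nodes.append(h)
    return h

def attend(nodes):
    """Softmax attention over all tree nodes -> single document vector."""
    H = np.stack(nodes)
    scores = H @ v
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ H

edu_vecs = rng.standard_normal((3, DIM))  # three EDU encodings (random stand-ins)
nodes = []
compose(((0, 1), 2), edu_vecs, nodes)     # tree shape: ((EDU0, EDU1), EDU2)
doc_vec = attend(nodes)
```

The document vector `doc_vec` would then feed a task-specific classifier; the paper's contribution lies in how the attention weights reflect both RST salience and the task, which this sketch does not model.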

Cited by 97 publications (126 citation statements)
References 31 publications (48 reference statements)
“…Only a few works have attempted to parse discourse relations for out-of-domain problems such as text categorization on social media texts; Ji and Bhatia used models pretrained on the RST DT to build discourse structures for movie reviews, and Son adapted the PDTB discourse relation parsing approach to capture counterfactual conditionals in tweets (Bhatia et al., 2015; Ji and Smith, 2017; …). These works differ substantially from what we propose in this paper.…”
Section: Related Work
confidence: 97%
“…Researchers who instead tried to build end-to-end parsing pipelines considered a wider range of approaches, including sequence models and RNNs (Biran and McKeown, 2015; Feng and Hirst, 2014; Ji and Eisenstein, 2014; Li et al., 2014). In particular, when they tried to utilize discourse structures for out-of-domain applications, they used RNN-based models and found them advantageous for their downstream tasks (Bhatia et al., 2015; Ji and Smith, 2017).…”
Section: Related Work
confidence: 99%
“…Weighted sentences can be aggregated using average pooling or max pooling. Prior work (Jiang et al., 2016; Ji and Smith, 2017) has explored some of these combinations, but not all of them. In the ablation experiments, we try all combinations and find that the (sigmoid, max pooling) attention gives the best result.…”
Section: Bag Encoder Architecture
confidence: 99%
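The quoted statement compares combinations of attention scoring and pooling. A minimal sketch of the (sigmoid, max pooling) combination it describes, assuming sentence encodings are already available and using a hypothetical scoring vector w:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_pool(sentence_vecs, w, mode="max"):
    """Weight each sentence vector by a sigmoid attention score,
    then aggregate with max or average pooling.

    sentence_vecs: (n_sentences, dim) array of sentence encodings.
    w: (dim,) scoring vector (a learned parameter; hypothetical here).
    """
    scores = sigmoid(sentence_vecs @ w)           # one score in (0, 1) per sentence
    weighted = sentence_vecs * scores[:, None]    # scale each sentence vector
    if mode == "max":
        return weighted.max(axis=0)               # element-wise max pooling
    return weighted.mean(axis=0)                  # average pooling

# Toy example: three 4-dimensional sentence vectors.
vecs = np.array([[0.1, 0.2, 0.3, 0.4],
                 [0.5, 0.1, 0.0, 0.2],
                 [0.3, 0.3, 0.3, 0.3]])
w = np.ones(4)
pooled = attention_pool(vecs, w, mode="max")
```

Swapping `mode` between "max" and "mean" reproduces the two aggregation choices compared in the ablation; softmax attention would replace the sigmoid with a normalized score distribution.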
“…Understanding a document's discourse-level organization is important for correctly interpreting it, and discourse analyses have been shown to be helpful for several NLP tasks (Bhatia et al, 2015;Ji and Smith, 2017;Feng and Hirst, 2014b;Ferracane et al, 2017). A popular formalism for discourse analysis is Rhetorical Structure Theory (RST) (Mann and Thompson, 1988) (Fig.…”
Section: Introduction
confidence: 99%