Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016
DOI: 10.18653/v1/n16-1037
A Latent Variable Recurrent Neural Network for Discourse-Driven Language Models

Abstract: This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective t…
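The architecture the abstract describes, a word-level recurrent generator conditioned on a discrete discourse-relation variable that can be marginalized out when the relation is unobserved, can be illustrated with a short sketch. The code below is an assumption-laden illustration, not the authors' released implementation: the class and parameter names (DiscourseRelationLM, num_relations, utt context vector, and so on) are hypothetical, and conditioning the generator by adding a relation embedding to each word embedding is one simple choice among several.

# Hedged sketch (not the paper's code): an LSTM language model in which each
# sentence is generated conditioned on a discrete discourse relation z. When z is
# unobserved, the per-sentence log-likelihood marginalizes over it:
#   log p(w_1..T | c) = logsumexp_z [ log p(z | c) + sum_t log p(w_t | w_<t, c, z) ]
# Shapes (assumed): words is a (batch, T) LongTensor of word ids including
# start/end tokens; context is a (batch, hid_dim) summary of the previous sentence.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscourseRelationLM(nn.Module):
    def __init__(self, vocab_size, num_relations, emb_dim=64, hid_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.rel_emb = nn.Embedding(num_relations, emb_dim)   # one embedding per relation
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.word_out = nn.Linear(hid_dim, vocab_size)         # next-word distribution
        self.rel_prior = nn.Linear(hid_dim, num_relations)     # p(z | previous-sentence context)
        self.num_relations = num_relations

    def sentence_logprob(self, words, context, relation):
        """log p(words | context, relation) for one batch of relation ids."""
        # Add the relation embedding to every input word embedding (an assumed,
        # simple way of conditioning the generator on the discourse relation).
        emb = self.word_emb(words[:, :-1]) + self.rel_emb(relation).unsqueeze(1)
        h0 = (context.unsqueeze(0), torch.zeros_like(context).unsqueeze(0))
        out, _ = self.lstm(emb, h0)
        logp = F.log_softmax(self.word_out(out), dim=-1)
        tgt = words[:, 1:]
        return logp.gather(-1, tgt.unsqueeze(-1)).squeeze(-1).sum(dim=1)

    def marginal_logprob(self, words, context):
        """log p(words | context), marginalizing the latent discourse relation."""
        prior = F.log_softmax(self.rel_prior(context), dim=-1)  # log p(z | context)
        terms = []
        for z in range(self.num_relations):
            rel = torch.full((words.size(0),), z, dtype=torch.long, device=words.device)
            terms.append(prior[:, z] + self.sentence_logprob(words, context, rel))
        return torch.logsumexp(torch.stack(terms, dim=-1), dim=-1)

When the discourse relation is annotated rather than latent, the same components support the "predicted" mode the abstract mentions: train on sentence_logprob for the observed relation plus the log-prior term, and at test time take the relation that maximizes the prior-plus-likelihood score.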

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
2

Citation Types

0
117
0

Year Published

2017
2017
2022
2022

Publication Types

Select...
5
3

Relationship

3
5

Authors

Journals

citations
Cited by 107 publications
(117 citation statements)
references
References 30 publications
(33 reference statements)
0
117
0
Order By: Relevance
“…The Penn Tree Bank corpus provides discourse relation annotation between the spans of text. We used the preprocessed data by Ji et al (2016b), where the explicit discourse relations are mapped into a dummy relation. Our data splits are the same as those described in the baselines (Ji et al, 2016a,b).…”
Section: Contextual Language Model (mentioning)
confidence: 99%
“…We compare our system with the Recurrent Neural Net (RNNLM) with LSTM unit (Ji et al, 2016a), the Document Contextual Language Model (DCLM) (Ji et al, 2016a) and the Discourse Relation Language Model (DRLM) (Ji et al, 2016b). The RNNLM's architecture is the same as that described in (Mikolov et al, 2013) with sigmoid non-linearity replaced by LSTM.…”
Section: Contextual Language Model (mentioning)
confidence: 99%
“…Some works in DA classification treat each utterance as an independent instance (Julia et al, 2010; Gambäck et al, 2011), which leads to ignoring important long-range dependencies in the dialogue history. Other works have captured inter-utterance relationships using models such as Hidden Markov Models (HMMs) (Stolcke et al, 2000; Surendran and Levow, 2006) or Recurrent Neural Networks (RNNs) (Kalchbrenner and Blunsom, 2013; Ji et al, 2016), where RNNs have been particularly successful.…”
Section: Introduction (mentioning)
confidence: 99%
“…There have been many works on DA classification applied to these two datasets; some focus on textual data (Kalchbrenner and Blunsom, 2013; Stolcke et al, 2000), while others explore speech data (Julia et al, 2010). The classification methods used can be broadly divided into instance-based methods (Julia et al, 2010; Gambäck et al, 2011) and sequence-labeling methods (Stolcke et al, 2000; Kalchbrenner and Blunsom, 2013; Ji et al, 2016; Shen and Lee, 2016; Tran et al, 2017). Instance-based methods treat each utterance as an independent data point, which allows the application of general machine learning models, such as Support Vector Machines.…”
Section: Introduction (mentioning)
confidence: 99%
“…Instance-based methods treat each utterance as an independent data point, which allows the application of general machine learning models, such as Support Vector Machines. Sequence-labeling methods include methods based on Hidden Markov Models (HMMs) (Stolcke et al, 2000) and neural networks (Kalchbrenner and Blunsom, 2013; Ji et al, 2016; Shen and Lee, 2016; Tran et al, 2017).…”
Section: Introduction (mentioning)
confidence: 99%
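As a companion to the citation statements above that contrast dialogue-act (DA) classification approaches, here is a hedged toy sketch of the structural difference between instance-based and sequence-labeling taggers. It is an illustration under assumed names and dimensions (InstanceBasedTagger, SequenceLabelingTagger, utt_dim), not a reimplementation of any cited system; utterances are assumed to already be encoded as fixed-size vectors.

# Hedged toy sketch contrasting the two families of DA classifiers named above;
# module names and dimensions are illustrative assumptions, not any cited system.
import torch.nn as nn

class InstanceBasedTagger(nn.Module):
    """Scores each utterance independently, ignoring the dialogue history."""
    def __init__(self, utt_dim, num_acts):
        super().__init__()
        self.clf = nn.Linear(utt_dim, num_acts)

    def forward(self, utterances):            # (batch, n_utts, utt_dim)
        return self.clf(utterances)           # (batch, n_utts, num_acts)

class SequenceLabelingTagger(nn.Module):
    """Runs an RNN over the dialogue so each tag sees the preceding utterances."""
    def __init__(self, utt_dim, hid_dim, num_acts):
        super().__init__()
        self.rnn = nn.LSTM(utt_dim, hid_dim, batch_first=True)
        self.clf = nn.Linear(hid_dim, num_acts)

    def forward(self, utterances):            # (batch, n_utts, utt_dim)
        states, _ = self.rnn(utterances)
        return self.clf(states)               # (batch, n_utts, num_acts)

The sequence labeler's per-utterance scores depend on earlier utterances through the LSTM state, which is exactly the long-range dependency the first quoted statement says instance-based methods ignore.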