Recent Advances on Neural Headline Generation

Ayana; Shen, Suhung; Lin, Yankai; Tu, Cunchao; Yu, Zhiwu; Liu, Zhiyuan; Sun, Maosong

doi:10.1007/s11390-017-1758-3

Cited by 41 publications

(15 citation statements)

References 25 publications

“…More recently, our approach has been successfully applied to summarization (Ayana et al, 2016). They optimize neural networks for headline generation with respect to ROUGE (Lin, 2004) and also achieve significant improvements, confirming the effectiveness and applicability of our approach.…”

Section: Related Work (supporting)

confidence: 66%

Minimum Risk Training for Neural Machine Translation

Shen, Cheng, He et al. 2016

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

332

340

We propose minimum risk training for end-to-end neural machine translation. Unlike conventional maximum likelihood estimation, minimum risk training is capable of optimizing model parameters directly with respect to arbitrary evaluation metrics, which are not necessarily differentiable. Experiments show that our approach achieves significant improvements over maximum likelihood estimation on a state-of-the-art neural machine translation system across various languages pairs. Transparent to architectures, our approach can be applied to more neural networks and potentially benefit more NLP tasks.
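The objective described in this abstract can be sketched as an expected cost over a small set of sampled candidate outputs. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `mrt_risk`, the sharpness parameter `alpha`, and the idea that the caller supplies per-candidate costs (for example, 1 minus a metric such as BLEU or ROUGE) are all choices made here for illustration.

```python
import math


def mrt_risk(log_probs, costs, alpha=1.0):
    """Expected cost of sampled candidates under a renormalized distribution.

    log_probs: model log-probabilities of each sampled candidate (assumed input).
    costs:     per-candidate cost, e.g. 1 - metric score; it does not need to
               be differentiable because it only acts as a fixed weight.
    alpha:     hypothetical sharpness parameter applied before renormalizing.
    """
    if len(log_probs) != len(costs) or not log_probs:
        raise ValueError("log_probs and costs must be non-empty and aligned")

    # Scale the log-probabilities, then renormalize over the sampled subset
    # with a numerically stable softmax.
    scaled = [alpha * lp for lp in log_probs]
    shift = max(scaled)
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    q = [w / total for w in weights]

    # The risk is the cost averaged under the renormalized distribution.
    return sum(qi * ci for qi, ci in zip(q, costs))
```

In a real training loop the log-probabilities would come from an automatic-differentiation framework, so that gradients of the risk flow back to the model parameters while the costs stay constant.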

“…Optimization methods for optimizing a model with respect to evaluation scores, such as reinforcement learning (Ranzato et al, 2015; Paulus et al, 2018; Chen and Bansal, 2018; Wu and Hu, 2018) and minimum risk training (Ayana et al, 2017), have been proposed for summarization models based on neural encoder-decoders. Our method is similar to that of Ayana et al (2017) in terms of applying MRT to neural encoder-decoders. There are two differences between our method and Ayana et al's: (i) our method uses only the part of the summary generated by a model within the length constraint for calculating the ROUGE score and (ii) it penalizes summaries that exceed the length of the reference regardless of its ROUGE score.…”

Section: Related Work (mentioning)

confidence: 99%

“…MRT (Och, 2003) is used to optimize a model globally for an arbitrary evaluation metric. It was also applied for optimizing the neural summarization model for headline generation with respect to ROUGE (Ayana et al, 2017), which is based on an overlap of words with reference summaries (Lin, 2004). However, how to use MRT under a length constraint was an open problem; thus we propose a global optimization under length constraint (GOLC) for neural summarization models.…”

Section: Introduction (mentioning)

confidence: 99%

Global Optimization under Length Constraint for Neural Text Summarization

Makino, Iwakura, Takamura et al. 2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose a global optimization method under length constraint (GOLC) for neural text summarization models. GOLC increases the probabilities of generating summaries that have high evaluation scores, ROUGE in this paper, within a desired length. We compared GOLC with two optimization methods, a maximum log-likelihood and a minimum risk training, on CNN/Daily Mail and a Japanese single document summarization data set of The Mainichi Shimbun Newspapers. The experimental results show that a state-of-the-art neural summarization model optimized with GOLC generates fewer overlength summaries while maintaining the fastest processing speed; only 6.70% overlength summaries on CNN/Daily Mail and 7.8% on long summary of Mainichi, compared to the approximately 20% to 50% on CNN/Daily Mail and 10% to 30% on Mainichi with the other optimization methods. We also demonstrate the importance of the generation of in-length summaries for post-editing with the dataset Mainichi that is created with strict length constraints. The experimental results show approximately 30% to 40% improved post-editing time by use of in-length summaries.
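The citation statements for this paper name two ingredients: scoring only the part of a summary that fits within the length constraint, and penalizing summaries longer than the reference. The sketch below illustrates one way such a cost could look. It is an assumption-laden toy, not the GOLC formulation: the clipped unigram recall stands in for ROUGE, lengths are counted in tokens, and the additive `overlength_penalty` is a hypothetical way of combining the two ingredients.

```python
from collections import Counter


def unigram_recall(candidate, reference):
    """Clipped unigram overlap divided by reference length (a ROUGE-1-like proxy)."""
    if not reference:
        return 0.0
    overlap = Counter(candidate) & Counter(reference)
    return sum(overlap.values()) / len(reference)


def length_constrained_cost(candidate, reference, max_len, overlength_penalty=1.0):
    """Toy cost for a tokenized candidate summary.

    (i)  Only the first max_len tokens of the candidate are scored.
    (ii) A candidate longer than the reference receives an extra penalty,
         whatever its overlap score is (additive form is an assumption).
    """
    truncated = candidate[:max_len]
    cost = 1.0 - unigram_recall(truncated, reference)
    if len(candidate) > len(reference):
        cost += overlength_penalty
    return cost
```

A cost of this shape could be passed as the per-candidate cost in a minimum-risk-style objective, so that over-length candidates are pushed down even when their overlap with the reference is high.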

“…Recent success in deep learning, especially encoder-decoder models (Sutskever et al, 2014; Bahdanau et al, 2015), has dramatically improved the performance of various text-generation tasks, such as translation (Johnson et al, 2017), summarization (Ayana et al, 2017), question-answering (Choi et al, 2017), and dialogue response generation (Dhingra et al, 2017). In these studies on neural text generation, it has been known that a model-ensemble method, which predicts output text by averaging multiple text-generation models at decoding time, is effective even for text-generation tasks, and many state-of-the-art results have been obtained with ensemble models.…”

Section: Introduction (mentioning)

confidence: 99%

Frustratingly Easy Model Ensemble for Abstractive Summarization

Kobayashi 2018

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Ensemble methods, which combine multiple models at decoding time, are now widely known to be effective for text-generation tasks. However, they generally increase computational costs, and thus, there have been many studies on compressing or distilling ensemble models. In this paper, we propose an alternative, simple but effective unsupervised ensemble method, post-ensemble, that combines multiple models by selecting a majority-like output in post-processing. We theoretically prove that our method is closely related to kernel density estimation based on the von Mises-Fisher kernel. Experimental results on a news-headline-generation task show that the proposed method performs better than the current ensemble methods.
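The idea of selecting a majority-like output in post-processing can be sketched as picking the candidate that is, on average, most similar to the other candidates. The code below is only an illustration under assumptions: the function names are hypothetical, and bag-of-words cosine similarity is a stand-in chosen here for simplicity, not necessarily the representation used in the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * v[token] for token, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_majority_like(outputs):
    """Return the output with the highest total similarity to the others.

    outputs: one generated headline (string) per model in the ensemble.
    """
    if not outputs:
        raise ValueError("outputs must not be empty")
    vectors = [Counter(text.split()) for text in outputs]

    def total_similarity(i):
        return sum(
            cosine(vectors[i], vectors[j])
            for j in range(len(vectors))
            if j != i
        )

    best = max(range(len(outputs)), key=total_similarity)
    return outputs[best]
```

Because selection happens after decoding, each model can run independently, and no averaging of per-token distributions is needed at decoding time.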
