Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1103

Reinforced Video Captioning with Entailment Rewards

Abstract: Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training. First, using policy gradient and mixed-loss methods for reinforcement learning, we directly optimize sentence-level task-based metrics (as rewards), achieving significant improvements over the baseline, based on both automatic metrics and human evaluation on multiple datasets. Next, we propose a novel entailment-enhanced reward (CIDEnt) that co…
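The two ingredients named in the abstract, a mixed cross-entropy/policy-gradient loss and an entailment-corrected CIDEr reward (CIDEnt), can be sketched compactly. The Python below is only an illustration under assumptions of ours: `cider_score` and `entailment_prob` are hypothetical callables, and the thresholded penalty and the mixing weight `gamma` are placeholders, not the paper's exact formulation.

```python
import torch


def cident_reward(candidate, reference, cider_score, entailment_prob,
                  ent_threshold=0.5, penalty=1.0):
    """Entailment-corrected reward (illustrative): start from a phrase-matching
    metric (CIDEr) and penalize captions that the reference does not entail."""
    r = cider_score(candidate, reference)
    if entailment_prob(premise=reference, hypothesis=candidate) < ent_threshold:
        r = r - penalty  # assumed correction rule, not the paper's exact one
    return r


def mixed_loss(xe_log_probs, sampled_log_probs, reward, baseline, gamma=0.99):
    """Mixed-loss RL training: interpolate the word-level cross-entropy loss
    with a REINFORCE-style sentence-level reward loss."""
    xe_loss = -xe_log_probs.sum()                             # teacher-forced tokens
    rl_loss = -(reward - baseline) * sampled_log_probs.sum()  # reward-weighted sample
    return gamma * rl_loss + (1.0 - gamma) * xe_loss


# Toy usage: random tensors stand in for per-token log-probabilities from a decoder.
xe_lp = -torch.rand(12)       # log p(ground-truth word_t | ...)
sample_lp = -torch.rand(12)   # log p(sampled word_t | ...)
loss = mixed_loss(xe_lp, sample_lp, reward=0.8, baseline=0.6)
```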

Cited by 107 publications (69 citation statements). References 32 publications.
“…Reinforcement Learning (RL) Loss: Policy gradient methods can directly optimize discrete target evaluation metrics such as ROUGE that are non-differentiable (Paulus et al., 2018; Jaques et al., 2017; Pasunuru and Bansal, 2017). At each time step, the word generated by the model can be viewed as an action taken by an RL agent.…”
Section: Mixed Objective Learning (mentioning)
confidence: 99%
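The word-as-action view in the statement above corresponds to the REINFORCE estimator: sample a sentence, score it with the non-differentiable metric, and weight the summed token log-probabilities by a baseline-subtracted reward. The sketch below is a generic illustration, not any cited paper's implementation; the `model.sample` / `model.greedy_decode` interface, the self-critical greedy baseline, and `reward_fn` are all assumptions.

```python
def policy_gradient_loss(model, src, reference, reward_fn):
    """REINFORCE for sequence generation: each emitted word is an action, and
    the whole sampled sentence receives one scalar reward from a metric such
    as ROUGE or CIDEr (non-differentiable, so it enters only as a weight)."""
    # Hypothetical model interface: both calls return (token_ids, log_probs).
    sampled_ids, sampled_log_probs = model.sample(src)   # stochastic decode
    greedy_ids, _ = model.greedy_decode(src)             # baseline sequence

    r_sample = reward_fn(sampled_ids, reference)         # e.g. CIDEr score
    r_baseline = reward_fn(greedy_ids, reference)        # variance reduction

    # The reward is a constant w.r.t. the parameters; gradients flow only
    # through the log-probabilities of the sampled actions (words).
    advantage = r_sample - r_baseline
    return -advantage * sampled_log_probs.sum()
```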
“…Reinforcement learning has been applied to a wide array of text generation tasks, including machine translation (Wu et al., 2016; Ranzato et al., 2015), text summarization (Paulus et al., 2018), and image/video captioning (Rennie et al., 2017; Pasunuru and Bansal, 2017). These RL approaches lean on the REINFORCE algorithm (Williams, 1992), or its variants, to train a generative model towards a non-differentiable reward by minimizing the policy gradient loss.…”
Section: Reinforcement Learning for Text Generation (mentioning)
confidence: 99%
“…Due to the instability of adversarial training, we additionally include a cross-entropy (CE) loss that ensures that the generator will explore the output space in a more stable manner and maintain its language model [40]. The final objective of G_θ is a mixed loss function, a weighted combination of the Cross-Entropy Loss (L_CE), optimizing the maximum-likelihood training objective, and the Adversarial Loss (L_GAN), with its gradient function defined in Equation 12.…”
Section: A. Implementation Details (mentioning)
confidence: 99%
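The quoted generator objective is simply a weighted sum of two scalar losses. A minimal sketch, assuming the cross-entropy and adversarial terms are computed elsewhere and mixed with a hypothetical weight `lambda_adv`:

```python
def generator_objective(ce_loss, gan_loss, lambda_adv=0.5):
    """Mixed objective for the generator G_theta: combine the maximum-likelihood
    cross-entropy loss L_CE (keeps training stable and preserves the language
    model) with the adversarial loss L_GAN. lambda_adv is an assumed weight."""
    return (1.0 - lambda_adv) * ce_loss + lambda_adv * gan_loss
```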