2019
DOI: 10.1109/tnnls.2019.2929141

Deep Reinforcement Learning for Sequence-to-Sequence Models

Abstract: In recent times, sequence-to-sequence (seq2seq) models have gained a lot of popularity and provide state-of-the-art performance in a wide variety of tasks such as machine translation, headline generation, text summarization, speech-to-text conversion, and image caption generation. The underlying framework for all these models is usually a deep neural network comprising an encoder and a decoder. Although simple encoder-decoder models produce competitive results, many researchers have proposed additional improvements…
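As a reading aid, here is a minimal sketch of the encoder-decoder pattern the abstract describes, in PyTorch. The layer sizes, the GRU cells, and the teacher-forced decoding are illustrative assumptions, not details of any model surveyed in the paper.

```python
# Minimal encoder-decoder seq2seq sketch. All sizes and the GRU choice are
# illustrative assumptions, not details from the surveyed paper.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt):
        # Encode the source sequence into a final hidden state.
        _, h = self.encoder(self.embed(src))
        # Condition the decoder on that state (teacher forcing with tgt).
        dec_out, _ = self.decoder(self.embed(tgt), h)
        return self.out(dec_out)  # per-step vocabulary logits

model = Seq2Seq()
src = torch.randint(0, 10000, (2, 12))  # batch of 2 source sequences
tgt = torch.randint(0, 10000, (2, 9))   # shifted target sequences
logits = model(src, tgt)                # shape (2, 9, vocab_size)
```

Attention, pointer mechanisms, and the reinforcement learning objectives the survey covers all build on this same encoder-decoder skeleton.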

Cited by 136 publications (93 citation statements)
References 130 publications
“…Shortcomings of maximum-likelihood training for sequence generation have often been discussed (Ding and Soricut, 2017; Leblond et al., 2018; Ranzato et al., 2016), but without pointing to generalization as the key aspect. An overview of recent deep reinforcement learning methods for conditional generation can be found in Keneshloo et al. (2018). Our proposed approach follows work by Ding et al. (2017) and Tan et al. (2018) by employing both policy and reward for exploration.…”
Section: Related Work
confidence: 99%
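To make the contrast with maximum-likelihood training concrete, here is a generic REINFORCE-style sketch of reward-driven sequence training in PyTorch. The sequence-level reward and the absence of a baseline are simplifying assumptions; this is not the exact procedure of Ding et al. (2017) or Tan et al. (2018).

```python
# Generic REINFORCE sketch for sequence generation: the policy samples a
# sequence, a reward (e.g. ROUGE/BLEU) scores it, and the gradient raises the
# probability of high-reward samples. Illustrative, not the cited methods.
import torch

def policy_gradient_loss(logits, sampled_ids, rewards):
    """logits: (B, T, V) policy outputs; sampled_ids: (B, T) tokens sampled
    from the policy; rewards: (B,) sequence-level scores."""
    log_probs = torch.log_softmax(logits, dim=-1)
    # Log-probability of each sampled token under the current policy.
    tok_logp = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    seq_logp = tok_logp.sum(dim=1)          # (B,) total sequence log-prob
    # REINFORCE: weight each sequence's log-prob by its reward.
    return -(rewards * seq_logp).mean()

logits = torch.randn(2, 5, 100, requires_grad=True)
sampled = torch.randint(0, 100, (2, 5))
loss = policy_gradient_loss(logits, sampled, torch.tensor([0.7, 0.2]))
loss.backward()
```

In practice a baseline (for example, a critic's value estimate or the reward of a greedy rollout) is usually subtracted from the rewards to reduce gradient variance.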
“…For the Pointer-Generator Network from See et al. (2017), we follow their implementation⁴ and use a batch size of 16. For Paulus et al. (2018), we use an implementation from Keneshloo et al. (2018)⁵. We did not include the intra-temporal attention and the intra-decoder attention because they hurt performance.…”
Section: Implementation Details
confidence: 99%
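The excerpt amounts to a small set of training choices; a hedged sketch of how such a configuration might be recorded is below. The field names are hypothetical; only the batch size and the two disabled attention mechanisms come from the excerpt.

```python
# Hypothetical configuration record for the setup described in the excerpt.
# Field names are illustrative, not flags from the cited implementations.
from dataclasses import dataclass

@dataclass
class SummarizerConfig:
    batch_size: int = 16                     # value reported in the excerpt
    intra_temporal_attention: bool = False   # disabled: hurt performance
    intra_decoder_attention: bool = False    # disabled: hurt performance

cfg = SummarizerConfig()
```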
“…Other information retrieval based algorithms [10], [55]
Single-encoder based algorithms [31], [33], [41], [89]
Multiple-encoder based algorithms [32], [80]
Other algorithms [64], [68], [81]
Note that in the process of collecting papers, we first performed two types of searches for related papers: (1) online library search for papers containing keywords including "code + comment", "comment", "code + summary" and "summary" in the fields of title, abstract and index terms of the papers from the ACM Digital Library, IEEE Xplore Digital Library, DBLP, Google Scholar and arXiv.org; (2) specific search of major conference proceedings and journals in software engineering and artificial intelligence, including IEEE ICSE, IEEE FSE, IEEE/ACM ASE, IEEE TSE, ACM TOSEM, EMSE, AAAI and IJCAI.…”
Section: Trends of the Development of Code Commenting Techniques
confidence: 99%
“…That is, they use two encoders in the classical encoder-decoder framework. Under this framework, they exploit a reinforcement learning model to solve two issues: exposure bias and the inconsistency between training and test measurements [81]. They leverage an actor network and a critic network to jointly determine the next best word at each time step.…”
Section: B) Multiple-Encoder Based Comment Generation Algorithms
confidence: 99%
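A hedged sketch of the actor-critic step the excerpt describes: the actor maps a decoder state to a distribution over next words, while the critic scores the state so an advantage can weight the policy update. The module shapes, the sampling step, and the stubbed advantage values are illustrative assumptions, not the architecture of [81].

```python
# One actor-critic decoding step: actor proposes the next-word distribution,
# critic estimates the state's value. Illustrative sketch, not [81]'s model.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, hid_dim=256, vocab_size=10000):
        super().__init__()
        self.proj = nn.Linear(hid_dim, vocab_size)

    def forward(self, state):                # state: (B, hid_dim)
        return torch.softmax(self.proj(state), dim=-1)

class Critic(nn.Module):
    def __init__(self, hid_dim=256):
        super().__init__()
        self.value = nn.Linear(hid_dim, 1)

    def forward(self, state):                # estimated return of the state
        return self.value(state).squeeze(-1)

actor, critic = Actor(), Critic()
state = torch.randn(4, 256)                  # hypothetical decoder states
probs = actor(state)                         # (4, vocab) next-word policy
next_word = torch.multinomial(probs, 1)      # sample the next word
# Advantage = reward - critic value; stubbed here for illustration.
advantage = torch.tensor([0.2, -0.1, 0.3, 0.0])
logp = torch.log(probs.gather(1, next_word)).squeeze(1)
actor_loss = -(advantage * logp).mean()      # policy-gradient update signal
```

The critic is trained separately to regress the observed sequence reward, so that the advantage term tells the actor whether a sampled word did better or worse than expected.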