Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/606
Toward Diverse Text Generation with Inverse Reinforcement Learning

Abstract: Text generation is a crucial task in NLP. Recently, several adversarial generative models have been proposed to improve the exposure bias problem in text generation. Though these models gain great success, they still suffer from the problems of reward sparsity and mode collapse. In order to address these two problems, in this paper, we employ inverse reinforcement learning (IRL) for text generation. Specifically, the IRL framework learns a reward function on training data, and then an optimal policy to maximum…
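As a rough illustration of the alternating scheme the abstract describes (a learned reward function and an entropy-regularized generation policy optimized in turn), here is a minimal Python/PyTorch sketch. The generator, reward_model, and their sample / step_rewards interfaces are hypothetical placeholders, not the authors' implementation.

    import torch

    def irl_training_step(generator, reward_model, real_tokens,
                          gen_opt, rew_opt, entropy_weight=0.01):
        batch_size = real_tokens.size(0)

        # 1) Reward step: the learned reward should score real text above
        #    text sampled from the current generation policy.
        with torch.no_grad():
            fake_tokens, _ = generator.sample(batch_size)
        reward_loss = -(reward_model(real_tokens).mean()
                        - reward_model(fake_tokens).mean())
        rew_opt.zero_grad()
        reward_loss.backward()
        rew_opt.step()

        # 2) Policy step: entropy-regularized policy gradient on the learned,
        #    per-step ("dense") rewards.
        sampled_tokens, log_probs = generator.sample(batch_size)      # log_probs: [B, T]
        with torch.no_grad():
            step_rewards = reward_model.step_rewards(sampled_tokens)  # [B, T]
        entropy_est = -log_probs.sum(dim=-1).mean()   # Monte Carlo entropy estimate
        policy_loss = -(log_probs * step_rewards).sum(dim=-1).mean() \
                      - entropy_weight * entropy_est
        gen_opt.zero_grad()
        policy_loss.backward()
        gen_opt.step()

        return reward_loss.item(), policy_loss.item()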

Cited by 75 publications (62 citation statements). References 0 publications.
“…This may be the reason that the countermeasures could easily detect fake reviews. To generate more robust reviews, we plan to develop a method that generates reviews with more diversity [35]. We also plan to develop a countermeasure for detecting these generated reviews.…”
Section: Discussion. Citation type: mentioning (confidence: 99%).
“…As for the discriminator, RankGAN (Lin et al., 2017) replaced the traditional discriminator with a ranker to learn the relative ranking information between the real texts and the generated ones. Inverse reinforcement learning (Shi et al., 2018) used a trainable reward approximator as the discriminator to provide dense reward signals at each generation step. DPGAN introduced a language model based discriminator and regarded cross-entropy as rewards to promote the diversity of generation results.…”
Section: Related Work. Citation type: mentioning (confidence: 99%).
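For contrast with the sparse, sentence-level reward of a standard text GAN, the dense per-step signal mentioned in the statement above can be sketched as follows. This is illustrative only; discriminator and reward_fn are hypothetical callables, not code from any of the cited papers.

    import torch

    def sparse_rewards(tokens, discriminator):
        # One scalar per sequence, received only after the full sentence is generated.
        score = discriminator(tokens)                     # [batch]
        rewards = torch.zeros(tokens.size(0), tokens.size(1))
        rewards[:, -1] = score                            # all credit at the last step
        return rewards

    def dense_rewards(tokens, reward_fn):
        # One reward per generated token: every prefix tokens[:, :t+1] is scored,
        # so the policy receives feedback at each generation step.
        return torch.stack(
            [reward_fn(tokens[:, : t + 1]) for t in range(tokens.size(1))], dim=1
        )                                                 # [batch, seq_len]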
“…MaliGAN: A variant of SeqGAN that optimizes the generator with a normalized maximum likelihood objective (Che et al., 2017). IRL: This inverse reinforcement learning method replaces the discriminator with a reward approximator to provide dense rewards (Shi et al., 2018). RAML: An RL approach that incorporates the MLE objective into an RL training framework, regarding BLEU as the reward (Norouzi et al., 2016).…”
Section: Baselines. Citation type: mentioning (confidence: 99%).
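To make the RAML baseline mentioned above concrete, here is a rough sketch of reward-augmented maximum likelihood with BLEU as the reward: perturbed targets are sampled and their log-likelihoods are weighted by an exponentiated reward. The perturb and model.log_prob helpers are hypothetical; this is not the reference implementation of Norouzi et al. (2016).

    import math
    from nltk.translate.bleu_score import sentence_bleu

    def raml_loss(model, source, target, perturb, num_samples=4, temperature=0.85):
        # Sample noisy variants of the reference target (token lists).
        samples = [perturb(target) for _ in range(num_samples)]
        # Exponentiated BLEU acts as the (unnormalized) sample weight.
        weights = [math.exp(sentence_bleu([target], s) / temperature) for s in samples]
        total = sum(weights)
        # Reward-weighted negative log likelihood over the sampled targets.
        return -sum(w / total * model.log_prob(source, s)
                    for w, s in zip(weights, samples))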
“…This method enables us to both make use of an efficient adversarial formulation and recover a more precise reward function for open-domain dialogue training. Unlike Shi et al. (2018), we design a specific reward function structure to measure the reward of each word in generated sentences while taking the dialogue context into account. We also consider two human evaluation settings to assess the overall performance of our model.…”
Section: Introduction. Citation type: mentioning (confidence: 99%).