Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.480
Learning to Selectively Learn for Weakly-supervised Paraphrase Generation

Abstract: Paraphrase generation is a longstanding NLP task that has diverse applications for downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of golden labeled data. Though unsupervised endeavors have been proposed to address this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with weak supervision da…

Cited by 3 publications (9 citation statements)
References 44 publications (13 reference statements)
“…In our experiments, we use the performance gain on both NDCG and Hit Rate as the reward for offline experiment, and use the change of simulated rating as the reward for online experiment. Following [11,54,57], the policy network for data augmentation is updated on a delayed reward received after feeding the generated augmented data to the recommender.…”
Section: Learning Augmentation Policy
confidence: 99%
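The delayed-reward mechanism described in the citation above — a policy network chooses augmentations, the augmented data is fed to the recommender, and the resulting metric gain arrives later as a scalar reward — can be sketched as a minimal REINFORCE-style update. Everything here (the Bernoulli keep/drop policy, the feature vectors, the reward value) is a hypothetical illustration under that reading, not the cited papers' implementation.

```python
import math
import random

def select_action(weights, features):
    """Bernoulli policy: probability of keeping a candidate augmentation.

    Hypothetical linear policy over candidate features; returns the sampled
    action (1 = keep, 0 = drop) and the keep probability.
    """
    logit = sum(w * f for w, f in zip(weights, features))
    p = 1.0 / (1.0 + math.exp(-logit))
    action = 1 if random.random() < p else 0
    return action, p

def reinforce_update(weights, trajectory, reward, lr=0.1):
    """Update the policy weights once the delayed reward is observed.

    trajectory: list of (features, action, p) decisions collected *before*
    the reward (e.g. an NDCG / Hit Rate gain from the recommender) arrives.
    For a Bernoulli policy, d log pi / d w = (action - p) * feature.
    """
    new_w = list(weights)
    for features, action, p in trajectory:
        for i, f in enumerate(features):
            new_w[i] += lr * reward * (action - p) * f
    return new_w

random.seed(0)
weights = [0.0, 0.0]
# Decide on a batch of candidate augmentations (features are made up).
batch = [[1.0, 0.2], [0.5, -0.4], [0.9, 0.1]]
trajectory = []
for feats in batch:
    a, p = select_action(weights, feats)
    trajectory.append((feats, a, p))
# Delayed reward: e.g. the metric gain measured after retraining the
# recommender on the kept augmentations.
reward = 0.05
weights = reinforce_update(weights, trajectory, reward)
print(weights)
```

The key design point the quoted works share is that the policy gradient is computed from a single episode-level reward rather than per-step feedback, since the downstream model must be retrained (or re-evaluated) before any signal exists.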
“…Compared to the conventional reinforcement learning methods which consider the generators as the policy models, our work models the policy as a meta learner to accomplish a data selection objective. Our work is mostly related to (Ding et al, 2021), but we adopt a very different reinforcement learning approach which is the key for effective selective learning.…”
Section: Related Work
confidence: 99%
“…Though such attempts have demonstrated certain efficacy in handling instance-wise feature selection, they only deal with non timeseries data in non NLP domains, while the focus of our work is to deal with noisy labeled pairs in paraphrase generation tasks. Our work is mostly related to the instance-level active data acquisition approaches (Yoon et al, 2020;Ding et al, 2021), which are mostly adopted under the circumstances of data efficient or cost-sensitive learning or when dealing with noisy data.…”
Section: Related Work
confidence: 99%