Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.199

ConRPG: Paraphrase Generation using Contexts as Regularizer

Abstract: A long-standing issue with paraphrase generation is how to obtain reliable supervision signals. In this paper, we propose an unsupervised paradigm for paraphrase generation based on the assumption that the probabilities of generating two sentences with the same meaning given the same context should be the same. Inspired by this fundamental idea, we propose a pipelined system which consists of paraphrase candidate generation based on contextual language models, candidate filtering using scoring functions, and p…
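
As a rough illustration of the core assumption stated in the abstract, the Python sketch below scores a paraphrase candidate by how close its conditional log-probability under a causal language model is to that of the original sentence, given the same left context. The choice of GPT-2 via Hugging Face transformers, the per-token averaging, and the acceptance threshold are assumptions made for this example only; they are not ConRPG's actual models or filtering criteria.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Illustrative model choice; ConRPG's contextual language models are not reproduced here.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def conditional_logprob(context: str, sentence: str) -> float:
    """Average log-probability of `sentence` tokens given `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    sent_ids = tokenizer(" " + sentence, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, sent_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # The logits at position t-1 predict the token at position t.
    sent_logits = logits[0, ctx_ids.size(1) - 1 : -1]
    log_probs = torch.log_softmax(sent_logits, dim=-1)
    token_lp = log_probs.gather(1, sent_ids[0].unsqueeze(1)).squeeze(1)
    return token_lp.mean().item()

context = "The meeting was postponed because of the storm."
original = "We will reschedule it for next week."
candidate = "It will be rescheduled for the following week."

lp_orig = conditional_logprob(context, original)
lp_cand = conditional_logprob(context, candidate)
# Accept the candidate only if its contextual probability is close to the original's
# (threshold of 1.0 nats per token is an arbitrary illustrative value).
print(abs(lp_orig - lp_cand) < 1.0)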

Cited by 13 publications (12 citation statements)
References 38 publications (24 reference statements)

“…Syntax-controlled paraphrase generation has seen significant recent interest, as a means to explicitly generate diverse surface forms with the same meaning. However, most previous work has required knowledge of the correct or valid surface forms to be generated (Iyyer et al., 2018; Chen et al., 2019a; Kumar et al., 2020; Meng et al., 2021). It is generally assumed that the input can be rewritten without addressing the problem of predicting which template should be used, which is necessary if the method is to be useful.…”
Section: Syntax-controlled Paraphrase Generation (mentioning; confidence: 99%)
“…While autoregressive models of language (including paraphrasing systems) predict one token at a time, there is evidence that in humans some degree of planning occurs at a higher level than individual words (Levelt, 1993; Martin et al., 2010). Prior work on paraphrase generation has attempted to include this inductive bias by specifying an alternative surface form as additional model input, either in the form of target parse trees (Iyyer et al., 2018; Chen et al., 2019a; Kumar et al., 2020), exemplars (Meng et al., 2021), or syntactic codes…”
Section: Introduction (mentioning; confidence: 99%)
“…Similarly, CIP emphasizes rephrasing the idioms in input sentences into word segments that are more intuitive and easier to understand. In recent decades, many researchers devoted to paraphrase generation [8], [9] have struggled with the lack of reliable supervised datasets [10]. Inspired by this challenge, we establish a large-scale training dataset for the CIP task in this work.…”
Section: Table (mentioning; confidence: 99%)
“…A long-standing issue in paraphrase generation studies is the lack of reliable supervised datasets. The issue can be addressed by constructing manually annotated paraphrase-pair datasets [6] or by designing unsupervised paraphrase generation methods [10]. Differing from existing paraphrase generation research, we turn our attention to Chinese idiom paraphrasing, which rephrases idiom-containing sentences into non-idiom-containing ones.…”
Section: Related Work (mentioning; confidence: 99%)
“…From another perspective that is not directly related to our work, lexical overlap features are also beneficial to the paraphrase generation task. While the quality of generated paraphrases can be assessed by state-of-the-art models such as Sentence-BERT (Reimers and Gurevych 2019), as shown in recent work on data augmentation (Corbeil and Abdi Ghavidel 2021), some works still use lexical overlap features as criteria: Nighojkar and Licato (2021) use the BLEURT metric (Sellam, Das, and Parikh 2020) to compute a reward for sentence pairs that are mutually implicative but lexically and syntactically disparate; Kadotani et al. (2021) use edit distance to decide whether source and target sentences require drastic transformation, so that the training order for curriculum learning (Bengio et al. 2009) can be determined for better paraphrase generation performance; and Jaccard distance is used in Meng et al.'s (2021) work as one metric for filtering generated paraphrase candidates.
[Figure 3: PAN; Figure 4: PAWS-wiki]
…”
Section: Related Work (mentioning; confidence: 99%)
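
As the last quoted passage notes, Meng et al. (2021) use Jaccard distance as one metric for filtering generated paraphrase candidates. The Python sketch below shows one plausible form of such a lexical-overlap filter; the whitespace tokenization, the threshold value, and the direction of the filter (discarding candidates that copy the source too closely) are assumptions made for illustration, not the paper's exact criterion.

def jaccard_distance(a: str, b: str) -> float:
    """1 minus the Jaccard similarity of the lowercased token sets of a and b."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)

def filter_candidates(source: str, candidates: list[str], min_dist: float = 0.4) -> list[str]:
    """Keep candidates whose surface form differs enough from the source (assumed direction)."""
    return [c for c in candidates if jaccard_distance(source, c) >= min_dist]

source = "The company announced record profits this quarter."
candidates = [
    "The company announced record profits this quarter.",        # trivial copy, filtered out
    "This quarter the firm reported its highest earnings ever.",  # lexically diverse, kept
]
print(filter_candidates(source, candidates))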