Paraphrase generation aims to improve the clarity of a sentence by using different wording that convey similar meaning. For better quality of generated paraphrases, we propose a framework that combines the effectiveness of two models -transformer and sequence-tosequence (seq2seq). We design a two-layer stack of encoders. The first layer is a transformer model containing 6 stacked identical layers with multi-head self-attention, while the second-layer is a seq2seq model with gated recurrent units (GRU-RNN). The transformer encoder layer learns to capture long-term dependencies, together with syntactic and semantic properties of the input sentence. This rich vector representation learned by the transformer serves as input to the GRU-RNN encoder responsible for producing the state vector for decoding. Experimental results on two datasets-QUORA and MSCOCO using our framework, produces a new benchmark for paraphrase generation.
We propose a system that improves performance on single document summarization task using the CNN/DailyMail and Newsroom datasets. It follows the popular encoderdecoder paradigm, but with an extra focus on the encoder. The intuition is that the probability of correctly decoding an information significantly lies in the pattern and correctness of the encoder. Hence we introduce, encodeencode-decode. A framework that encodes the source text first with a transformer, then a sequence-to-sequence (seq2seq) model. We find that the transformer and seq2seq model complement themselves adequately, making for a richer encoded vector representation. We also find that paying more attention to the vocabulary of target words during abstraction improves performance. We experiment our hypothesis and framework on the task of extractive and abstractive single document summarization and evaluate using the standard CNN/DailyMail dataset and the recently released Newsroom dataset.
We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and querybased text summarization. Using an MRC model trained on the SQuAD1.1 dataset as a core system component, we first build an extractive query-based summarizer. For better precision, this summarizer also compresses the output of the MRC model using a novel sentence compression technique. We further leverage pre-trained machine translation systems to abstract our extracted summaries. Our models achieve state-of-the-art results on the publicly available CNN/Daily Mail and Debatepedia datasets, and can serve as simple yet powerful baselines for future systems. We also hope that these results will encourage research on transfer learning from large MRC corpora to query-based summarization.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.