Rui Meng scite author profile

Keyphrases provide highly summative information that can be effectively used for understanding, organizing and retrieving text content. Though previous studies have provided many workable solutions for automated keyphrase extraction, they commonly divided the to-be-summarized content into multiple text chunks, then ranked and selected the most meaningful ones. These approaches could neither identify keyphrases that do not appear in the text, nor capture the real semantic meaning behind the text. We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome the above drawbacks. We name it deep keyphrase generation since it attempts to capture the deep semantic meaning of the content with a deep learning method. Empirical analysis on six datasets demonstrates that our proposed model not only achieves a significant performance boost on extracting keyphrases that appear in the source text, but also can generate absent keyphrases based on the semantic meaning of the text. Code and dataset are available at https://github.com/memray/seq2seq-keyphrase.
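The distinction the abstract draws between keyphrases that appear in the source text and absent keyphrases can be illustrated with a minimal sketch. The tokenizer, the helper names and the example strings below are illustrative assumptions, not code from the linked repository; published evaluations typically also apply stemming, which is omitted here.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(source_tokens: list[str], phrase_tokens: list[str]) -> bool:
    """Return True if phrase_tokens occurs as a contiguous run in source_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        source_tokens[i:i + n] == phrase_tokens
        for i in range(len(source_tokens) - n + 1)
    )


def split_present_absent(source: str, keyphrases: list[str]) -> tuple[list[str], list[str]]:
    """Partition keyphrases into those found in the source and those not found."""
    src = tokenize(source)
    present, absent = [], []
    for kp in keyphrases:
        if contains_phrase(src, tokenize(kp)):
            present.append(kp)
        else:
            absent.append(kp)
    return present, absent


# Hypothetical example input, made up for illustration.
source = "We propose a generative model for keyphrase prediction with an encoder-decoder framework."
keyphrases = ["keyphrase prediction", "encoder-decoder framework", "deep learning"]
present, absent = split_present_absent(source, keyphrases)
```

An extractive method can only ever return items from the first list; a generative model can, in principle, also produce items from the second.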

One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases

Yuan, Wang, Meng et al. 2020

Different texts shall by nature correspond to different number of keyphrases. This desideratum is largely missing from existing neural keyphrase generation models. In this study, we address this problem from both modeling and evaluation perspectives. We first propose a recurrent generative model that generates multiple keyphrases as delimiter-separated sequences. Generation diversity is further enhanced with two novel techniques by manipulating decoder hidden states. In contrast to previous approaches, our model is capable of generating diverse keyphrases and controlling number of outputs. We further propose two evaluation metrics tailored towards the variable-number generation. We also introduce a new dataset (StackEx) that expands beyond the only existing genre (i.e., academic writing) in keyphrase generation tasks. With both previous and new evaluation metrics, our model outperforms strong baselines on all datasets.

Integrating Transformer and Paraphrase Rules for Sentence Simplification

Zhao, Meng, He et al. 2018

Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from machine translation studies and implicitly learned simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature; (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification.

Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence

Meng, Yuan, Wang et al. 2019

Preprint

Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation. Existing studies concatenate target keyphrases in different orders but no study has examined the effects of ordering on models' behavior. In this paper, we propose several orderings for concatenation and inspect the important factors for training a successful keyphrase generation model. By running comprehensive comparisons, we observe one preferable ordering and summarize a number of empirical findings and challenges, which can shed light on future research on this line of work.
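As a rough illustration of what concatenating keyphrases in different orders can look like, the sketch below builds one target sequence per ordering. The `<sep>` delimiter, the four ordering names and the function names are assumptions made for this example; the abstract does not list the orderings the paper actually compares, nor which one it prefers.

```python
SEP = "<sep>"  # hypothetical delimiter token between keyphrases


def order_keyphrases(keyphrases: list[str], source: str, ordering: str) -> list[str]:
    """Return the keyphrases arranged according to an illustrative ordering."""
    if ordering == "original":
        return list(keyphrases)
    if ordering == "alphabetical":
        return sorted(keyphrases, key=str.lower)
    if ordering == "length":
        # Fewest words first; sorted() is stable, so ties keep their input order.
        return sorted(keyphrases, key=lambda kp: len(kp.split()))
    if ordering == "appearance":
        text = source.lower()

        def first_position(kp: str) -> int:
            # Phrases not found in the source sort after all found phrases.
            index = text.find(kp.lower())
            return index if index >= 0 else len(text)

        return sorted(keyphrases, key=first_position)
    raise ValueError(f"unknown ordering: {ordering}")


def build_target(keyphrases: list[str], source: str, ordering: str) -> str:
    """Concatenate the ordered keyphrases into one delimiter-separated target."""
    return f" {SEP} ".join(order_keyphrases(keyphrases, source, ordering))


# Hypothetical example input, made up for illustration.
source = "Neural keyphrase generation models concatenate target keyphrases as a sequence."
keyphrases = ["target keyphrases", "ordering", "keyphrase generation"]
targets = {
    name: build_target(keyphrases, source, name)
    for name in ("original", "alphabetical", "length", "appearance")
}
```

The same set of keyphrases yields a different training target under each ordering, which is why the choice can influence what a sequence model learns.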

Deep Keyphrase Generation

Meng, Zhao, Han et al. 2017

Preprint

A comparison of self-reported and proxy-reported health utilities in children: a systematic review and meta-analysis

Jiang et al. 2021

Health Qual Life Outcomes

Objective: This study aimed to conduct a systematic review and meta-analysis to compare differences in health utilities (HUs) assessed by self and proxy respondents in children, as well as to evaluate the effects of health conditions, valuation methods, and proxy types on the differences. Methods: Eligible studies published in PubMed, Embase, Web of Science, and Cochrane Library up to December 2019 were identified according to PRISMA guidelines. Meta-analyses were performed to calculate the weighted mean differences (WMDs) in HUs between proxy- and self-reports. Mixed-effects meta-regressions were applied to explore differences in WMDs among each health condition, valuation method and proxy type. Results: A total of 30 studies were finally included, comprising 211 pairs of HUs assessed by 15,294 children and 16,103 proxies. This study identified 34 health conditions, 10 valuation methods, and 3 proxy types. In general, proxy-reported HUs were significantly different from those assessed by children themselves, while the direction and magnitude of these differences were inconsistent regarding health conditions, valuation methods, and proxy types. Meta-regression demonstrated that WMDs were significantly different in patients with ear diseases relative to the general population, and in those measured by EQ-5D, Health utility index 2 (HUI2), and Pediatric asthma health outcome measure relative to the Visual analogue scale method, but were not significantly different for clinician-proxy and caregiver-proxy reports relative to parent-proxy reports. Conclusion: Divergence existed in HUs between self and proxy-reports. Our findings highlight the importance of selecting appropriate self and/or proxy-reported HUs in health-related quality of life measurement and economic evaluations.
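For readers unfamiliar with weighted mean differences, the following is a minimal sketch of one common way to pool them: inverse-variance weighting under a fixed-effect assumption. This is an assumption for illustration only; the abstract does not state the pooling model used, and the mixed-effects meta-regression it mentions is not reproduced here. All numbers are invented and are not data from the review.

```python
import math


def mean_difference(m_proxy: float, sd_proxy: float, n_proxy: int,
                    m_self: float, sd_self: float, n_self: int) -> tuple[float, float]:
    """Return (proxy minus self mean difference, its variance) for one study."""
    md = m_proxy - m_self
    var = sd_proxy ** 2 / n_proxy + sd_self ** 2 / n_self
    return md, var


def pooled_wmd(studies: list[tuple[float, float]]) -> tuple[float, tuple[float, float]]:
    """Pool (mean difference, variance) pairs with inverse-variance weights.

    Returns the pooled WMD and an approximate 95% confidence interval.
    """
    weights = [1.0 / var for _, var in studies]
    total = sum(weights)
    wmd = sum(w * md for w, (md, _) in zip(weights, studies)) / total
    se = math.sqrt(1.0 / total)
    return wmd, (wmd - 1.96 * se, wmd + 1.96 * se)


# Invented inputs: two studies with equal variance, so each gets equal weight.
example_md, example_var = mean_difference(0.8, 0.2, 100, 0.7, 0.2, 100)
wmd, (ci_low, ci_high) = pooled_wmd([(0.10, 0.01), (0.20, 0.01)])
```

More precise studies (smaller variance) receive larger weights, so the pooled estimate leans toward them; with equal variances, as in the invented example, it reduces to a simple average.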

One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases

Yuan, Wang, Meng et al. 2018

Preprint

Different texts shall by nature correspond to different number of keyphrases. This desideratum is largely missing from existing neural keyphrase generation models. In this study, we address this problem from both modeling and evaluation perspectives. We first propose a recurrent-generative model that generates multiple keyphrases as delimiter-separated sequences. Generation diversity is further enhanced with two novel techniques by manipulating decoder hidden states. In contrast to previous approaches, our model is capable of generating variable number of diverse keyphrases. We further propose two evaluation metrics tailored towards variable-number generation. We also introduce a new dataset (StackEx) that expands beyond the only existing genre (i.e., academic writing) in keyphrase generation tasks. With both previous and new evaluation metrics, our model outperforms strong baselines on all datasets.

A hidden Markov model for population‐level cervical cancer screening data

Soper, Nygård, Abdulla et al. 2020

Statistics in Medicine

has been administrating a national cervical cancer screening program since 1992 by coordinating triennial cytology exam screenings for the female population between 25 and 69 years of age. Up to 80% of cancers are prevented through mass screening, but this comes at the expense of considerable screening activity and leads to overtreatment of clinically asymptomatic precancers. In this article, we present a continuous-time, time-inhomogeneous hidden Markov model which was developed to understand the screening process and cervical cancer carcinogenesis in detail. By leveraging 1.7 million individuals' multivariate time-series of medical exams performed over a 25-year period, we simultaneously estimate all model parameters. We show that an age-dependent model reflects the Norwegian screening program by comparing empirical survival curves from observed registry data and data simulated from the proposed model. The model can be generalized to include more detailed individual-level covariates as well as new types of screening exams. By utilizing individual screening histories and covariate data, the proposed model shows potential for improving strategies for cancer screening programs by personalizing recommended screening intervals.
