Yancheng He scite author profile

Learning to Generate Questions by Learning What not to Generate

Automatic question generation is an important technique that can improve the training of question answering, help chatbots to start or continue a conversation with humans, and provide assessment materials for educational purposes. Existing neural question generation models are not sufficient mainly due to their inability to properly model the process of how each word in the question is selected, i.e., whether repeating the given passage or being generated from a vocabulary. In this paper, we propose our Clue Guided Copy Network for Question Generation (CGC-QG), which is a sequence-to-sequence generative model with copying mechanism, yet employing a variety of novel components and techniques to boost the performance of question generation. In CGC-QG, we design a multi-task labeling strategy to identify whether a question word should be copied from the input passage or be generated instead, guiding the model to learn the accurate boundaries between copying and generation. Furthermore, our input passage encoder takes as input, among a diverse range of other features, the prediction made by a clue word predictor, which helps identify whether each word in the input passage is a potential clue to be copied into the target question. The clue word predictor is designed based on a novel application of Graph Convolutional Networks onto a syntactic dependency tree representation of each passage, thus being able to predict clue words only based on their context in the passage and their relative positions to the answer in the tree. We jointly train the clue prediction as well as question generation with multi-task learning and a number of practical strategies to reduce the complexity. Extensive evaluations show that our model significantly improves the performance of question generation and out-performs all previous state-of-the-art neural question generation models by a substantial margin.
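
The copy-versus-generate labeling described above can be illustrated with a minimal sketch. It assumes the simplest possible rule: a question word is labeled "copy" when it also appears in the passage, and "generate" otherwise. The function name `copy_labels` and the example sentences are hypothetical, and the labeling strategy in CGC-QG itself may be more elaborate than this membership test.

```python
def copy_labels(passage_tokens, question_tokens):
    """Label each question token: 1 = could be copied from the passage, 0 = must be generated.

    Assumption: case-insensitive membership in the passage is the copy criterion.
    """
    passage_vocab = {tok.lower() for tok in passage_tokens}
    return [1 if tok.lower() in passage_vocab else 0 for tok in question_tokens]


# Hypothetical example data, whitespace-tokenized for simplicity.
passage = "The Eiffel Tower was completed in 1889 in Paris".split()
question = "When was the Eiffel Tower completed ?".split()

labels = copy_labels(passage, question)
for token, label in zip(question, labels):
    print(f"{token:10s} {'copy' if label else 'generate'}")
```

Labels of this kind could serve as an auxiliary training target alongside the question tokens, which is the general idea behind training the two tasks jointly.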

Matching Article Pairs with Graphical Decomposition and Convolutions

Liu, Niu, Wei et al. 2019

Identifying the relationship between two articles, e.g., whether two articles published from different sources describe the same breaking news, is critical to many document understanding tasks. Existing approaches for modeling and matching sentence pairs do not perform well in matching longer documents, which embody more complex interactions between the enclosed entities than a sentence does. To model article pairs, we propose the Concept Interaction Graph to represent an article as a graph of concepts. We then match a pair of articles by comparing the sentences that enclose the same concept vertex through a series of encoding techniques, and aggregate the matching signals through a graph convolutional network. To facilitate the evaluation of long article matching, we have created two datasets, each consisting of about 30K pairs of breaking news articles covering diverse topics in the open domain. Extensive evaluations of the proposed methods on the two datasets demonstrate significant improvements over a wide range of state-of-the-art methods for natural language matching.
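
A rough sketch of the idea of grouping sentences by shared concept and matching them per vertex is given below. Everything in it is an assumption for illustration: concepts are given as plain keywords, the per-vertex matching signal is a Jaccard word overlap (standing in for the learned encoders), and the final aggregation is a plain mean (standing in for the graph convolutional network). The names `build_concept_graph` and `vertex_score` are hypothetical.

```python
from itertools import combinations


def tokens(sentence):
    """Lowercased whitespace tokens of a sentence, as a set."""
    return set(sentence.lower().split())


def build_concept_graph(article_a, article_b, concepts):
    """Attach sentences from both articles to the concepts they mention.

    Two concepts are linked by an edge when some sentence mentions both.
    """
    vertices = {c: {"a": [], "b": []} for c in concepts}
    for side, sentences in (("a", article_a), ("b", article_b)):
        for sent in sentences:
            for c in concepts:
                if c in tokens(sent):
                    vertices[c][side].append(sent)
    edges = set()
    for c1, c2 in combinations(concepts, 2):
        sents_1 = vertices[c1]["a"] + vertices[c1]["b"]
        sents_2 = vertices[c2]["a"] + vertices[c2]["b"]
        if any(s in sents_2 for s in sents_1):
            edges.add((c1, c2))
    return vertices, edges


def vertex_score(vertex):
    """Jaccard overlap between the two articles' words at one concept vertex."""
    words_a = set().union(*(tokens(s) for s in vertex["a"]))
    words_b = set().union(*(tokens(s) for s in vertex["b"]))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


# Hypothetical toy articles and concepts.
article_a = ["The storm hit the coast on Monday", "Officials ordered an evacuation"]
article_b = ["A storm struck the coast early Monday", "The stock market closed higher"]
concepts = ["storm", "evacuation"]

vertices, edges = build_concept_graph(article_a, article_b, concepts)
scores = {c: vertex_score(v) for c, v in vertices.items()}
overall = sum(scores.values()) / len(scores)  # mean as a stand-in for GCN aggregation
print(scores, overall)
```

The point of the decomposition is that each vertex compares only the sentences about one concept, so a long article pair is reduced to many short, local comparisons before aggregation.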

Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus

Liu, Wei, Niu et al. 2020

The ability to ask questions is important in both human and machine intelligence. Learning to ask questions helps knowledge acquisition, improves question-answering and machine reading comprehension tasks, and helps a chatbot to keep the conversation flowing with a human. Existing question generation models are ineffective at generating a large amount of high-quality question-answer pairs from unstructured text, since given an answer and an input passage, question generation is inherently a one-to-many mapping. In this paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which aims at automatically generating high-quality and diverse question-answer pairs from unlabeled text corpus at scale by imitating the way a human asks questions. Our system consists of: i) an information extractor, which samples from the text multiple types of assistive information to guide question generation; ii) neural question generators, which generate diverse and controllable questions, leveraging the extracted assistive information; and iii) a neural quality controller, which removes low-quality generated data based on text entailment. We compare our question generation models with existing approaches and resort to voluntary human evaluation to assess the quality of the generated question-answer pairs. The evaluation results suggest that our system dramatically outperforms state-of-the-art neural question generation models in terms of the generation quality, while being scalable in the meantime. With models trained on a relatively smaller amount of data, we can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.
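
The three-stage structure described above (extract assistive information, generate questions, filter by quality) can be sketched as a simple pipeline. The stage functions below are toy stand-ins and entirely hypothetical: in the described system each stage is a neural or sampling component, whereas here the extractor picks digit tokens as answers, the generator substitutes a placeholder word, and the filter only checks that the answer does not leak into the question.

```python
def generate_qa_pairs(sentences, extract, generate, keep):
    """Run extract -> generate -> filter over sentences and collect (question, answer) pairs."""
    pairs = []
    for sent in sentences:
        for info in extract(sent):                  # one sentence can yield several candidates
            question = generate(sent, info)
            if keep(sent, question, info["answer"]):
                pairs.append((question, info["answer"]))
    return pairs


# Toy stand-ins for the three stages (assumptions, not the actual components).
def toy_extract(sent):
    """Treat every all-digit token as a candidate answer with a fixed style."""
    return [{"answer": w, "clue": None, "style": "what"} for w in sent.split() if w.isdigit()]


def toy_generate(sent, info):
    """Replace the answer span with the style word to form a crude question."""
    return sent.replace(info["answer"], info["style"]) + " ?"


def toy_keep(sent, question, answer):
    """Reject questions that still contain the answer."""
    return answer not in question


sentences = ["The bridge opened in 1932", "It rained"]
pairs = generate_qa_pairs(sentences, toy_extract, toy_generate, toy_keep)
print(pairs)
```

Because extraction can return several (answer, clue, style) candidates per sentence, one input sentence can produce more than one question-answer pair, which is consistent with the reported ratio of about 2.8 pairs per sentence.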

Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model

Li, Xu, He et al. 2019

Automatic article commenting is helpful in encouraging user engagement and interaction on online news platforms. However, news documents are usually too long for traditional encoder-decoder based models, which often results in general and irrelevant comments. In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph. By organizing the article into a graph structure, our model can better understand the internal structure of the article and the connection between topics, which makes it better able to understand the story. We collect and release a large-scale news-comment corpus from a popular Chinese online news platform, Tencent Kuaibao. Extensive experimental results show that our model can generate much more coherent and informative comments compared with several strong baseline models. The paper is available at https://github.com/lancopku/Graph-to-seq-comment-generation.
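
To make the graph side of a graph-to-sequence model more concrete, the sketch below performs one neighbor-averaging step over a small topic graph. It is a simplification under stated assumptions: each topic vertex has a fixed feature vector, aggregation is a plain mean over the vertex and its neighbors, and there are no learned weights or nonlinearity. The function name `graph_conv_step` and the toy graph are hypothetical, and this is not the encoder used in the paper.

```python
def graph_conv_step(features, edges):
    """One mean-aggregation step: each vertex averages itself and its neighbors.

    features: list of equal-length float lists, one per vertex.
    edges: list of undirected (i, j) index pairs.
    """
    n = len(features)
    neighbors = {i: {i} for i in range(n)}  # self-loops keep each vertex's own signal
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    dim = len(features[0])
    updated = []
    for i in range(n):
        nbrs = neighbors[i]
        updated.append([sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)])
    return updated


# Hypothetical toy graph: three topic vertices in a chain 0 - 1 - 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]

updated = graph_conv_step(features, edges)
print(updated)
```

After such a step, each topic's representation reflects the topics it is connected to, which is what lets a decoder condition on relations between topics rather than on a flat token sequence.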

Diversity and frequency of resistance and virulence genes in blaKPC and blaNDM co-producing Klebsiella pneumoniae strains from China

Liu, Zhang et al. 2019

IDR

Background: The emergence of blaKPC and blaNDM co-producing Klebsiella pneumoniae strains has limited the therapeutic options for clinical treatment. Understanding the diversity and frequency of resistance and virulence genes in these isolates is of great significance. Purpose: The aim of this study is to investigate the diversity and frequency of resistance and virulence genes in blaKPC and blaNDM co-producing Klebsiella pneumoniae strains. Methods and Results: In this study, 117 K. pneumoniae strains were isolated in China, of which 24 were found to be blaKPC and blaNDM co-producing, with significant resistance against almost all commonly used antibiotics. Additionally, 4 strains were hypermucoviscous and 8 showed high serum resistance. Overall, the blaSHV, blaCTX-M, tetA and sul1 resistance genes were found in 100% of the isolates, followed by blaTEM (95.8%), oqxA/B (91.7%), qnrB (87.5%), aac(6’)Ib-cr (83.3%), blaDHA (79.2%), rmtB (66.7%), qnrS (54.2%), cat (54.2%), floR (50.0%), sul2 (45.8%), cmlA (20.8%) and blaCMY (8.33%). Moreover, seven blaCTX-M subtypes [blaCTX-M-14 (n=18), blaCTX-M-3 (n=11), blaCTX-M-65 (n=4), blaCTX-M-15 (n=3), blaCTX-M-28 (n=2), blaCTX-M-55 (n=2), blaCTX-M-22 (n=1)] and six blaSHV subtypes [blaSHV-12 (n=16), blaSHV-11 (n=4), blaSHV-2a (n=1), blaSHV-1 (n=1), blaSHV-38 (n=1) and blaSHV-28 (n=1)] were detected. The frequency of virulence genes was as follows: 100% for entB, ybtS and irp, 95.8% for mrkD, 91.66% for fimH, 79.2% for iutA, 62.5% for iroBCDE, aerobactin and kfu, 66.7% for allS, 45.8% for wcaG, 37.5% for rmpA, 20.8% for pagO and 16.7% for magA. Conclusion: From this study, we concluded that the blaKPC and blaNDM co-producing Klebsiella pneumoniae strains have a high diversity and frequency of resistance and virulence genes. This study may offer hospitals important information about the control of infections caused by blaKPC and blaNDM co-producing Klebsiella pneumoniae.
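
The reported frequencies are consistent with whole-number isolate counts out of the 24 co-producing strains. The sketch below shows that arithmetic. The counts are back-calculated from the percentages and are an assumption, since the abstract gives percentages only; the helper name `pct` is hypothetical.

```python
TOTAL_ISOLATES = 24  # co-producing strains reported in the abstract


def pct(count, total=TOTAL_ISOLATES):
    """Frequency of a gene as a percentage of isolates, rounded to one decimal."""
    return round(100 * count / total, 1)


# Assumed counts, inferred from the reported percentages (not stated in the abstract).
assumed_counts = {"blaTEM": 23, "oqxA/B": 22, "rmtB": 16, "cmlA": 5}

for gene, count in assumed_counts.items():
    print(f"{gene}: {count}/{TOTAL_ISOLATES} = {pct(count)}%")
```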

Learning to Generate Questions by Learning What not to Generate