Florian Boudin scite author profile

Combining classifiers for robust PICO element detection

Background: Formulating a clinical information need in terms of its four atomic parts, Population/Problem, Intervention, Comparison and Outcome (known as PICO elements), facilitates searching for a precise answer within a large medical citation database. However, using PICO-defined items in the information retrieval process requires the search engine to detect and index PICO elements in the collection so that the system can retrieve relevant documents.

Methods: In this study, we tested multiple supervised classification algorithms and their combinations for detecting PICO elements within medical abstracts. Using the structural descriptors that are embedded in some medical abstracts, we automatically gathered large training/testing data sets for each PICO element.

Results: Combining multiple classifiers using a weighted linear combination of their prediction scores achieves promising results, with F-measure scores of 86.3% for P, 67% for I and 56.6% for O.

Conclusions: Our experiments on the identification of PICO elements showed that the task is very challenging. Nevertheless, the performance achieved by our identification method is competitive with previously published results and shows that this task can be achieved with high accuracy for the P element but lower accuracy for the I and O elements.
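The weighted linear combination of prediction scores mentioned in the abstract can be illustrated with a minimal sketch. The function names, the example scores, the weights, and the 0.5 decision threshold are all assumptions made for illustration; the abstract does not state which classifiers were combined or how the weights were chosen.

```python
from typing import Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted linear combination of per-classifier scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def is_pico_element(scores: Sequence[float], weights: Sequence[float],
                    threshold: float = 0.5) -> bool:
    """Label a sentence as the target element if the combined score is high enough.

    The threshold value is a hypothetical choice for this sketch.
    """
    return combine_scores(scores, weights) >= threshold


# Hypothetical scores from three classifiers for one candidate "P" sentence.
scores = [0.9, 0.6, 0.3]
weights = [0.5, 0.3, 0.2]  # assumed weights, e.g. tuned on held-out data

combined = combine_scores(scores, weights)  # approximately 0.69
print(combined, is_pico_element(scores, weights))
```

In practice the weights would be fitted per element (P, I, O) on validation data rather than set by hand.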

Unsupervised Keyphrase Extraction with Multipartite Graphs

Boudin

2018

150

We propose an unsupervised keyphrase extraction model that encodes topical information within a multipartite graph structure. Our model represents keyphrase candidates and topics in a single graph and exploits their mutually reinforcing relationship to improve candidate ranking. We further introduce a novel mechanism to incorporate keyphrase selection preferences into the model. Experiments conducted on three widely used datasets show significant improvements over state-of-the-art graph-based models.
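As a rough illustration of the multipartite idea, the sketch below links keyphrase candidates only when they belong to different topics and then ranks them with a plain weighted PageRank. The function names, the uniform edge weights, the example topics, and the damping factor are assumptions for this sketch; the abstract does not specify the weighting scheme or the selection-preference mechanism, so neither is reproduced here.

```python
from itertools import combinations


def build_multipartite_edges(topics):
    """Connect every pair of candidates that belong to different topics.

    topics: dict mapping a topic id to a list of candidate phrases.
    Uniform edge weights (1.0) are an assumption made for this sketch.
    """
    topic_of = {cand: t for t, cands in topics.items() for cand in cands}
    edges = {}
    for a, b in combinations(sorted(topic_of), 2):
        if topic_of[a] != topic_of[b]:  # no edges inside a topic
            edges[(a, b)] = 1.0
    return edges


def rank_candidates(edges, damping=0.85, iterations=50):
    """Score nodes with a plain weighted PageRank (power iteration)."""
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    n = len(neighbours)
    score = {node: 1.0 / n for node in neighbours}
    for _ in range(iterations):
        score = {
            node: (1 - damping) / n + damping * sum(
                score[m] * w / sum(neighbours[m].values())
                for m, w in links.items()
            )
            for node, links in neighbours.items()
        }
    return score


# Hypothetical candidates grouped into three topics.
topics = {
    0: ["keyphrase", "keyphrase extraction"],
    1: ["graph"],
    2: ["topic", "topical information"],
}
edges = build_multipartite_edges(topics)
scores = rank_candidates(edges)
best = max(scores, key=scores.get)
print(best)
```

With uniform weights, a candidate's score depends only on how many candidates sit outside its own topic, so this toy graph mainly shows the structure; a real model would weight edges using information from the document.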

Concept-based Summarization using Integer Linear Programming: From Concept Pruning to Multiple Optimal Solutions

Boudin, Mougard, Favre

2015

In concept-based summarization, sentence selection is modelled as a budgeted maximum coverage problem. As this problem is NP-hard, pruning low-weight concepts is required for the solver to find optimal solutions efficiently. This work shows that reducing the number of concepts in the model leads to lower ROUGE scores, and more importantly to the presence of multiple optimal solutions. We address these issues by extending the model to provide a single optimal solution, and eliminate the need for concept pruning using an approximation algorithm that achieves comparable performance to exact inference.
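To make the budgeted maximum coverage formulation concrete, here is a minimal greedy sketch: each sentence has a length and a set of concepts, each concept has a weight, and the summary must fit a length budget. This greedy heuristic is a generic stand-in chosen for illustration, not the approximation algorithm or the ILP model from the paper; the function name, the data, and the weights are hypothetical.

```python
def greedy_summary(sentences, concept_weights, budget):
    """Greedily pick sentences by newly covered concept weight per unit length.

    sentences: list of (length, set_of_concepts) with positive lengths.
    Returns the selected sentence indices and the total covered weight.
    """
    selected = []
    covered = set()
    remaining = budget
    candidates = set(range(len(sentences)))
    while candidates:
        best, best_ratio = None, 0.0
        for i in sorted(candidates):
            length, concepts = sentences[i]
            if length > remaining:
                continue  # sentence does not fit in the remaining budget
            # Each concept counts once, so only uncovered concepts add gain.
            gain = sum(concept_weights.get(c, 0) for c in concepts - covered)
            ratio = gain / length
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing fits or nothing adds new coverage
        selected.append(best)
        covered |= sentences[best][1]
        remaining -= sentences[best][0]
        candidates.remove(best)
    total = sum(concept_weights.get(c, 0) for c in covered)
    return selected, total


# Hypothetical sentences: (length in words, concepts they contain).
sentences = [(5, {"a", "b"}), (4, {"b", "c"}), (3, {"c"}), (6, {"a", "d"})]
concept_weights = {"a": 3, "b": 2, "c": 1, "d": 2}

selected, total = greedy_summary(sentences, concept_weights, budget=10)
print(selected, total)  # [0, 2] 6
```

On this toy input the greedy choice covers a total weight of 6, while selecting sentences 1 and 3 fits the same budget and covers 8. That gap is the kind of difference between approximate and exact inference that the abstract refers to.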

KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents

Gallina, Boudin, Daille

2019

Keyphrase generation is the task of predicting a set of lexical units that conveys the main content of a source text. Existing datasets for keyphrase generation are only readily available for the scholarly domain and include non-expert annotations. In this paper we present KPTimes, a large-scale dataset of news texts paired with editor-curated keyphrases. Exploring the dataset, we show how editors tag documents, and how their annotations differ from those found in existing datasets. We also train and evaluate state-of-the-art neural keyphrase generation models on KPTimes to gain insights on how well they perform on the news domain. The dataset is available online at https://github.com/ygorg/KPTimes.

Improving Medical Information Retrieval with PICO Element Detection

Boudin, Shi, Nie

2010
