Matthew Shardlow scite author profile

Summary. 1. Evidence-based policy requires researchers to provide the answers to ecological questions that are of interest to policy makers. To find out what those questions are in the UK, representatives from 28 organizations involved in policy, together with scientists from 10 academic institutions, were asked to generate a list of questions from their organizations. 2. During a 2-day workshop the initial list of 1003 questions generated from consulting at least 654 policy makers and academics was used as a basis for generating a short list of 100 questions of significant policy relevance. Short-listing was decided on the basis of the preferences of the representatives from the policy-led organizations. 3. The areas covered included most major issues of environmental concern in the UK, including agriculture, marine fisheries, climate change, ecosystem function and land management. 4. The most striking outcome was the preference for general questions rather than narrow ones. The reason is that policy is driven by broad issues rather than specific ones. In contrast, scientists are frequently best equipped to answer specific questions. This means that it may be necessary to extract the underpinning specific question before researchers can proceed. Synthesis and applications. Greater communication between policy makers and scientists is required in order to ensure that applied ecologists are dealing with issues in a way that can feed into policy. It is particularly important that applied ecologists emphasize the generic value of their work wherever possible.


A Survey of Automated Text Simplification

Shardlow 2014

Special Issue

136

132


Abstract: Text simplification modifies syntax and lexicon to improve the understandability of language for an end user. This survey identifies and classifies simplification research within the period 1998-2013. Simplification can be used for many applications, including support for second language learners, preprocessing in pipelines and assistive technology. There are many approaches to the simplification task, including lexical, syntactic, statistical machine translation and hybrid techniques. This survey also explores the current challenges which this field faces. Text simplification is a non-trivial task which is rapidly growing into its own field. This survey gives an overview of contemporary research whilst taking into account the history that has brought text simplification to its current state.


Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

Korkontzelos, Nikfarjam, Shardlow et al. 2016

Journal of Biomedical Informatics

149


Identification of research hypotheses and new knowledge from scientific literature

Shardlow, Batista-Navarro, Thompson et al. 2018

BMC Med Inform Decis Mak


Background: Text mining (TM) methods have been used extensively to extract relations and events from the literature. In addition, TM techniques have been used to extract various types or dimensions of interpretative information, known as Meta-Knowledge (MK), from the context of relations and events, e.g. negation, speculation, certainty and knowledge type. However, most existing methods have focussed on the extraction of individual dimensions of MK, without investigating how they can be combined to obtain even richer contextual information. In this paper, we describe a novel, supervised method to extract new MK dimensions that encode Research Hypotheses (an author’s intended knowledge gain) and New Knowledge (an author’s findings). The method incorporates various features, including a combination of simple MK dimensions. Methods: We identify previously explored dimensions and then use a random forest to combine these with linguistic features into a classification model. To facilitate evaluation of the model, we have enriched two existing corpora annotated with relations and events, i.e., a subset of the GENIA-MK corpus and the EU-ADR corpus, by adding attributes to encode whether each relation or event corresponds to Research Hypothesis or New Knowledge. In the GENIA-MK corpus, these new attributes complement simpler MK dimensions that had previously been annotated. Results: We show that our approach is able to assign different types of MK dimensions to relations and events with a high degree of accuracy. Firstly, our method is able to improve upon the previously reported state of the art performance for an existing dimension, i.e., Knowledge Type. Secondly, we also demonstrate high F1-score in predicting the new dimensions of Research Hypothesis (GENIA: 0.914, EU-ADR: 0.802) and New Knowledge (GENIA: 0.829, EU-ADR: 0.836). Conclusion: We have presented a novel approach for predicting New Knowledge and Research Hypothesis, which combines simple MK dimensions to achieve high F1-scores. The extraction of such information is valuable for a number of practical TM applications. Electronic supplementary material: The online version of this article (10.1186/s12911-018-0639-1) contains supplementary material, which is available to authorized users.
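For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of how an F1-score is computed from classification counts. The function name `f1_score` and the counts used are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its evaluation setup.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    tp, fp and fn are the numbers of true positives, false positives
    and false negatives for the class of interest.
    """
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not from the paper).
print(round(f1_score(tp=80, fp=10, fn=20), 3))
```

Because F1 balances precision against recall, a score such as those quoted above indicates that the classifier is both finding most relevant events and making few spurious predictions.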


SemEval-2021 Task 1: Lexical Complexity Prediction

Shardlow, Evans, Paetzold et al. 2021


This paper presents the results and main findings of SemEval-2021 Task 1: Lexical Complexity Prediction. We provided participants with an augmented version of the CompLex Corpus (Shardlow et al., 2020). CompLex is an English multi-domain corpus in which words and multi-word expressions (MWEs) were annotated with respect to their complexity using a five-point Likert scale. SemEval-2021 Task 1 featured two Sub-tasks: Sub-task 1 focused on single words and Sub-task 2 focused on MWEs. The competition attracted 198 teams in total, of which 54 teams submitted official runs on the test data to Sub-task 1 and 37 to Sub-task 2.


Predicting lexical complexity in English texts: the Complex 2.0 dataset

Shardlow, Evans, Zampieri 2022

Lang Resources & Evaluation


Identifying words which may cause difficulty for a reader is an essential step in most lexical text simplification systems prior to lexical substitution and can also be used for assessing the readability of a text. This task is commonly referred to as complex word identification (CWI) and is often modelled as a supervised classification problem. For training such systems, annotated datasets in which words and sometimes multi-word expressions are labelled regarding complexity are required. In this paper we analyze previous work carried out in this task and investigate the properties of CWI datasets for English. We develop a protocol for the annotation of lexical complexity and use this to annotate a new dataset, CompLex 2.0. We present experiments using both new and old datasets to investigate the nature of lexical complexity. We found that a Likert-scale annotation protocol provides an objective setting that is superior for identifying the complexity of words compared to a binary annotation protocol. We release a new dataset using our new protocol to promote the task of Lexical Complexity Prediction.


Neural Text Simplification of Clinical Letters with a Domain Specific Phrase Table

Shardlow, Nawaz 2019


Clinical letters are infamously impenetrable for the lay patient. This work uses neural text simplification methods to automatically improve the understandability of clinical letters for patients. We take existing neural text simplification software and augment it with a new phrase table that links complex medical terminology to simpler vocabulary by mining SNOMED-CT. In an evaluation task using crowdsourcing, we show that the results of our new system are ranked easier to understand (average rank 1.93) than using the original system (2.34) without our phrase table. We also show improvement against baselines including the original text (2.79) and using the phrase table without the neural text simplification software (2.94). Our methods can easily be transferred outside of the clinical domain by using domain-appropriate resources to provide effective neural text simplification for any domain without the need for costly annotation.
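The average-rank figures quoted in this abstract can be read as follows: each crowdsourced judgement orders the competing outputs from easiest to hardest to understand, and a system's score is the mean of the positions it receives. The sketch below illustrates that calculation under stated assumptions; the function name `average_ranks`, the system labels, and the example judgements are all hypothetical, and the paper's actual aggregation procedure may differ.

```python
from collections import defaultdict


def average_ranks(judgements):
    """Return the mean rank position of each system.

    judgements is a list of rankings; each ranking lists system names
    from easiest (rank 1) to hardest to understand.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for ranking in judgements:
        for position, system in enumerate(ranking, start=1):
            totals[system] += position
            counts[system] += 1
    return {system: totals[system] / counts[system] for system in totals}


# Hypothetical judgements for illustration only (not data from the paper).
example = [
    ["nts_with_phrase_table", "nts_only", "original_text"],
    ["nts_only", "nts_with_phrase_table", "original_text"],
    ["nts_with_phrase_table", "original_text", "nts_only"],
]
for system, rank in sorted(average_ranks(example).items(), key=lambda kv: kv[1]):
    print(f"{system}: {rank:.2f}")
```

A lower average rank means annotators more often placed that system's output nearer the top, which is why 1.93 is reported as better than 2.34.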


Text mining resources for the life sciences

Przybyła, Shardlow, Aubin et al. 2016


Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability.
