A Comprehensive Typing System for Information Extraction from Clinical Narratives

Caufield, J. Harry; Zhou, Yichao; Bai, Yunsheng; Liem, David A.; Garlid, Anders O.; Chang, Kai-Wei; Sun, Yizhou; Ping, Peipei; Wang, Wei

doi:10.1101/19009118

Cited by 13 publications

(16 citation statements)

References 12 publications


“…The authors in [ 32 – 34 ] ensemble conditional random fields [ 35 ] with convolutional neural networks [ 36 ] or recurrent neural networks [ 37 ], requiring extensive human annotation effort at the training stage which is expensive and time-consuming. We thus collect datasets from multiple tasks including I2B2-2010 [ 38 ], CORD-NER [ 39 ] and MACCROBAT2018 [ 40 ] and jointly fine-tune a deep language model to encode the tokens from the social media data. One layer of the Feed Forward Network (FNN) [ 41 ] with softmax [ 42 ] takes the hidden representations of each token as input and outputs the category of this token.…”

Section: Methods (mentioning)

confidence: 99%


COVID-19 Surveiller: toward a robust and effective pandemic surveillance system based on social media mining

Jiang, Zhou, Chen, et al. 2021

Phil. Trans. R. Soc. A.

Self Cite


The outbreak of the novel coronavirus, COVID-19, has become one of the most severe pandemics in human history. In this paper, we propose to leverage social media users as social sensors to simultaneously predict the pandemic trends and suggest potential risk factors for public health experts to understand spread situations and recommend proper interventions. More precisely, we develop novel deep learning models to recognize important entities and their relations over time, thereby establishing dynamic heterogeneous graphs to describe the observations of social media users. A dynamic graph neural network model can then forecast the trends (e.g. newly diagnosed cases and death rates) and identify high-risk events from social media. Based on the proposed computational method, we also develop a web-based system for domain experts without any computer science background to easily interact with. We conduct extensive experiments on large-scale datasets of COVID-19 related tweets provided by Twitter, which show that our method can precisely predict the new cases and death rates. We also demonstrate the robustness of our web-based pandemic surveillance system and its ability to retrieve essential knowledge and derive accurate predictions across a variety of circumstances. Our system is also available at http://scaiweb.cs.ucla.edu/covidsurveiller/ . This article is part of the theme issue ‘Data science approaches to infectious disease surveillance’.



“…determining whether a relation exists between two recognized entities. We aggregate datasets from multiple tasks including Wiki80 [ 50 ], I2B2-2012 [ 51 ] and MACCROBAT2018 [ 40 ] to generate the positive instances, i.e. sentences containing two entities and a True relation between them.…”

Section: Methods (mentioning)

confidence: 99%

COVID-19 Surveiller: toward a robust and effective pandemic surveillance system based on social media mining

Jiang, Zhou, Chen, et al. 2021

Phil. Trans. R. Soc. A.

Self Cite


“…Without the loss of generality, we leverage BERT model to provide contextualized embeddings and learn a supervised named entity recognizer. To overcome the problem with the nonexistence of annotated tweets as training data, we collect the benchmark corpora and their annotations for multiple NER tasks, including I2B2-2010 [20], CORD-NER [78] and MACCROBAT-2018 [12]. Based on those external datasets, we jointly learn a recognition model to extract entities on the COVID-19 related tweets data.…”

Section: Constructing Dynamic Knowledge Graphs From Social Media Data (mentioning)

confidence: 99%

“…To overcome the above challenge, we convert the multi-class prediction task to a binary classification problem of only identifying the existence of a potential relationship between any entity pair in each tweet instance. We aggregate datasets from multiple tasks including Wiki80 [26], I2B2-2012 [73], and MACCROBAT-2018 [12] to create the positive training data (labeled as 'True'). In order to achieve balanced training, validation and test datasets, we apply negative sampling to create the same number of instances with the label 'False'.…”

Section: Constructing Dynamic Knowledge Graphs From Social Media Data (mentioning)

confidence: 99%

#StayHome or #Marathon? Social Media Enhanced Pandemic Surveillance on Spatial-temporal Dynamic Graphs

Zhou, Jiang, Chen, et al. 2021

Preprint

Self Cite


COVID-19 has caused lasting damage to almost every domain in public health, society, and economy. To monitor the pandemic trend, existing studies rely on the aggregation of traditional statistical models and epidemic spread theory. In other words, historical statistics of COVID-19, as well as the population mobility data, become the essential knowledge for monitoring the pandemic trend. However, these solutions can barely provide precise prediction and satisfactory explanations on the long-term disease surveillance while the ubiquitous social media resources can be the key enabler for solving this problem. For example, serious discussions may occur on social media before and after some breaking events take place. These events, such as marathon and parade, may impact the spread of the virus. To take advantage of the social media data, we propose a novel framework, Social Media enhAnced pandemic suRveillance Technique (SMART), which is composed of two modules: (i) information extraction module to construct heterogeneous knowledge graphs based on the extracted events and relationships among them; (ii) time series prediction module to provide both short-term and long-term forecasts of the confirmed cases and fatality at the state-level in the United States and to discover risk factors for COVID-19 interventions. Extensive experiments show that our method largely outperforms the state-of-the-art baselines by 7.3% and 7.4% in confirmed case/fatality prediction, respectively. CCS Concepts: Information systems → Data mining.


“…Case reports are a time-honored means of sharing observations and insights about novel patient cases [1], [2]. As of 2020, at least 160 case report journals were in existence, with over 90% having open access policies and almost half indexed by PubMed [3].…”

Section: Introduction (mentioning)

confidence: 99%

CREATe: Clinical Report Extraction and Annotation Technology

Zhou, Zhang, Lee, et al. 2021

Preprint

Self Cite


Clinical case reports are written descriptions of the unique aspects of a particular clinical case, playing an essential role in sharing clinical experiences about atypical disease phenotypes and new therapies. However, to our knowledge, there has been no attempt to develop an end-to-end system to annotate, index, or otherwise curate these reports. In this paper, we propose a novel computational resource platform, CREATe, for extracting, indexing, and querying the contents of clinical case reports. CREATe fosters an environment of sustainable resource support and discovery, enabling researchers to overcome the challenges of information science. An online video of the demonstration can be viewed at https://youtu.be/Q8owBQYTjDc.

