Setu Shah scite author profile

Human Vaccines & Immunotherapeutics

2019

In this research, we developed a natural language processing (NLP) framework to investigate the opinions on HPV vaccination reflected on Twitter over a 10-year period (2008-2017). The NLP framework includes sentiment analysis, entity analysis, and artificial intelligence (AI)-based phrase association mining. The sentiment analysis shows how sentiment fluctuated over those 10 years. The results show that there were more negative tweets from 2008 to 2011 and from 2015 to 2016. The entity extraction and analysis help to identify the organization, geographical location, and event entities associated with the negative and positive tweets. The results show that organization entities such as FDA, CDC, and Merck occur in both negative and positive tweets in almost every year, whereas the geographical location entities mentioned in both negative and positive tweets change from year to year. This variation reflects the specific events that happened in those different locations. The objective of the AI-based phrase association mining is to identify the main topics reflected in both negative and positive tweets, along with the detailed tweet content. Through the phrase association mining, we found that the main negative topics on Twitter include "injuries", "deaths", "scandal", "safety concerns", and "adverse/side effects", whereas the main positive topics include "cervical cancers", "cervical screens", "prevents", and "vaccination campaigns". We believe the results of this research can help public health researchers better understand the nature of social media influence on HPV vaccination attitudes and develop strategies to counter the proliferation of misinformation.
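
The year-by-year sentiment comparison described in the abstract above can be illustrated with a minimal sketch. It assumes each tweet has already been labeled by some sentiment classifier; the function name, the label strings, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def negative_share_by_year(tweets):
    """Return {year: fraction of tweets labeled 'negative'}.

    `tweets` is an iterable of (year, label) pairs. The labels are assumed
    to come from an upstream sentiment classifier that is not shown here.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for year, label in tweets:
        totals[year] += 1
        if label == "negative":
            negatives[year] += 1
    return {year: negatives[year] / totals[year] for year in sorted(totals)}

# Invented sample labels, for illustration only.
SAMPLE_TWEETS = [
    (2008, "negative"), (2008, "negative"), (2008, "positive"),
    (2012, "positive"), (2012, "negative"), (2012, "positive"), (2012, "positive"),
]

shares = negative_share_by_year(SAMPLE_TWEETS)
```

Plotting `shares` against the year would give a simple view of how the proportion of negative tweets moves over time.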

Neural networks for mining the associations between diseases and symptoms in clinical notes

Luo

Kanakasabai

et al. 2018

Health Inf Sci Syst

Analyzing the narrative clinical notes in Electronic Health Records (EHRs) is challenging because of their unstructured nature. Mining the associations between the clinical concepts within the clinical notes can support physicians in making decisions and provide researchers with evidence about disease development and treatment. In this paper, in order to model and analyze disease and symptom relationships in the clinical notes, we present a concept association mining framework that is based on word embeddings learned through neural networks. The approach is tested using 154,738 clinical notes from 500 patients, which are extracted from the Indiana University Health's Electronic Health Records system. All patients are diagnosed with more than one type of disease. The results show that this concept association mining framework can identify related diseases and symptoms. We also propose a method to visualize a patient's diseases and related symptoms in chronological order. This visualization can provide physicians with an overview of the medical history of a patient and support decision making. The presented approach can also be expanded to analyze the associations of other clinical concepts, such as social history, family history, medications, etc.
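
One common way to mine associations from word embeddings is to rank candidate concepts by cosine similarity to a query concept. The sketch below illustrates that idea only; the abstract does not state which similarity measure the authors used, and the three-dimensional vectors, concept names, and function names are invented for the example rather than learned from clinical notes.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_associations(query, embeddings, candidates, top_k=3):
    """Rank candidate concepts by similarity to the query concept."""
    q = embeddings[query]
    scored = [
        (c, cosine(q, embeddings[c]))
        for c in candidates
        if c != query and c in embeddings
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors (hypothetical); real embeddings would be trained on the notes.
EMBEDDINGS = {
    "diabetes": [0.9, 0.1, 0.0],
    "polyuria": [0.8, 0.2, 0.1],
    "fatigue": [0.5, 0.5, 0.2],
    "wheezing": [0.0, 0.1, 0.9],
}
SYMPTOMS = ["polyuria", "fatigue", "wheezing"]

ranked = rank_associations("diabetes", EMBEDDINGS, SYMPTOMS)
```

With these toy vectors, `ranked` lists the symptoms from most to least similar to "diabetes", with each entry pairing a symptom name with its similarity score.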

Comparison of deep learning based concept representations for biomedical document clustering

Luo

2018

In this research, document representations based on distributed representations of the concepts along with new weighting schemes for the documents are explored. The baseline weighting scheme is the traditional Term Frequency-Inverse Document Frequency (TF-IDF) of the concepts, whereas the two newly proposed schemes consider both local content using the TF-IDF and associations between concepts. The distributed representations of the concepts are learned using a deep learning algorithm. The evaluation of the proposed document representations is based on the k-means clustering results. The results show that the document representation based on TF-IDF in combination with the term-based distributed representations for concepts outperforms the other two on the reported evaluation metrics: F1-measure (80.21%) and Purity (77.1%).
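
To make the ingredients above concrete, the sketch below builds a document vector as a TF-IDF-weighted average of concept embeddings and computes cluster purity. It is an illustration under stated assumptions, not the weighting schemes proposed in the paper: the TF-IDF variant (relative term frequency times log(N/df)), the averaging step, and all function names are choices made for this example, and the k-means step is omitted.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Per-document TF-IDF weights; each doc is a list of concept strings."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency counts each concept once
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {c: (tf[c] / len(doc)) * math.log(n_docs / df[c]) for c in tf}
        )
    return weights

def document_vector(weights, embeddings, dim):
    """TF-IDF-weighted average of the embeddings of a document's concepts."""
    vec = [0.0] * dim
    total = 0.0
    for concept, w in weights.items():
        if concept not in embeddings:
            continue  # skip concepts without a learned vector
        for i, x in enumerate(embeddings[concept]):
            vec[i] += w * x
        total += w
    return [x / total for x in vec] if total > 0 else vec

def purity(cluster_ids, labels):
    """Fraction of documents that carry their cluster's majority label."""
    by_cluster = {}
    for cid, label in zip(cluster_ids, labels):
        by_cluster.setdefault(cid, Counter())[label] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(labels)

# Toy corpus and 2-d embeddings, invented for illustration.
DOCS = [["asthma", "wheezing", "asthma"], ["diabetes", "insulin"], ["asthma", "insulin"]]
EMB = {"asthma": [1.0, 0.0], "wheezing": [0.0, 1.0],
       "diabetes": [0.5, 0.5], "insulin": [0.2, 0.8]}

WEIGHTS = tfidf_weights(DOCS)
doc0_vec = document_vector(WEIGHTS[0], EMB, dim=2)
example_purity = purity([0, 0, 1, 1], ["a", "a", "b", "a"])
```

The resulting vectors could then be passed to any k-means implementation, and `purity` applied to the cluster assignments against known document categories.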

Differential Learning for Outliers: A Case Study of Water Demand Prediction

Miled

Schaefer

et al. 2018

Applied Sciences

Predicting water demand is becoming increasingly critical because of the scarcity of this natural resource. The subject has been the focus of numerous studies by researchers around the world. Several models have been proposed that are able to predict water demand using both statistical and machine learning techniques. These models have successfully identified features that can impact water demand trends for rural and metropolitan areas. However, while the above models, including recurrent network models proposed by the authors, are able to predict normal water demand, most have difficulty estimating potential deviations from the norm. Outliers in water demand can be due to various reasons, including high temperatures and voluntary or mandatory consumption restrictions by the water utility companies. Estimating these deviations is necessary, especially for water utility companies with a small service footprint, in order to efficiently plan water distribution. This paper proposes a differential learning model that can help model both over-consumption and under-consumption. The proposed differential model builds on a previously proposed recurrent neural network model that was successfully used to predict water demand in central Indiana.