“…Search query logs have been used to forecast the spreading of influenza epidemics [15]. Microblogging platforms like Twitter have been used to monitor the public health discourse [16,17] and to estimate the incidence of a wide range of pathologies from mental illnesses [18] to obesity [19].…”
Background: Conditions associated with the metabolic syndrome have an enormous impact on people's health. It is estimated that more than 300k premature deaths in Europe are caused by obesity and that more than half of European citizens will be obese by 2050. That represents a heavy burden on healthcare, with more than €70B spent every year in Europe. Since healthy eating is one of the most effective interventions to counter the risks of metabolic syndrome, monitoring food consumption at scale is key for effective prevention. Traditional nutrition studies are costly and, most often, of limited scale. To partly address that, researchers have resorted to digital data to infer what people eat and to estimate how that relates to their health.
“…Several authors explored tweet content using latent Dirichlet allocation (LDA) to identify health topics, including tobacco use [17], seasonal influenza and allergies [18], and childhood obesity [19]. Alternative data sources can be explored, e.g., Sullivan et al built a scoring system for food supplements based on the analysis of user comments on amazon.com [20].…”
Section: Prior Work (mentioning)
confidence: 99%
“…The original version of LDA modeling proposed by Blei et al [22] has been widely used, e.g., [8,17,19,21,23]. Paul and Dredze developed extensions of the LDA model [18,24,25].…”
Section: Prior Work (mentioning)
confidence: 99%
“…With the aim of optimizing interpretability and semantic coherence of topics, as done in Prier et al [17], we considered a message significantly associated with a topic when at least 25% of the tokens it contained were associated with this topic. The 25% threshold was set empirically.…”
Section: LDA Modeling (mentioning)
confidence: 99%
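The 25% token rule quoted in the LDA modeling excerpt above can be illustrated with a minimal sketch. It assumes that each token of a message has already been assigned a single topic id (for example, from a fitted LDA model); the excerpt does not say how those assignments were obtained, and the function name `topics_for_message` and the sample data are hypothetical, not taken from the cited paper.

```python
from typing import Dict, List


def topics_for_message(token_topics: List[int], threshold: float = 0.25) -> List[int]:
    """Return the topics that cover at least `threshold` of a message's tokens.

    `token_topics` holds one topic id per token (an assumed input format).
    """
    if not token_topics:
        return []  # an empty message is not associated with any topic

    # Count how many tokens were assigned to each topic.
    counts: Dict[int, int] = {}
    for topic in token_topics:
        counts[topic] = counts.get(topic, 0) + 1

    total = len(token_topics)
    # Keep topics whose share of tokens reaches the threshold (25% by default).
    return sorted(t for t, c in counts.items() if c / total >= threshold)


# Hypothetical per-token topic assignments for one 8-token message.
example = [2, 2, 2, 5, 5, 7, 7, 1]
print(topics_for_message(example))  # [2, 5, 7]
```

Under this rule a message can be associated with several topics at once, or with none when its tokens are spread thinly across many topics.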
“…Such an empirical approach in the application of these methods is frequently reported in the literature; e.g., Prier et al [17] set a suitable number of topics for their corpus by testing thresholds set every 50 topics.…”
Background: Medication non-adherence is a major impediment to the management of many health conditions. A better understanding of the factors underlying non-compliance with treatment may help health professionals to address it. Patients use social media to share their experiences regarding their treatments and their diseases. Topic models make it possible to model the themes present in a collection of posts and thus to identify cases of non-compliance. Objective: Our study aims to detect messages describing patients' non-compliant behaviors associated with a drug of interest. Thus, our objective is the clustering of posts featuring a homogeneous vocabulary related to non-adherent attitudes. Methods: We implemented a probabilistic topic model in order to identify the topics that occurred in a corpus of online messages posted between 2004 and 2013 on three of the most popular French forums. Data were collected using a Web crawler designed by Kappa Santé as part of the Detec't project to analyze social media for drug safety. Several topics were related to non-compliance with treatment. Results: Starting from a corpus of 3,650 posts related to an antidepressant drug (escitalopram) and 2,164 posts related to an antipsychotic drug (aripiprazole), the use of latent Dirichlet allocation allowed us to model several themes, including interruptions of treatment and changes in dosage. The topic model approach detected cases of non-compliant behavior with a recall of 98.5% and a precision of 32.6%. Conclusions: Topic models enabled us to explore patients' discussions on community websites and to identify posts related to non-compliant behaviors. After a manual review of the messages in the non-compliance topics, we found that non-compliance with treatment was present in almost 6% of the posts.
Social media-based digital epidemiology has the potential to support faster response to, and deeper understanding of, public health-related threats. This study proposes a new framework to analyze unstructured health-related textual data from Twitter users' posts (tweets) to characterize the negative health sentiments and non-health-related concerns in relation to the corpus of negative sentiments regarding diet, diabetes, exercise and obesity (DDEO). Through the collection of six million tweets over one month, this study identified the prominent topics of users as they relate to the negative sentiments. Our proposed framework uses two text mining methods, sentiment analysis and topic modeling, to discover negative topics. The negative sentiments of Twitter users support the literature narratives and the many morbidity issues that are associated with DDEO and the linkage between obesity and diabetes. The framework offers a potential method to understand the public's opinions and sentiments regarding DDEO. More importantly, this research provides new opportunities for computational social scientists, medical experts and public health professionals to collectively address DDEO-related issues.