BioNLP 2017
DOI: 10.18653/v1/w17-2316

Detecting Personal Medication Intake in Twitter: An Annotated Corpus and Baseline Classification System

Abstract: Social media sites (e.g., Twitter) have been used for surveillance of drug safety at the population level, but studies that focus on the effects of medications on specific sets of individuals have had to rely on other sources of data. Mining social media data for this information would require the ability to distinguish indications of personal medication intake in this media. Towards that end, this paper presents an annotated corpus that can be used to train machine learning systems to determine whether a twee…

Cited by 34 publications (37 citation statements)
References 19 publications (23 reference statements)
“…We collected all the data from Twitter via the public streaming API, using generic and trade names for medications, along with their common misspellings, totaling over 250 keywords. For subtasks-1 and -2, the annotated datasets for training were made available to the public with our prior publications,21,27,28 while subtask-3 included previously unpublished data. Evaluation data were not made public at the time of the workshop.…”
Section: Methods (mentioning)
confidence: 99%
“…Since there are no trainable datasets that we could make use of, we created a dataset utilizing annotated datasets from different sources [18][19][20][21]. We emphasize that we did not annotate or create any annotated set of tweets ourselves.…”
Section: Classification (mentioning)
confidence: 99%
“…We collected 259,042 tweets that only have drug strings from multiple papers on pharmacovigilance using social media [18][19][20][21] and downloaded all the tweets available through them. These tweets were annotated by different annotators as part of their research.…”
Section: Classical Models (mentioning)
confidence: 99%
“…In the health sector, twitter was used, for example, to monitor and predict the spread of influenza [3]-[5]. It was also used to monitor the adverse effect of medications in [6] and [7], track medication adherence in [8] and for the understanding of the well-being of military populations in [9].…”
Section: Related Work (mentioning)
confidence: 99%
“…Detecting and subsequently eliminating the irrelevant tweets can be achieved by using a classifier. Several classifiers were used in [8] for the study of medication adherence using Twitter. These classifiers include Bayesian networks, random forests, logistic regression, and support vector machines (SVM).…”
Section: Related Work (mentioning)
confidence: 99%