2019
DOI: 10.1101/859611
Preprint

Mining Archive.org’s Twitter Stream Grab for Pharmacovigilance Research Gold

Abstract: In the last few years, Twitter has become an important resource for the identification of Adverse Drug Reactions (ADRs), monitoring flu trends, and other pharmacovigilance and general research applications. Most researchers spend their time crawling Twitter, buying expensive pre-mined datasets, or tediously and slowly building datasets using the limited Twitter API. However, there are a large number of datasets that are publicly available to researchers which are underutilized or unused. In this work, we demons…

Cited by 10 publications (17 citation statements)
References 26 publications
“…https://doi.org/10.5808/GI.2020.18.2.e16 uses like Pharmacovigilance [11] among others. This initial version release of SMMT will continue growing with additional tools being developed for platforms like Reddit, Dark Web forums, and other social media data sources.…”
Section: Discussion (mentioning)
confidence: 99%
“…The tools part of SMMT allow users to simplify their research workflows and to focus on determining which data they want to use and the analyses they want to perform, rather than deciphering how to acquire the data. While most cutting-edge and near real-time research will be done pulling tweets from the Twitter API stream, there are countless datasets available for historical research, from large general purpose databases like the Internet Archive’s Twitter Stream Grab dataset [23], which consists of data from 2014 to 2019, to more specialized and pre-curated datasets for uses like Pharmacovigilance [11] among others. This initial version release of SMMT will continue growing with additional tools being developed for platforms like Reddit, Dark Web forums, and other social media data sources.…”
Section: Discussion (mentioning)
confidence: 99%
“…We utilized the largest available Covid-19 dataset (Banda et al, 2020) curated using a Social Media Mining Toolkit (SMMT) (Tekumalla and Banda, 2020b). Version 15 of the Covid-19 dataset was utilized for our experiments since it was the latest released version at the time of experiments.…”
Section: Introduction (mentioning)
confidence: 99%
“…This dataset consists of tweets related to COVID-19 from January 1, 2020 to June 20, 2020. We automatically tagged ~424 million tweets using a drug dictionary compiled from RxNorm (National Library of Medicine, 2008) with 19,643 terms and validated in (Tekumalla et al, 2020) and (Tekumalla and Banda, 2020a).…”
Section: Introduction (mentioning)
confidence: 99%