2008
DOI: 10.1093/bioinformatics/btn534

BioCaster: detecting public health rumors with a Web-based text mining system

Abstract: Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal-coding systems is contained in the freely available BioCaster ontology which includes i…

citations
Cited by 203 publications
(147 citation statements)
references
References 3 publications
(3 reference statements)
“…For example, we exclude from detailed analysis work that provides only alerts [39,40], measures public perception of a disease [41], includes disease dynamics in its model [42], evaluates a third-party method [43], uses non-single-source data feeds [39,44], or crowd-sources health-related data (participatory disease surveillance) [45,46]. We also focus on work that estimates biologically-rooted metrics.…”
Section: Author Summary (mentioning)
confidence: 99%
“…There are encouraging signs, however, that informal data can be used reliably, as preliminary experiments from DIZIE's development team have verified that respiratory data in the U.S. from Twitter correlates well with data from the CDC [70]. Second, challenges arise in extracting and integrating information from data in ‘different file formats, schemas, naming systems’ [71] and languages, and are subjects being discussed in several studies [72,73]. Third, existing ethical issues are rarely explored in passive data collection (e.g., from Twitter).…”
Section: Health Data Collection (mentioning)
confidence: 99%
“…Different approaches are based on the extraction of information available in Web documents (news, reports, and so forth) in order to predict knowledge [4][5][6].…”
Section: State-of-the-art (mentioning)
confidence: 99%