Since 2013, the French Animal Health Epidemic Intelligence System (in French: Veille Sanitaire Internationale, VSI) has been monitoring signals of the emergence of new and exotic animal infectious diseases worldwide. Once signals are detected, the VSI team verifies them and issues early warning reports to French animal health authorities when they indicate potential threats to France. To improve detection of signals from online news sources, we designed the Platform for Automated extraction of Disease Information from the web (PADI-web). PADI-web automatically collects, processes and extracts English-language epidemiological information from Google News. The core component of PADI-web is a combined information extraction (IE) method founded on rule-based systems and data mining techniques. The IE approach allows extraction of key information on diseases, locations, dates, hosts and the number of cases mentioned in the news. We evaluated the combined IE method on a dataset of 352 disease-related news reports mentioning the diseases involved, locations, dates, hosts and the number of cases. The combined IE method achieved F-scores of 95% for diseases and for hosts, 85% for the number of cases, 83% for dates and 80% for locations in the disease-related news. We assessed the sensitivity of PADI-web in detecting primary outbreaks of four emerging animal infectious diseases notifiable to the World Organisation for Animal Health (OIE). From January to June 2016, PADI-web detected signals for 64% of all primary outbreaks of African swine fever, 53% of avian influenza, 25% of bluetongue and 19% of foot-and-mouth disease. PADI-web detected primary outbreaks of avian influenza and foot-and-mouth disease in Asia in a timely manner, i.e. 8 and 3 days, respectively, before immediate notification to OIE.
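For readers unfamiliar with the F-score reported in this evaluation, the following is a minimal sketch of the standard F1 definition (the harmonic mean of precision and recall). The function name and the counts in the usage line are illustrative assumptions; they are not taken from the PADI-web evaluation, and the abstract does not state which F-score variant was used.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    The counts are hypothetical inputs, e.g. the number of extracted
    entities that were correct, spurious, or missed.
    """
    # Precision: share of extracted entities that are correct.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of expected entities that were extracted.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 8 correct, 2 spurious, 2 missed extractions.
example = f_score(8, 2, 2)
```

With these made-up counts, precision and recall are both 0.8, so the F1 score is also 0.8.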
PADI-web (Platform for Automated extraction of animal Disease Information from the web) is a biosurveillance system dedicated to monitoring online news sources for the detection of emerging animal infectious diseases. PADI-web has collected more than 380,000 news articles since 2016. Unlike other existing biosurveillance tools, PADI-web focuses specifically on animal health and has a fully automated pipeline based on machine-learning methods. This paper presents the new functionalities of PADI-web based on the integration of: (i) a new fine-grained classification system, (ii) automatic methods to extract terms and named entities with text-mining approaches, (iii) semantic resources for indexing keywords and (iv) a notification system for end-users. PADI-web, which is integrated in the French Platform for Animal Health Surveillance (ESA Platform), thus offers strong coverage of the animal sector, a multilingual approach, an automated information extraction module and a notification tool configurable according to end-user needs.
Event‐based surveillance (EBS) systems monitor a broad range of information sources to detect early signals of disease emergence, including new and unknown diseases. In December 2019, a newly identified coronavirus emerged in Wuhan (China), causing a global coronavirus disease (COVID‐19) pandemic. A retrospective study was conducted to evaluate the capacity of three EBS systems (ProMED, HealthMap and PADI‐web) to detect early COVID‐19 emergence signals. We focused on changes in online news vocabulary over the periods before and after the identification of COVID‐19, while also assessing its contagiousness and pandemic potential. ProMED was the timeliest EBS system, detecting signals one day before the official notification. At this early stage, the specific vocabulary used was related to ‘pneumonia symptoms’ and ‘mystery illness’. Once COVID‐19 was identified, the vocabulary changed to virus family and specific COVID‐19 acronyms. Our results suggest that the three EBS systems are complementary regarding data sources, and that all require timeliness improvements. EBS methods should be adapted to the different stages of disease emergence to enhance early detection of future unknown disease outbreaks.
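Timeliness comparisons of this kind reduce to a date difference between the first signal detected by a system and the official notification. The sketch below illustrates that calculation; the function name is an assumption, and the dates in the usage line are placeholders, not the dates from the study.

```python
from datetime import date


def lead_time_days(first_signal: date, official_notification: date) -> int:
    """Days by which a signal preceded the official notification.

    Positive values mean the system detected the signal before the
    notification; negative values mean it was detected afterwards.
    """
    return (official_notification - first_signal).days


# Placeholder dates for illustration only (not taken from the study).
example_lead = lead_time_days(date(2000, 1, 9), date(2000, 1, 10))
```

With these placeholder dates the lead time is 1 day, which corresponds to a signal detected one day before notification.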
Event-based surveillance (EBS) gathers information from a variety of data sources, including online news articles. Unlike the data from formal reporting, EBS data are unstructured, and their interpretation can overwhelm epidemic intelligence (EI) capacities in terms of available human resources. Therefore, diverse EBS systems that automatically process all or part of the unstructured data acquired from online news articles have been developed. These EBS systems (e.g., GPHIN, HealthMap, MedISys, ProMED, PADI-web) can use annotated data to improve their performance. This paper describes a framework for the annotation of epidemiological information in animal disease-related news articles. We provide annotation guidelines that are generic and applicable to both animal and zoonotic infectious diseases, regardless of the pathogen involved or its mode of transmission (e.g., vector-borne, airborne, by contact). The framework relies on the successive annotation of all the sentences from a news article. The annotator evaluates the sentences in a specific epidemiological context, corresponding to the publication date of the news article.
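As a rough illustration of sentence-by-sentence annotation tied to a publication date, the following sketch prepares an article for annotation. All class, field and function names are hypothetical, the sentence splitter is deliberately naive, and the label field is left empty because the abstract does not list the framework's actual label set.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SentenceAnnotation:
    index: int                   # position of the sentence in the article
    text: str
    label: Optional[str] = None  # hypothetical slot, filled in by the annotator


@dataclass
class AnnotatedArticle:
    publication_date: date       # epidemiological context for the annotator
    sentences: List[SentenceAnnotation] = field(default_factory=list)


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def prepare_article(text: str, publication_date: date) -> AnnotatedArticle:
    # One unlabeled record per sentence, in reading order.
    records = [SentenceAnnotation(i, s) for i, s in enumerate(split_sentences(text))]
    return AnnotatedArticle(publication_date, records)


# Made-up article text and date, for illustration only.
article = prepare_article("An outbreak was reported. Ten cattle died!", date(2000, 1, 1))
```

Keeping the publication date on the article rather than on each sentence reflects the idea that every sentence is judged against the same temporal context.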