The application of serial principled sampling designs for diagnostic testing is often viewed as an ideal approach to monitoring prevalence and case counts of infectious or chronic diseases. Considering logistics and the need for timeliness and conservation of resources, surveillance efforts can generally benefit from creative designs and accompanying statistical methods to improve the precision of sampling-based estimates and reduce the size of the necessary sample. One option is to augment the analysis with available data from other surveillance streams that identify cases from the population of interest over the same timeframe, but may do so in a highly nonrepresentative manner. We consider monitoring a closed population (e.g., a long-term care facility, patient registry, or community), and encourage the use of capture–recapture methodology to produce an alternative case total estimate to the one obtained by principled sampling. With care in its implementation, even a relatively small simple or stratified random sample not only provides its own valid estimate, but provides the only fully defensible means of justifying a second estimate based on classical capture–recapture methods. We initially propose weighted averaging of the two estimators to achieve greater precision than can be obtained using either alone, and then show how a novel single capture–recapture estimator provides a unified and preferable alternative. We develop a variant on a Dirichlet-multinomial-based credible interval to accompany our hybrid design-based case count estimates, with a view toward improved coverage properties. Finally, we demonstrate the benefits of the approach through simulations designed to mimic an acute infectious disease daily monitoring program or an annual surveillance program to quantify new cases within a fixed patient registry.
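The weighted-averaging idea in the abstract above can be illustrated with a generic sketch. The abstract does not state the authors' weights or their novel single estimator, so everything here is an assumption: Chapman's bias-corrected estimator stands in for the "classical" capture–recapture estimate, the random-sample estimate uses a finite population correction, and the two are combined by inverse-variance weighting as if they were independent. The function names and example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float      # estimated case total
    variance: float   # estimated variance of that total


def chapman(n1: int, n2: int, m: int) -> Estimate:
    """Chapman's bias-corrected two-stream capture-recapture estimate.

    n1, n2: cases found by stream 1 and stream 2; m: cases found by both.
    """
    value = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    variance = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / (
        (m + 1) ** 2 * (m + 2)
    )
    return Estimate(value, variance)


def random_sample_total(pop_size: int, n_sampled: int, n_positive: int) -> Estimate:
    """Case total from a simple random sample of a closed population."""
    p = n_positive / n_sampled
    fpc = 1 - n_sampled / pop_size            # finite population correction
    var_p = fpc * p * (1 - p) / (n_sampled - 1)
    return Estimate(pop_size * p, pop_size ** 2 * var_p)


def inverse_variance_average(a: Estimate, b: Estimate) -> Estimate:
    """Combine two estimates, weighting each by the other's variance.

    Assumption: the two estimates are treated as independent, which is a
    simplification when the random sample is itself one of the streams.
    """
    w = b.variance / (a.variance + b.variance)
    value = w * a.value + (1 - w) * b.value
    variance = a.variance * b.variance / (a.variance + b.variance)
    return Estimate(value, variance)


# Hypothetical counts, for illustration only.
cr = chapman(n1=50, n2=40, m=20)
rs = random_sample_total(pop_size=1000, n_sampled=100, n_positive=10)
combined = inverse_variance_average(cr, rs)
print(cr, rs, combined)
```

Under the independence assumption, the combined variance is never larger than the smaller of the two input variances, which is the motivation for averaging in the first place.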
In the United States, COVID-19 has become a leading cause of death since 2020. However, the number of COVID-19 deaths reported from death certificates is likely to represent an underestimate of the total deaths related to SARS-CoV-2 infections. Estimating those deaths not captured through death certificates is important to understanding the full burden of COVID-19 on mortality. In this work, we explored enhancements to an existing approach by employing Bayesian hierarchical models to estimate unrecognized deaths attributed to COVID-19 using weekly state-level COVID-19 viral surveillance and mortality data in the United States from March 2020 to April 2021. We demonstrated our model using those aged years who died. First, we used a spatial–temporal binomial regression model to estimate the percent of positive SARS-CoV-2 test results. A spatial–temporal negative-binomial model was then used to estimate unrecognized COVID-19 deaths by exploiting the spatial–temporal association between SARS-CoV-2 percent positive and all-cause mortality counts using an excess mortality approach. Computationally efficient Bayesian inference was accomplished via the Polya-Gamma representation of the binomial and negative-binomial models. Among those aged years, we estimated 58,200 (95% CI: 51,300, 64,900) unrecognized COVID-19 deaths, which accounts for 26% (95% CI: 24%, 29%) of total COVID-19 deaths in this age group. Our modeling results suggest that COVID-19 mortality and the proportion of unrecognized deaths among deaths attributed to COVID-19 vary by time and across states.
Capture-recapture methods are widely applied in estimating the number (N) of prevalent or cumulatively incident cases in disease surveillance. Here, we focus the bulk of our attention on the common case in which there are two data streams. We propose a sensitivity and uncertainty analysis framework grounded in multinomial distribution-based maximum likelihood, hinging on a key dependence parameter that is typically non-identifiable but is epidemiologically interpretable. Focusing on the epidemiologically meaningful parameter unlocks appealing data visualizations for sensitivity analysis and provides an intuitively accessible framework for uncertainty analysis designed to leverage the practicing epidemiologist's understanding of the implementation of the surveillance streams as the basis for assumptions driving estimation of N. By illustrating the proposed sensitivity analysis using publicly available HIV surveillance data, we emphasize both the need to admit the lack of information in the observed data and the appeal of incorporating expert opinion about the key dependence parameter. The proposed uncertainty analysis is an empirical Bayes-like approach designed to more realistically acknowledge variability in the estimated N associated with uncertainty in an expert's opinion about the non-identifiable parameter, together with the statistical uncertainty. We demonstrate how such an approach can also facilitate an appealing general interval estimation procedure to accompany capture-recapture methods. Simulation studies illustrate the reliable performance of the proposed approach for quantifying uncertainties in estimating N in various contexts. Finally, we demonstrate how the recommended paradigm has the potential to be directly extended for application to data from more than two surveillance streams.
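A sensitivity analysis of the kind described above can be sketched for two streams. The abstract does not define its dependence parameter, so this example assumes one possible choice: `phi`, the ratio of the probability of appearing in stream 2 given capture by stream 1 to the same probability given no capture by stream 1. With that assumption, the estimate of N follows from the three observed cells once `phi` is fixed, and `phi = 1` (independent streams) recovers the Lincoln-Petersen estimate. The function names and counts are hypothetical.

```python
def n_hat_given_phi(n11: int, n10: int, n01: int, phi: float) -> float:
    """Estimate N for a fixed, user-supplied dependence ratio phi.

    n11: cases in both streams; n10: stream 1 only; n01: stream 2 only.
    Assumed definition: phi = P(in 2 | in 1) / P(in 2 | not in 1).
    Plugging in n11 / n1 for the numerator and n01 / (N - n1) for the
    denominator, then solving for N, gives the expression below.
    """
    n1 = n11 + n10
    return n1 + phi * n01 * n1 / n11


def sensitivity_sweep(n11: int, n10: int, n01: int, phis):
    """Return (phi, N-hat) pairs over a grid of plausible phi values."""
    return [(phi, n_hat_given_phi(n11, n10, n01, phi)) for phi in phis]


# Hypothetical counts: 20 cases in both streams, 30 and 20 in one stream only.
for phi, n_hat in sensitivity_sweep(20, 30, 20, [0.5, 1.0, 1.5, 2.0]):
    print(f"phi = {phi:.1f} -> estimated N = {n_hat:.1f}")
```

Because the observed data cannot identify `phi`, the sweep makes explicit how strongly the estimate depends on what an expert is willing to assume about it.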
With the aid of laboratory typing techniques, infectious disease surveillance networks have the opportunity to obtain powerful information on the emergence, circulation, and evolution of multiple genotypes, serotypes, or other subtypes of pathogens, informing understanding of transmission dynamics and strategies for prevention and control. The volume of typing performed on clinical isolates is typically limited by its ability to inform clinical care and by cost and logistical constraints, especially in comparison with the capacity to monitor clinical reports of disease occurrence, which remains the most widespread form of public health surveillance. Viewing clinical disease reports as arising from a latent mixture of pathogen subtypes, laboratory typing of a subset of clinical cases can provide inference on the proportion of clinical cases attributable to each subtype (i.e., the mixture components). Optimizing protocols for the selection of isolates for typing by weighting specific subpopulations, locations, time periods, or case characteristics (e.g., disease severity) may improve inference of the frequency and distribution of pathogen subtypes within and between populations. Here, we apply the Disease Surveillance Informatics Optimization and Simulation (DIOS) framework to simulate and optimize hand, foot, and mouth disease (HFMD) surveillance in a high-burden region of western China. We identify laboratory surveillance designs that significantly outperform the existing network: the optimal network reduced mean absolute error in estimated serotype-specific incidence rates by 14.1%; similarly, the optimal network for monitoring severe cases reduced mean absolute error in serotype-specific incidence rates by 13.3%. In both cases, the optimal network designs achieved improved inference without increasing subtyping effort.
We demonstrate how the DIOS framework can be used to optimize surveillance networks by augmenting clinical diagnostic data with limited laboratory typing resources, while adapting to specific, local surveillance objectives and constraints.