The design and interpretation of prevalence studies rely on point estimates of the performance characteristics of the diagnostic test used. When the test characteristics are not well defined and a limited number of tests are available, such as during an outbreak of a novel pathogen, tests can be used either for the field study itself or for additional validation to reduce uncertainty in the test characteristics. Because field data and validation data are based on finite samples, inferences drawn from these data carry uncertainty. In the absence of a framework to balance those uncertainties during study design, it is unclear how best to distribute tests to improve study estimates. Here, we address this gap by introducing a joint Bayesian model to simultaneously analyze lab validation and field survey data. In many scenarios, prevalence estimates can be most improved by apportioning additional effort towards validation rather than to the field. We show that a joint model provides superior estimation of prevalence, as well as sensitivity and specificity, compared with typical analyses that model lab and field data separately.
Prevalence is traditionally estimated by analyzing the outcomes from diagnostic tests given to a subset of the population. During analysis of these outcomes, the sensitivity and specificity of the test, as well as the number of samples in the survey, are incorporated into point estimates and uncertainty bounds for the true prevalence. In many cases, sensitivity and specificity are taken to be fixed characteristics of the test [1, 2]. However, sensitivity and specificity are themselves estimated from test outcomes in validation studies. As a result, they, too, carry statistical uncertainty, and that statistical uncertainty should be carried forward into estimates of prevalence [3, 4].
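To make concrete how fixed test characteristics enter a point estimate, the following minimal sketch applies the standard Rogan-Gladen correction, which inverts the relation that the expected positive fraction equals θ·se + (1 − θ)·(1 − sp). It is an illustration only and is not necessarily the estimator used in the works cited above; the function name `adjusted_prevalence` and the survey counts are hypothetical.

```python
def adjusted_prevalence(n_pos, n_field, se, sp):
    """Rogan-Gladen style point estimate of prevalence, clipped to [0, 1].

    n_pos   : number of positive results in the field survey
    n_field : number of field tests performed
    se, sp  : sensitivity and specificity, treated here as fixed and known
    """
    if se + sp <= 1.0:
        raise ValueError("the test must be informative: se + sp > 1")
    raw = n_pos / n_field                      # observed positive fraction
    est = (raw + sp - 1.0) / (se + sp - 1.0)   # invert raw = theta*se + (1-theta)*(1-sp)
    return min(1.0, max(0.0, est))             # keep the estimate a valid proportion


# Hypothetical survey: 50 positives out of 1000 field tests.
# Small changes in the assumed specificity shift the estimate substantially.
for sp in (0.99, 0.97, 0.95):
    print(sp, round(adjusted_prevalence(50, 1000, se=0.90, sp=sp), 4))
```

Because the corrected estimate moves noticeably with the assumed specificity when prevalence is low, treating se and sp as exact can understate the uncertainty in the final prevalence estimate.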
Since prevalence estimates may improve as sample size increases and with reduced uncertainty in the test characteristics, a fundamental study design question arises: given a fixed number of tests, how should one allocate them between the field and validation lab?
Here, we derive a Bayesian joint posterior distribution for prevalence and test sensitivity and specificity based on sampling models for both the field survey data and validation data. While others have demonstrated how to estimate prevalence from this model [4, 5, 6], we highlight the utility of this model for addressing the problem of how to allocate a fixed number of tests between the field and the lab to produce the best prevalence estimates. We demonstrate that, when the sensitivity and specificity of a test have not yet been well established, the largest improvement in prevalence estimates could result from allocating samples to test validation rather than to the survey. Finally, we showcase how this joint model can produce improved estimates of sensitivity and specificity compared to models based only on the lab data.
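As a rough illustration of what a joint analysis of field and validation data can look like, the sketch below draws from a joint posterior over (θ, se, sp) by importance sampling. Several choices are assumptions of this sketch rather than statements about the model derived in this paper: binomial sampling models for the field and validation counts, independent uniform priors, the importance-sampling scheme itself, and the function names `joint_posterior_draws` and `weighted_mean_sd`. All counts are hypothetical.

```python
import numpy as np


def joint_posterior_draws(n_pos, n_field, tp, n_known_pos, tn, n_known_neg,
                          n_draws=200_000, seed=0):
    """Weighted draws of (theta, se, sp) from an assumed joint posterior.

    Assumed model (uniform priors on theta, se, sp):
      tp    ~ Binomial(n_known_pos, se)   validation on known positives
      tn    ~ Binomial(n_known_neg, sp)   validation on known negatives
      n_pos ~ Binomial(n_field, theta*se + (1 - theta)*(1 - sp))
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 1.0, n_draws)                # prior draws for prevalence
    se = rng.beta(tp + 1, n_known_pos - tp + 1, n_draws)  # posterior given validation data only
    sp = rng.beta(tn + 1, n_known_neg - tn + 1, n_draws)
    # Probability that a single field test returns positive.
    p = np.clip(theta * se + (1 - theta) * (1 - sp), 1e-12, 1 - 1e-12)
    # Importance weights: field-data binomial likelihood (constant factor dropped).
    log_w = n_pos * np.log(p) + (n_field - n_pos) * np.log1p(-p)
    w = np.exp(log_w - log_w.max())
    return theta, se, sp, w / w.sum()


def weighted_mean_sd(x, w):
    """Posterior mean and standard deviation from weighted draws."""
    mean = float(np.sum(w * x))
    return mean, float(np.sqrt(np.sum(w * (x - mean) ** 2)))


# Hypothetical data: 50/1000 field positives; 27/30 known positives and
# 29/30 known negatives classified correctly in the validation lab.
theta, se, sp, w = joint_posterior_draws(50, 1000, 27, 30, 29, 30)
print("prevalence (mean, sd):", weighted_mean_sd(theta, w))
print("specificity (mean, sd):", weighted_mean_sd(sp, w))
```

Because the weights depend on the field data, the weighted draws of se and sp also reflect information from the survey, not just from the validation samples. Rerunning the sketch with different hypothetical splits of a fixed test budget between `n_field` and the validation counts gives an informal sense of the allocation trade-off.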
METHODS
Our goal is to estimate population seroprevalence (θ), test sensitivity (se), and test specificity (sp) by learning from the...