The purpose of this review is to clarify the concepts of bias, precision and accuracy as they are commonly defined in the biostatistical literature, with our focus on the use of these concepts in quantitatively testing the performance of point estimators (specifically species richness estimators). We first describe the general concepts underlying bias, precision and accuracy, and then describe a number of commonly used unscaled and scaled performance measures of bias, precision and accuracy (e.g. mean error, variance, standard deviation, mean square error, root mean square error, mean absolute error, and all their scaled counterparts) which may be used to evaluate estimator performance. We also provide mathematical formulas and a worked example for most performance measures. Since every measure of estimator performance should be viewed as suggestive, not prescriptive, we also mention several other performance measures that have been used by biostatisticians or ecologists. We then outline several guidelines for testing the performance of species richness estimators: the detailed description of data simulation models and resampling schemes, the use of real and simulated data sets on as many different estimators as possible, mathematical expressions for all estimators and performance measures, and the presentation of results for each scaled performance measure in numerical tables with increasing levels of sampling effort. We finish with a literature review of promising new research related to species richness estimation, and summarize the results of 14 studies that compared estimator performance, which confirm that with most data sets, non‐parametric estimators (mostly the Chao and jackknife estimators) perform better than other estimators, e.g. curve models or fitting species‐abundance distributions.
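The performance measures named above can be illustrated with a minimal sketch. It assumes repeated estimates of a known true richness, uses the population (divide-by-n) variance so that mean square error equals variance plus squared bias, and scales by the true value; the function name, the scaling choice and the example numbers are illustrative assumptions, not taken from the review.

```python
from math import sqrt
from statistics import mean, pvariance


def performance_measures(estimates, true_value):
    """Bias, precision and accuracy measures for repeated estimates.

    Hypothetical helper: scaling by the true value is one common
    convention and is assumed here, not prescribed by the source.
    """
    errors = [e - true_value for e in estimates]
    mean_error = mean(errors)                      # bias
    variance = pvariance(estimates)                # precision (divide-by-n form)
    mse = mean([err ** 2 for err in errors])       # accuracy
    mae = mean([abs(err) for err in errors])       # accuracy (absolute scale)
    return {
        "mean_error": mean_error,
        "variance": variance,
        "standard_deviation": sqrt(variance),
        "mean_square_error": mse,
        "root_mean_square_error": sqrt(mse),
        "mean_absolute_error": mae,
        "scaled_mean_error": mean_error / true_value,
        "scaled_root_mean_square_error": sqrt(mse) / true_value,
        "scaled_mean_absolute_error": mae / true_value,
    }


# Made-up estimates of a community with 10 species.
measures = performance_measures([8, 10, 12, 14], true_value=10)
```

With the divide-by-n variance, the decomposition MSE = variance + bias² holds exactly for these numbers, which is a useful sanity check when implementing several measures at once.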
In most real-world contexts the sampling effort needed to attain an accurate estimate of total species richness is excessive. Therefore, methods to estimate total species richness from incomplete collections need to be developed and tested. Using real and computer-simulated parasite data sets, the performances of 9 species richness estimation methods were compared. For all data sets, each estimation method was used to calculate the projected species richness at increasing levels of sampling effort. The performance of each method was evaluated by calculating the bias and precision of its estimates against the known total species richness. Performance was evaluated with increasing sampling effort and across different model communities. For the real data sets, the Chao2 and first-order jackknife estimators performed best. For the simulated data sets, the first-order jackknife estimator performed best at low sampling effort but, with increasing sampling effort, the bootstrap estimator outperformed all other estimators. Estimator performance increased with increasing species richness, aggregation level of individuals among samples and overall population size. Overall, the Chao2 and the first-order jackknife estimation methods performed best and should be used to control for the confounding effects of sampling effort in studies of parasite species richness. Potential uses of and practical problems with species richness estimation methods are discussed.
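As a rough illustration of the two estimators singled out above, the sketch below computes the classic Chao2 and first-order jackknife estimates from incidence data (which species were found in which sampling unit). The input layout, the function names and the fallback used when no species occurs in exactly two units are assumptions made for this example; the study itself may have used variant forms.

```python
def incidence_counts(samples):
    """Return {species: number of sampling units in which it occurs}."""
    counts = {}
    for unit in samples:
        for species in unit:
            counts[species] = counts.get(species, 0) + 1
    return counts


def chao2(samples):
    """Classic Chao2: S_obs + Q1^2 / (2 * Q2)."""
    counts = incidence_counts(samples)
    s_obs = len(counts)
    q1 = sum(1 for c in counts.values() if c == 1)  # uniques
    q2 = sum(1 for c in counts.values() if c == 2)  # duplicates
    if q2 == 0:
        # Assumed fallback to avoid dividing by zero: Q1 * (Q1 - 1) / 2.
        return s_obs + q1 * (q1 - 1) / 2
    return s_obs + q1 ** 2 / (2 * q2)


def jackknife1(samples):
    """First-order jackknife: S_obs + Q1 * (m - 1) / m for m sampling units."""
    counts = incidence_counts(samples)
    m = len(samples)
    q1 = sum(1 for c in counts.values() if c == 1)
    return len(counts) + q1 * (m - 1) / m


# Five made-up hosts (sampling units), each with a set of parasite species.
hosts = [{"a", "b", "c"}, {"a", "c", "d"}, {"a", "b", "e"}, {"a", "f"}, {"a", "b"}]
chao2_estimate = chao2(hosts)
jack1_estimate = jackknife1(hosts)
```

Here six species are observed, three of them in a single host and one in exactly two hosts, so both estimators add a correction for undetected species on top of the observed count.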
Recent studies have provided evolutionary explanations for much of the variation in mortality among human infectious diseases. One gap in this knowledge concerns respiratory tract pathogens transmitted from person to person by direct contact or through environmental contamination. The sit-and-wait hypothesis predicts that virulence should be positively correlated with durability in the external environment because high durability reduces the dependence of transmission on host mobility. Reviewing the epidemiological and medical literature, we confirm this prediction for respiratory tract pathogens of humans. Our results clearly distinguish a high-virulence high-survival group of variola (smallpox) virus, Mycobacterium tuberculosis, Corynebacterium diphtheriae, Bordetella pertussis, Streptococcus pneumoniae, and influenza virus (where all pathogens have a mean percent mortality ≥ 0.01% and mean survival time > 10 days) from a low-virulence low-survival group containing ten other pathogens. The correlation between virulence and durability explains three to four orders of magnitude of difference in mean percent mortality and mean survival time, using both across-species and phylogenetically controlled analyses.
Our findings bear on several areas of active research and public health policy: (1) many pathogens used in the biological control of insects are potential sit-and-wait pathogens as they combine three attributes that are advantageous for pest control: high virulence, long durability after application, and host specificity; (2) emerging pathogens such as the 'hospital superbug' methicillin-resistant Staphylococcus aureus (MRSA) and potential bioweapons pathogens such as smallpox virus and anthrax that are particularly dangerous can be discerned by quantifying their durability; (3) hospital settings and the AIDS pandemic may provide footholds for emerging sit-and-wait pathogens; and (4) studies on food-borne and insect pathogens point to future research considering the potential evolutionary trade-offs and genetic linkages between virulence and durability.
It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators.
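The traditional lower bound described above can be sketched as follows, assuming the classic form S_obs + f1²/(2·f2), where f_k is the number of species observed exactly k times. The helper names, the zero-doubleton fallback and the abundance vector are illustrative assumptions. The improved bound is not implemented here because its formula is not given in this abstract; the sketch only shows that the tripleton and quadrupleton counts it relies on come from the same frequency table.

```python
from collections import Counter


def frequency_counts(abundances):
    """Map k -> f_k, the number of species observed exactly k times."""
    return Counter(a for a in abundances if a > 0)


def chao1_lower_bound(abundances):
    """Classic Chao (1984) lower bound: S_obs + f1^2 / (2 * f2)."""
    f = frequency_counts(abundances)
    s_obs = sum(f.values())
    f1, f2 = f[1], f[2]  # singletons and doubletons
    if f2 == 0:
        # Assumed fallback to avoid dividing by zero: f1 * (f1 - 1) / 2.
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 ** 2 / (2 * f2)


# Made-up individual counts for ten observed species.
abundances = [25, 10, 4, 3, 2, 2, 1, 1, 1, 1]
f = frequency_counts(abundances)
lower_bound = chao1_lower_bound(abundances)
tripletons, quadrupletons = f[3], f[4]  # extra inputs used by the improved bound
```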
Background: Benzodiazepines are widely used medications in developed countries, particularly among elderly patients. However, benzodiazepines are known to affect memory and cognition and might thus increase the risk of dementia. The objective of this review is to synthesize evidence from observational studies that evaluated the association between benzodiazepine use and dementia risk. Summary: We performed a systematic review and meta-analysis of controlled observational studies to evaluate the association of benzodiazepine use with dementia outcome. All controlled observational studies that compared dementia outcome in patients with benzodiazepine use with a control group were included. We calculated pooled ORs using a random-effects model. Ten studies (of 3,696 studies identified) were included in the systematic review, of which 8 studies were included in random-effects meta-analysis and sensitivity analyses. Odds of dementia were 78% higher in those who used benzodiazepines compared with those who did not use benzodiazepines (OR 1.78; 95% CI 1.33-2.38). In subgroup analysis, a higher association was still found in the studies from Asia (OR 2.40; 95% CI 1.66-3.47), whereas a moderate association was observed in the studies from North America and Europe (OR 1.49; 95% CI 1.34-1.65 and OR 1.43; 95% CI 1.16-1.75). Also, diabetes, hypertension, cardiac disease, and statin drugs were associated with increased risk of dementia, but a negative association was observed for body mass index. There was significant statistical and clinical heterogeneity among studies for the main analysis and most of the sensitivity analyses. Key Messages: Our results suggest that benzodiazepine use is significantly associated with dementia risk.
However, observational studies cannot clarify whether the observed epidemiologic association is a causal effect or the result of some unmeasured confounding variable. Therefore, more research is needed.
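The arithmetic behind the pooled result can be checked with a short sketch. It assumes the reported interval is a standard Wald interval, symmetric on the log-odds scale with z = 1.96; the helper name is hypothetical, and nothing here reproduces the meta-analysis itself.

```python
from math import exp, log


def log_or_standard_error(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a Wald confidence interval."""
    return (log(upper) - log(lower)) / (2 * z)


pooled_or, ci_lower, ci_upper = 1.78, 1.33, 2.38  # values reported above

# An odds ratio of 1.78 corresponds to 78% higher odds.
percent_higher_odds = round((pooled_or - 1) * 100)

# Under the Wald assumption the point estimate sits at the geometric
# midpoint of the interval, which can be compared with the reported OR.
geometric_midpoint = exp((log(ci_lower) + log(ci_upper)) / 2)
standard_error = log_or_standard_error(ci_lower, ci_upper)
```

The geometric midpoint of the interval agrees with the reported odds ratio to two decimal places, which is what one would expect if the interval was built on the log scale.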