Estimates of genetic diversity represent a valuable resource for biodiversity assessments and are increasingly used to guide conservation and management programs. The most commonly reported estimates of DNA sequence diversity in animal populations are haplotype diversity (h) and nucleotide diversity (p) for the mitochondrial gene cytochrome c oxidase subunit I (cox1). However, several issues relevant to the comparison of h and p within and between studies remain to be assessed. We used population-level cox1 data from peer-reviewed publications to quantify the extent to which data sets can be re-assembled, to provide a standardized summary of h and p estimates, to explore the relationship between these metrics and to assess their sensitivity to under-sampling. Only 19 out of 42 selected publications had archived data that could be unambiguously re-assembled; this comprised 127 population-level data sets (nX15) from 23 animal species. Estimates of h and p were calculated using a 456-base region of cox1 that was common to all the data sets (median h¼0.70130, median p¼0.00356). Non-linear regression methods and Bayesian information criterion analysis revealed that the most parsimonious model describing the relationship between the estimates of h and p was p¼0.0081h 2 . Deviations from this model can be used to detect outliers due to biological processes or methodological issues. Subsampling analyses indicated that samples of n45 were sufficient to discriminate extremes of high from low population-level cox1 diversity, but samples of nX25 are recommended for greater accuracy.