Petrov, 2001), but the processes underlying this variability are not yet fully understood (Elliott & Gregory, 2015). Studying the mechanisms and correlates of genome size variation, such as the proliferation of repetitive elements (Blommaert et al., 2019), effective population size (Lefébure et al., 2017; Lynch & Conery, 2003), or correlation with other traits (Gardner et al., 2020; Prokopowich et al., 2003), therefore requires reliable genome size estimates for the taxon under scrutiny. This is all the more important because substantial changes in genome size may occur even among closely related sister species, that is, over relatively short evolutionary timescales (Agudo et al., 2019; Keyl, 1965; Vitales et al., 2020). An accurate estimate of genome size is also important for genomic projects. For example, in genome assembly, the proportion of the true genome size covered by a given draft is a quality criterion, and the true genome size sets an upper bound on the size of the draft. In addition, resequencing projects that require a certain sequencing depth (e.g., for genotyping) benefit from a reliable genome size estimate (Fountain et al., 2016).
Flow cytometry is generally deemed to yield reliable estimates of genome size (Doležel & Greilhuber, 2010; Johnston et al., 2019). Yet this method is not without caveats (Wang et al., 2015) and requires specialized laboratory skills and access to relatively expensive equipment. Moreover, the method depends on the availability of fresh or frozen tissue with largely intact cells, which narrows the range of taxa for which such analyses are practically feasible (Johnston et al., 2019).
Bioinformatic analysis of next-generation sequencing (NGS) data provides an alternative for estimating genome size (Vurture et al., 2017). Besides the widely used k-mer-based methods (Li & Waterman, 2003; Lipovský et al., 2017), Schell et al. (2017) introduced a very simple method for genome size estimation, relying on mapping statistics