Background: Mammalian centromeres are satellite-rich chromatin domains that execute conserved roles in kinetochore assembly and chromosome segregation. Centromere satellites evolve rapidly between species, but little is known about population-level diversity across these loci.

Results: We developed a k-mer based method to quantify centromere copy number and sequence variation from whole genome sequencing data. We applied this method to diverse inbred and wild house mouse (Mus musculus) genomes to profile diversity across the core centromere (minor) satellite and the pericentromeric (major) satellite repeat. We show that minor satellite copy number varies more than 10-fold among inbred mouse strains, whereas major satellite copy numbers span a 3-fold range. In contrast to widely held assumptions about the homogeneity of mouse centromere repeats, we uncover marked satellite sequence heterogeneity within single genomes, with diversity levels across the minor satellite exceeding those at the major satellite. Analyses in wild-caught mice implicate subspecies and population origin as significant determinants of variation in satellite copy number and satellite heterogeneity. Intriguingly, we also find that wild-caught mice harbor dramatically reduced minor satellite copy number and elevated satellite sequence heterogeneity compared to inbred strains, suggesting that inbreeding may reshape centromere architecture in pronounced ways.

Conclusion: Taken together, our results highlight the power of k-mer based approaches for probing variation across repetitive regions, provide an initial portrait of centromere variation across Mus musculus, and lay the groundwork for future functional studies on the consequences of natural genetic variation at these essential chromatin domains.
Centromeres are satellite-rich chromatin domains that are essential for chromosome segregation. Centromere satellites evolve rapidly between species, but little is known about population-level diversity across these loci. We developed a k-mer based method to quantify centromere copy number and sequence variation from whole genome sequencing data. We applied this method to diverse inbred and wild house mouse (genus Mus) genomes and uncovered pronounced variation in centromere architecture between strains and populations. We show that patterns of centromere diversity do not mirror the known ancestry of inbred strains, revealing a remarkably rapid rate of centromere sequence evolution. We document increased satellite homogeneity and copy number in inbred compared to wild mice, suggesting that inbreeding remodels mouse centromere architecture. Our results highlight the power of k-mer based approaches for probing variation across repetitive regions and provide the first in-depth, phylogenetic portrait of centromere variation across Mus musculus.
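The abstract does not spell out how k-mer counts are turned into copy number estimates, so the following is only a minimal sketch of one plausible approach, not the authors' pipeline. It assumes that copy number can be approximated as the mean read count of k-mers drawn from a satellite consensus, divided by the count expected for a single-copy k-mer. The function names, the `kmer_depth` parameter, and the 12 bp toy consensus are all hypothetical and chosen for illustration; the toy sequence is not the real mouse minor or major satellite.

```python
from collections import Counter

# Complement table; any unexpected base is mapped to N.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    kmer = kmer.upper()
    return min(kmer, reverse_complement(kmer))


def kmers(seq, k):
    """Yield every canonical k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield canonical(seq[i:i + k])


def satellite_kmer_set(consensus, k):
    """k-mers of a tandem repeat unit, including those spanning the unit junction."""
    circular = consensus + consensus[:k - 1]
    return set(kmers(circular, k))


def estimate_copy_number(reads, consensus, k, kmer_depth):
    """Estimate repeat copy number from reads (illustrative assumption only).

    kmer_depth is the count expected for a k-mer present once in the genome;
    in practice it would be estimated from single-copy regions.
    """
    satellite = satellite_kmer_set(consensus, k)
    if not satellite or kmer_depth <= 0:
        return 0.0
    counts = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            if kmer in satellite:
                counts[kmer] += 1
    mean_count = sum(counts.values()) / len(satellite)
    return mean_count / kmer_depth


if __name__ == "__main__":
    toy_consensus = "GGACCTGGAATA"  # hypothetical repeat unit, not a real satellite
    toy_reads = [toy_consensus * 10]  # one "read" carrying ten tandem copies
    print(estimate_copy_number(toy_reads, toy_consensus, k=4, kmer_depth=1.0))
```

In this sketch the estimate falls slightly below the true number of copies because windows that would run off the end of a read are never counted. Sequence heterogeneity, the other quantity the abstract describes, would need a separate measure, for example the frequency of k-mers that differ from the consensus by a single base.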