Programmed DNA elimination is a developmentally regulated process leading to the reproducible loss of specific genomic sequences. DNA elimination occurs in unicellular ciliates and a variety of metazoans, including invertebrates and vertebrates. In metazoa, DNA elimination typically occurs in somatic cells during early development, leaving the germline genome intact. Reference genomes for metazoa that undergo DNA elimination are not available. Here, we generated germline and somatic reference genome sequences of a DNA-eliminating pig parasitic nematode and a horse parasite. In addition, we carried out in-depth analyses of DNA elimination in a parasitic nematode of humans and a parasitic nematode of dogs. Our analysis of nematode DNA elimination reveals that in all species, repetitive sequences (that differ among the genera) and germline-expressed genes (approximately 1000-2000, or 5%-10% of the genes) are eliminated. Thirty-five percent of these eliminated genes are conserved among these nematodes, defining a core set of eliminated genes that are preferentially expressed during spermatogenesis. Our analysis supports the view that DNA elimination in nematodes silences germline-expressed genes. Over half of the chromosome break sites are conserved between two of the species, whereas only 10% are conserved in a more divergent one. Analysis of the chromosomal breakage regions suggests a sequence-independent mechanism for DNA breakage followed by telomere healing, with the formation of more accessible chromatin in the break regions prior to DNA elimination. Our genome assemblies and annotations also provide comprehensive resources for analysis of DNA elimination, parasitology research, and comparative nematode genome and epigenome studies.
Despite tremendous progress in genome sequencing, the basic goal of producing phased (haplotype-resolved) genome sequence with end-to-end contiguity for each chromosome at reasonable cost and effort is still unrealized. In this study, we describe a new approach to perform de novo genome assembly and experimental phasing by integrating the data from Illumina short-read sequencing, 10X Genomics Linked-Read sequencing, and BioNano Genomics genome mapping to yield a high-quality, phased, de novo assembled human genome.
Correspondence should be addressed to P.Y.K. (Pui.Kwok@ucsf.edu).
Accession codes. Sequencing and assembly data are available under BioProject PRJNA315896 with Sequence Read Archive accession numbers SRX1675529, SRX1675530 and SRX1675531.
Author Contributions. P.Y.K., J.D.W., and Y.M. conceived the project and provided resources and oversight for sequencing and algorithmic analysis. K.G. prepared long libraries for 10XG GemCode sequencing. C.C. and C.L. performed long DNA preparation and BNG genome mapping experiments. E.T.L., A.R.H., Ž. DŽ., J. Lee, and H.C. built initial genome maps and performed BNG alignment and SV calling. Y.M. and J. Lam performed scaffold analysis. E.T.L., A.R.H., and J. Lee performed hybrid genome assembly. P.M., K.G., and M.S.L. performed scaffold phasing. Y.M., M.L.S., E.T.L., J. Lam, J. Lee, and S.A.S. performed validation and quality measure analyses of the assembled data. Y.M., E.T.L., M.L.S., and P.Y.K. primarily wrote the manuscript and revisions, though many coauthors provided edits and methods sections.
Competing Financial Interests Statement. E.T.L., A.R.H., J. Lee, Ž. DŽ., and H.C. are employees of BioNano Genomics. P.M., K.G., and M.S.L. are employees of 10X Genomics, and P.Y.K. is on the scientific advisory board of BioNano Genomics.
The completion of the human genome reference assembly in 2003 marked a major milestone in genome research. The reference human genome sequence (and the genome sequences of numerous other organisms) and the sequencing technologies developed for the Human Genome Project revolutionized biological research and hastened the discovery of causal mutations for many diseases 1,2. Despite tremendous progress, the basic goal of producing phased (haplotype-resolved) genome sequence with end-to-end contiguity for each chromosome at reasonable cost and effort is still unrealized. Consequently, researchers who engage in human "whole-genome sequencing" have produced tens of thousands of genomes that are collections of short-read sequences aligned to the composite reference human genome sequence produced from several donors of various ethnic backgrounds. Similarly, de novo assemblies of other species generally consist of a set ...
The search to understand how genomes innovate in response to selection dominates the field of evolutionary biology. Powerful molecular evolution approaches have been developed to test individual loci for signatures of selection. In many cases, however, an organism's response to changes in selective pressure may be mediated by multiple genes, whose products function together in a cellular process or pathway. Here we assess the prevalence of polygenic evolution in pathways in the yeasts Saccharomyces cerevisiae and S. bayanus. We first established short-read sequencing methods to detect cis-regulatory variation in a diploid hybrid between the species. We then tested for the scenario in which selective pressure in one species to increase or decrease the activity of a pathway has driven the accumulation of cis-regulatory variants that act in the same direction on gene expression. Application of this test revealed a variety of yeast pathways with evidence for directional regulatory evolution. In parallel, we also used population genomic sequencing data to compare protein and cis-regulatory variation within and between species. We identified pathways with evidence for divergence within S. cerevisiae, and we detected signatures of positive selection between S. cerevisiae and S. bayanus. Our results point to polygenic, pathway-level change as a common evolutionary mechanism among yeasts. We suggest that pathway analyses, including our test for directional regulatory evolution, will prove to be a relevant and powerful strategy in many evolutionary genomic applications.
A main challenge of evolutionary biology is to understand the influence of selection on genetic variation within and between species. Classically, molecular evolution methods have targeted individual genes or loci for tests of selection. Their successes have uncovered both protein-coding and regulatory variants with signatures of non-neutral evolution (1-3).
However, decades of quantitative genetic mapping studies indicate that in most cases, variation in phenotype between individuals is the result of multiple sequence changes at unlinked loci (4). The genetic response to changes in selective pressure is likely to follow the same pattern, but, to date, methods for identifying cases of polygenic evolution have been at a premium. Some of the strongest evidence for polygenic adaptation has emerged from the study of protein-coding variants between phylogenetic lineages. Signatures of positive selection can be detected in the protein-coding sequences of groups of genes of related function (5-8), suggestive of a coherent series of genetic changes accumulated by a population in response to selection. Additionally, genetic variation in gene expression represents a rich data source for signatures of selection, both positive (9, 10) and purifying (11,12), although the mechanisms that govern regulatory evolution are not fully understood. Regulatory variants can act in cis, to impact the expression of a neighboring gene, or in trans, targeting the expression of genes el...
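The abstract above describes testing whether the cis-regulatory variants in a pathway act in the same direction on gene expression. The following is a minimal sketch of that idea, not the authors' published procedure: it swaps in a plain two-sided binomial sign test. The function name `directional_sign_test`, the input format (one log2 allele-specific expression ratio per gene, measured in the hybrid), and the example values are all hypothetical.

```python
from math import comb


def directional_sign_test(log2_allelic_ratios):
    """Two-sided binomial sign test on the direction of cis effects in one pathway.

    Assumption: each value is a hypothetical log2 ratio of allele-specific
    expression (species A allele / species B allele) for one gene.
    Returns (genes_up, genes_down, p_value).
    """
    # Genes with no allelic difference carry no directional information.
    informative = [r for r in log2_allelic_ratios if r != 0]
    n = len(informative)
    if n == 0:
        return 0, 0, 1.0
    up = sum(1 for r in informative if r > 0)
    down = n - up
    # Under the null, each gene is equally likely to favour either allele,
    # so the number of "up" genes follows Binomial(n, 0.5).
    extreme = max(up, down)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return up, down, min(1.0, 2 * tail)


# Made-up ratios for a ten-gene pathway, for illustration only.
pathway_ratios = [0.8, 1.1, 0.4, 0.9, 0.3, 1.5, 0.6, 0.7, 0.2, -0.5]
up, down, p = directional_sign_test(pathway_ratios)
print(f"{up} up, {down} down, p = {p:.4f}")
```

A sign test ignores effect sizes and treats genes as independent, so it is best read as a first-pass screen; in a genome-wide scan the per-pathway p-values would also need multiple-testing correction.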
Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.
Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.
Low copy repeats (LCRs) are recognized as a significant source of genomic instability, driving genome variability and evolution. The Chromosome 22 LCRs (LCR22s) mediate nonallelic homologous recombination (NAHR) leading to the 22q11 deletion syndrome (22q11DS). However, LCR22s are among the most complex regions in the genome, and their structure remains unresolved. The difficulty in generating accurate maps of LCR22s has also hindered localization of the deletion end points in 22q11DS patients. Using fiber FISH and Bionano optical mapping, we assembled LCR22 alleles in 187 cell lines. Our analysis uncovered an unprecedented level of variation in LCR22s, including LCR22A alleles ranging in size from 250 to 2000 kb. Further, the incidence of various LCR22 alleles varied within different populations. Additionally, the analysis of LCR22s in 22q11DS patients and their parents enabled further refinement of the rearrangement site within LCR22A and -D, which flank the 22q11 deletion. The NAHR site was localized to a 160-kb paralog shared between the LCR22A and -D in seven 22q11DS patients. Thus, we present the most comprehensive map of LCR22 variation to date. This will greatly facilitate the investigation of the role of LCR variation as a driver of 22q11 rearrangements and the phenotypic variability among 22q11DS patients.
One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.
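The abstract above models per-gene evolutionary rates within a pathway as draws from an inverse gamma distribution and then infers that distribution's parameters. As a rough illustration of what such inference involves, the sketch below simulates rates and recovers the shape and scale by the method of moments. This is a stand-in chosen for brevity, not the fitting strategy the authors developed, and the function names and parameter values are hypothetical.

```python
import random
from statistics import fmean, variance


def sample_inverse_gamma(shape, scale, n, rng):
    """Draw n inverse-gamma variates: if G ~ Gamma(shape, scale=1/scale),
    then 1/G ~ InverseGamma(shape, scale)."""
    return [1.0 / rng.gammavariate(shape, 1.0 / scale) for _ in range(n)]


def fit_inverse_gamma_moments(rates):
    """Method-of-moments estimates of (shape, scale).

    Uses mean = scale / (shape - 1) and var = mean**2 / (shape - 2),
    which hold only when shape > 2.
    """
    m = fmean(rates)
    v = variance(rates)
    shape = m * m / v + 2.0
    scale = m * (shape - 1.0)
    return shape, scale


# Hypothetical "true" parameters for the rates of one pathway.
rng = random.Random(0)
rates = sample_inverse_gamma(shape=8.0, scale=14.0, n=50_000, rng=rng)
shape_hat, scale_hat = fit_inverse_gamma_moments(rates)
print(f"estimated shape = {shape_hat:.2f}, scale = {scale_hat:.2f}")
```

Moment estimators are unstable for heavy-tailed inverse gamma distributions (small shape) and for the handful of genes in a real pathway, which is one reason a likelihood-based fit would normally be preferred.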
Monitor lizards are unique among ectothermic reptiles in that they have high aerobic capacity and distinctive cardiovascular physiology resembling that of endothermic mammals. Here, we sequence the genome of the Komodo dragon Varanus komodoensis, the largest extant monitor lizard, and generate a high-resolution de novo chromosome-assigned genome assembly for V. komodoensis using a hybrid approach of long-range sequencing and single-molecule optical mapping. Comparing the genome of V. komodoensis with those of related species, we find evidence of positive selection in pathways related to energy metabolism, cardiovascular homoeostasis, and haemostasis. We also show species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and other lizard lineages. Together, these evolutionary signatures of adaptation reveal the genetic underpinnings of the unique Komodo dragon sensory and cardiovascular systems, and suggest that selective pressure altered haemostasis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. The Komodo dragon genome is an important resource for understanding the biology of monitor lizards and reptiles worldwide.