The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Autoimmune polyglandular syndrome type I (APS 1, also called APECED) is an autosomal-recessive disorder that maps to human chromosome 21q22.3 between markers D21S49 and D21S171 by linkage studies. We have isolated a novel gene from this region, AIRE (autoimmune regulator), which encodes a protein containing motifs suggestive of a transcription factor including two zinc-finger (PHD-finger) motifs, a proline-rich region and three LXXLL motifs. Two mutations, a C→T substitution that changes the Arg 257 (CGA) to a stop codon (TGA) and an A→G substitution that changes the Lys 83 (AAG) to a Glu codon (GAG), were found in this novel gene in Swiss and Finnish APECED patients. The Arg257stop (R257X) is the predominant mutation in Finnish APECED patients, accounting for 10/12 alleles studied. These results indicate that this gene is responsible for the pathogenesis of APECED. The identification of the gene defective in APECED should facilitate the genetic diagnosis and potential treatment of the disease and further enhance our general understanding of the mechanisms underlying autoimmune diseases.
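The two codon changes reported above can be checked with a short sketch. This is only an illustration, not the authors' analysis: the codon table is deliberately partial (just the four codons named in the abstract), and the helper names `substitute_first_base` and `describe_mutation` are hypothetical. It assumes, as the quoted codons imply, that each substitution hits the first base of its codon.

```python
# Partial codon table: only the four codons named in the abstract.
CODON_TO_AA = {
    "CGA": "Arg",
    "TGA": "Stop",
    "AAG": "Lys",
    "GAG": "Glu",
}


def substitute_first_base(codon: str, new_base: str) -> str:
    """Return the codon with its first base replaced (hypothetical helper)."""
    return new_base + codon[1:]


def describe_mutation(codon: str, new_base: str, position: int) -> str:
    """Summarise the amino-acid consequence of a first-base substitution."""
    mutant = substitute_first_base(codon, new_base)
    return (
        f"{CODON_TO_AA[codon]}{position} ({codon}) -> "
        f"{CODON_TO_AA[mutant]} ({mutant})"
    )


if __name__ == "__main__":
    print(describe_mutation("CGA", "T", 257))  # the C->T nonsense mutation
    print(describe_mutation("AAG", "G", 83))   # the A->G missense mutation
    # Share of Finnish alleles carrying R257X, from the 10/12 figure above.
    print(f"{10 / 12:.0%}")
```

The first call reproduces the nonsense change (Arg to stop) and the second the missense change (Lys to Glu).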
To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximates patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences.
Knowledge of the complete genomic DNA sequence of an organism allows a systematic approach to defining its genetic components. The genomic sequence provides access to the complete structures of all genes, including those without known function, their control elements, and, by inference, the proteins they encode, as well as all other biologically important sequences. Furthermore, the sequence is a rich and permanent source of information for the design of further biological studies of the organism and for the study of evolution through cross-species sequence comparison. The power of this approach has been amply demonstrated by the determination of the sequences of a number of microbial and model organisms. The next step is to obtain the complete sequence of the entire human genome. Here we report the sequence of the euchromatic part of human chromosome 22. The sequence obtained consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.
Chromosome 21 is the smallest human autosome. An extra copy of chromosome 21 causes Down syndrome, the most frequent genetic cause of significant mental retardation, which affects up to 1 in 700 live births. Several anonymous loci for monogenic disorders and predispositions for common complex disorders have also been mapped to this chromosome, and loss of heterozygosity has been observed in regions associated with solid tumours. Here we report the sequence and gene catalogue of the long arm of chromosome 21. We have sequenced 33,546,361 base pairs (bp) of DNA with very high accuracy, the largest contig being 25,491,867 bp. Only three small clone gaps and seven sequencing gaps remain, comprising about 100 kilobases. Thus, we achieved 99.7% coverage of 21q. We also sequenced 281,116 bp from the short arm. The structural features identified include duplications that are probably involved in chromosomal abnormalities and repeat structures in the telomeric and pericentromeric regions. Analysis of the chromosome revealed 127 known genes, 98 predicted genes and 59 pseudogenes.
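The 99.7% coverage figure follows from the numbers quoted above. A minimal sketch of the arithmetic, assuming that the length of 21q is the sequenced length plus the remaining gaps (the abstract does not state the denominator) and taking "about 100 kilobases" as exactly 100,000 bp; the function name `coverage_fraction` is hypothetical:

```python
SEQUENCED_21Q_BP = 33_546_361  # sequenced long-arm DNA, from the abstract
GAP_BP = 100_000               # "about 100 kilobases" of gaps (approximation)


def coverage_fraction(sequenced_bp: int, gap_bp: int) -> float:
    """Fraction covered, assuming total length = sequenced + gaps."""
    return sequenced_bp / (sequenced_bp + gap_bp)


coverage = coverage_fraction(SEQUENCED_21Q_BP, GAP_BP)

if __name__ == "__main__":
    print(f"{coverage:.1%}")
```

Under these assumptions the result rounds to the 99.7% reported in the abstract.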
Gene duplication creates evolutionary novelties by using older tools in new ways. We have identified evidence that the genes for enamel matrix proteins (EMPs), milk caseins, and salivary proteins comprise a family descended from a common ancestor by tandem gene duplication. These genes remain linked, except for one EMP gene, amelogenin. These genes show common structural features and are expressed in ontogenetically similar tissues. Many of these genes encode secretory Ca-binding phosphoproteins, which regulate the Ca-phosphate concentration of the extracellular environment. By exploiting this fundamental property, these genes have subsequently diversified to serve specialized adaptive functions. Casein makes milk supersaturated with Ca-phosphate, which was critical to the successive mammalian divergence. The innovation of enamel led to mineralized feeding apparatus, which enabled active predation of early vertebrates. The EMP genes comprise a subfamily not identified previously. A set of genes for dentine and bone extracellular matrix proteins constitutes an additional cluster distal to the EMP gene cluster, with similar structural features to EMP genes. The duplication and diversification of the primordial genes for enamel/dentine/bone extracellular matrix may have been important in core vertebrate feeding adaptations, the mineralized skeleton, the evolution of saliva, and, eventually, lactation. The order of duplication events may help delineate early events in mineralized skeletal formation, which is a major characteristic of vertebrates.
Approximately 50% of childhood deafness is caused by mutations in specific genes. Autosomal recessive loci account for approximately 80% of nonsyndromic genetic deafness [1]. Here we report the identification of a new transmembrane serine protease (TMPRSS3; also known as ECHOS1) expressed in many tissues, including fetal cochlea, which is mutated in the families used to describe both the DFNB10 and DFNB8 loci. An 8-bp deletion and insertion of 18 monomeric (∼68-bp) β-satellite repeat units, normally present in tandem arrays of up to several hundred kilobases on the short arms of acrocentric chromosomes, causes congenital deafness (DFNB10). A mutation in a splice-acceptor site, resulting in a 4-bp insertion in the mRNA and a frameshift, was detected in childhood-onset deafness (DFNB8). This is the first description of β-satellite insertion into an active gene resulting in a pathogenic state, and the first description of a protease involved in hearing loss.