The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordates and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains ~16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.
Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.
evolutionary genomics | fermentation
Lactic acid bacteria (LAB) are historically defined as a group of microaerophilic, Gram-positive organisms that ferment hexose sugars to produce primarily lactic acid. This functional classification includes a variety of industrially important genera, including Lactococcus, Enterococcus, Oenococcus, Pediococcus, Streptococcus, Leuconostoc, and Lactobacillus species. The seemingly simplistic metabolism of LAB has been exploited throughout history for the preservation of foods and beverages in nearly all societies dating back to the origins of agriculture (1). Domestication of LAB strains passed down through various culinary traditions and continuous passage on foodstuffs have resulted in modern-day cultures able to carry out these fermentations. Today, LAB play a prominent role in the world food supply, performing the main bioconversions in fermented dairy products, meats, and vegetables. LAB also are critical for the production of wine, coffee, silage, cocoa, sourdough, and numerous indigenous food fermentations (2).
LAB species are indigenous to food-related habitats, including plant (fruits, vegetables, and cereal grains) and milk environments. In addition, LAB are naturally associated with the mucosal surfaces of animals, e.g., small intestine, colon, and vagina. Isolates of the same species often are obtained from plant, dairy, and animal habitats, implying wide distribution and specialized adaptation to these diverse environments. LAB species employ two pathways to metabolize hexose: a homofermentative pathway in which lactic acid is the primary product and a heterofermentative pathway in which lactic acid, CO2, acetic acid, and/or ethanol are produced (3).
Complete genome sequences have been published for eight fermentative and commensal LAB species: Lactococcus lactis, Lactobacillus plantarum, Lactobacillus johnsonii, Lactobacillus acidophilus, Lactobacillus sakei, Lactobacillus bulgaricus, Lactobacillus salivarius, and Streptococcus thermophilus (4-11). This study examines nine other LAB genomes representing the phylogenetic and functional diversity of lactic acid-producing microorganisms. The LAB have small genomes encoding a range of biosynthe...
The compact genome of Fugu rubripes has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds. In this 365-megabase vertebrate genome, repetitive DNA accounts for less than one-sixth of the sequence, and gene loci occupy about one-third of the genome. As with the human genome, gene loci are not evenly distributed, but are clustered into sparse and dense regions. Some "giant" genes were observed that had average coding sequence sizes but were spread over genomic lengths significantly larger than those of their human orthologs. Although three-quarters of predicted human proteins have a strong match to Fugu, approximately a quarter of the human proteins had highly diverged from or had no pufferfish homologs, highlighting the extent of protein evolution in the 450 million years since teleosts and mammals diverged. Conserved linkages between Fugu and human genes indicate the preservation of chromosomal segments from the common vertebrate ancestor, but with considerable scrambling of gene order.
Tuberous sclerosis complex (TSC) is an autosomal dominant disorder characterized by the widespread development of distinctive tumors termed hamartomas. TSC-determining loci have been mapped to chromosomes 9q34 (TSC1) and 16p13 (TSC2). The TSC1 gene was identified from a 900-kilobase region containing at least 30 genes. The 8.6-kilobase TSC1 transcript is widely expressed and encodes a protein of 130 kilodaltons (hamartin) that has homology to a putative yeast protein of unknown function. Thirty-two distinct mutations were identified in TSC1, 30 of which were truncating, and a single mutation (2105delAAAG) was seen in six apparently unrelated patients. In one of these six, a somatic mutation in the wild-type allele was found in a TSC-associated renal carcinoma, which suggests that hamartin acts as a tumor suppressor.
As part of our effort to sequence the 100-megabase (Mb) genome of the nematode Caenorhabditis elegans, we have completed the nucleotide sequence of a contiguous 2,181,032 base pairs in the central gene cluster of chromosome III. Analysis of the finished sequence has indicated an average density of about one gene per five kilobases; comparison with the public sequence databases reveals similarities to previously known genes for about one gene in three. In addition, the genomic sequence contains several intriguing features, including putative gene duplications and a variety of other repeats with potential evolutionary implications.
The availability of dense genetic linkage maps of mammalian genomes makes feasible a wide range of studies, including positional cloning of monogenic traits, genetic dissection of polygenic traits, construction of genome-wide physical maps, rapid marker-assisted construction of congenic strains, and evolutionary comparisons. We have been engaged for the past five years in a concerted effort to produce a dense genetic map of the laboratory mouse. Here we present the final report of this project. The map contains 7,377 genetic markers, consisting of 6,580 highly informative simple sequence length polymorphisms integrated with 797 restriction fragment length polymorphisms in mouse genes. The average spacing between markers is about 0.2 centimorgans or 400 kilobases.
A physical map has been constructed of the human genome containing 15,086 sequence-tagged sites (STSs), with an average spacing of 199 kilobases. The project involved assembly of a radiation hybrid map of the human genome containing 6193 loci and incorporated a genetic linkage map of the human genome containing 5264 loci. This information was combined with the results of STS-content screening of 10,850 loci against a yeast artificial chromosome library to produce an integrated map, anchored by the radiation hybrid and genetic maps. The map provides radiation hybrid coverage of 99 percent and physical coverage of 94 percent of the human genome. The map also represents an early step in an international project to generate a transcript map of the human genome, with more than 3235 expressed sequences localized. The STSs in the map provide a scaffold for initiating large-scale sequencing of the human genome.
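The marker count and average spacing quoted in the abstract above can be checked against each other with a line of arithmetic. The sketch below is only a back-of-the-envelope illustration: the function name `implied_span_kb` is hypothetical, and it assumes evenly spaced markers, which real maps do not have.

```python
def implied_span_kb(n_markers: int, avg_spacing_kb: float) -> float:
    """Approximate total length covered by n markers at a given average spacing.

    Assumes even spacing (an idealization) and ignores chromosome ends.
    """
    return n_markers * avg_spacing_kb


# Figures taken from the abstract: 15,086 STSs, 199 kb average spacing.
span_kb = implied_span_kb(15_086, 199)

# Convert kilobases to gigabases (1 Gb = 1,000,000 kb) for readability.
print(f"Implied genome span: {span_kb / 1e6:.2f} Gb")
```

Under that idealization the two figures imply a span of roughly 3 Gb, so the stated spacing is consistent with the stated number of sites.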