The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordates and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains ~16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.
Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats 1. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published 2,3, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and is an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination 4-6 and developmental genetics 7. In the assembled medaka genome (700 megabases), which is less than half the size of the zebrafish genome, we predicted 20,141 genes, including 2,900 new genes, using 5′-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of 50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.
We applied the whole-genome shotgun approach to an inbred strain, HdrR, derived from the southern Japanese population, as the main target. A total of 13.8 million reads amounting to approximately 10.6-fold genome coverage were obtained from the shotgun plasmid, fosmid and bacterial artificial chromosome (BAC) libraries.
A newly developed RAMEN assembler was used to process the shotgun reads to generate contigs and scaffolds. The N50 values (50% of nucleotides in an assembly are in scaffolds, or contigs, longer than or equal to the N50 value) are ~1.41 megabases (Mb) for scaffolds and ~9.8 kilobases (kb) for contigs. The total length of the contigs reached 700.4 Mb, which, from now on, we refer to as the medaka genome size.
To construct ultracontigs, the scaffolds were integrated with the medaka genetic map by using SNP markers. For this purpose, we further obtained about 2.8-fold coverage of shotgun reads from another inbred strain HNI (refs 9, 10), which is derived from the northern Japanese population. The reads were assembled by RAMEN to scaffolds covering 648 Mb. Aligning the HNI contigs with the HdrR genome using BLASTZ 11, we identified 16.4 million SNPs as well as 1.40 million insertions and 1.45 million deletions in non-repetitive regions (Supplementary Table 2). We selected 2,401 SNPs and genetically mapped them onto medaka chromosomes using a backcross panel between the...
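The N50 definition given in the excerpt above can be illustrated with a minimal Python sketch. The function name `n50` and the toy scaffold lengths are assumptions made for illustration only; they are not taken from the RAMEN assembler or from the medaka data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least 50% of all nucleotides in the assembly.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating nucleotides
    # until half of the assembly has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not real medaka values.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
```

In this toy case the total is 300, and the two longest scaffolds (80 + 70 = 150) already hold half of it, so the N50 is 70.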
Medaka (Oryzias latipes) is a small egg-laying freshwater teleost native to East Asia that has become an excellent model system for developmental genetics and evolutionary biology. The draft medaka genome sequence (700 Mb) was reported in June 2007, and its substantial genomic resources have been opened to the public through the University of Tokyo Genome Browser Medaka (UTGB/medaka) database. This database provides basic genomic information, such as predicted genes, expressed sequence tags (ESTs), guanine/cytosine (GC) content, repeats and comparative genomics, as well as unique data resources including (i) 2473 genetic markers and experimentally confirmed PCR primers that amplify these markers, (ii) 142 414 bacterial artificial chromosome (BAC) and 217 344 fosmid end sequences that amount to 15.0- and 11.1-fold clone coverage of the entire genome, respectively, and were used for draft genome assembly, (iii) 16 519 460 single nucleotide polymorphisms (SNPs), and 2 859 905 insertions/deletions detected between two medaka inbred strain genomes and (iv) 841 235 5′-end serial analyses of gene-expression (SAGE) tags that identified 344 266 transcription start sites on the genome. UTGB/medaka is available at: http://medaka.utgenome.org/