DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally employed long (400–800 bp) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intra-species genetic variation. We report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterise four million SNPs and four hundred thousand structural variants, many of which are previously unknown. Our approach is effective for accurate, rapid and economical whole genome re-sequencing and many other biomedical applications.
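The ">30x average depth of paired 35-base reads" figure can be related to read counts with the standard coverage relation, depth = (number of reads × read length) / genome size. The sketch below is only illustrative: the genome size of 3.1 billion bases is an assumed round figure that does not come from the abstract, and the function names are hypothetical.

```python
# Average sequencing depth: depth = reads * read_length / genome_size.
# ASSUMPTION: a haploid human genome of ~3.1e9 bases (not stated in the abstract).
ASSUMED_GENOME_SIZE = 3_100_000_000


def average_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean number of times each base is covered, assuming uniform sampling."""
    return n_reads * read_length / genome_size


def reads_needed(depth: int, read_length: int, genome_size: int) -> int:
    """Smallest number of reads giving at least `depth` average coverage."""
    total_bases = depth * genome_size
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bases // read_length)


if __name__ == "__main__":
    n = reads_needed(30, 35, ASSUMED_GENOME_SIZE)
    print(f"{n:,} reads of 35 bases for 30x depth")
    print(f"check: {average_depth(n, 35, ASSUMED_GENOME_SIZE):.2f}x")
```

Under these assumptions the answer is on the order of a few billion 35-base reads; real projects need more, since some reads fail quality filters or do not map uniquely.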
We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization. We report here the completion of the fully annotated genome sequence of the simple eukaryote Schizosaccharomyces pombe, a fission yeast. It becomes the sixth eukaryotic genome to be sequenced, following Saccharomyces cerevisiae 1, Caenorhabditis elegans 2, Drosophila melanogaster 3, Arabidopsis thaliana 4 and Homo sapiens 5,6. The entire sequence of the unique regions of the three chromosomes is complete, with gaps in the centromeric regions of about 40 kb, and about 260 kb in the telomeric regions. The completion of this sequence, the availability of sophisticated research methodologies, and the expanding community working on S. pombe, will accelerate the use of S. pombe for functional and comparative studies of eukaryotic cell processes.
Leprosy, a chronic human neurological disease, results from infection with the obligate intracellular pathogen Mycobacterium leprae, a close relative of the tubercle bacillus. Mycobacterium leprae has the longest doubling time of all known bacteria and has thwarted every effort at culture in the laboratory. Comparing the 3.27-megabase (Mb) genome sequence of an armadillo-derived Indian isolate of the leprosy bacillus with that of Mycobacterium tuberculosis (4.41 Mb) provides clear explanations for these properties and reveals an extreme case of reductive evolution. Less than half of the genome contains functional genes but pseudogenes, with intact counterparts in M. tuberculosis, abound. Genome downsizing and the current mosaic arrangement appear to have resulted from extensive recombination events between dispersed repetitive sequences. Gene deletion and decay have eliminated many important metabolic activities including siderophore production, part of the oxidative and most of the microaerophilic and anaerobic respiratory chains, and numerous catabolic systems and their regulatory circuits.
Salmonella enterica serovar Typhi (S. typhi) is the aetiological agent of typhoid fever, a serious invasive bacterial disease of humans with an annual global burden of approximately 16 million cases, leading to 600,000 fatalities. Many S. enterica serovars actively invade the mucosal surface of the intestine but are normally contained in healthy individuals by the local immune defence mechanisms. However, S. typhi has evolved the ability to spread to the deeper tissues of humans, including liver, spleen and bone marrow. Here we have sequenced the 4,809,037-base pair (bp) genome of a S. typhi (CT18) that is resistant to multiple drugs, revealing the presence of hundreds of insertions and deletions compared with the Escherichia coli genome, ranging in size from single genes to large islands. Notably, the genome sequence identifies over two hundred pseudogenes, several corresponding to genes that are known to contribute to virulence in Salmonella typhimurium. This genetic degradation may contribute to the human-restricted host range for S. typhi. CT18 harbours a 218,150-bp multiple-drug-resistance incH1 plasmid (pHCM1), and a 106,516-bp cryptic plasmid (pHCM2), which shows recent common ancestry with a virulence plasmid of Yersinia pestis.
We determined the complete genome sequence of Clostridium difficile strain 630, a virulent and multidrug-resistant strain. Our analysis indicates that a large proportion (11%) of the genome consists of mobile genetic elements, mainly in the form of conjugative transposons. These mobile elements are putatively responsible for the acquisition by C. difficile of an extensive array of genes involved in antimicrobial resistance, virulence, host interaction and the production of surface structures. The metabolic capabilities encoded in the genome show multiple adaptations for survival and growth within the gut environment. The extreme genome variability was confirmed by whole-genome microarray analysis; it may reflect the organism's niche in the gut and should provide information on the evolution of virulence in this organism.
Burkholderia pseudomallei is a recognized biothreat agent and the causative agent of melioidosis. This Gram-negative bacterium exists as a soil saprophyte in melioidosis-endemic areas of the world and accounts for 20% of community-acquired septicaemias in northeastern Thailand where half of those affected die. Here we report the complete genome of B. pseudomallei, which is composed of two chromosomes of 4.07 megabase pairs and 3.17 megabase pairs, showing significant functional partitioning of genes between them. The large chromosome encodes many of the core functions associated with central metabolism and cell growth, whereas the small chromosome carries more accessory functions associated with adaptation and survival in different niches. Genomic comparisons with closely and more distantly related bacteria revealed a greater level of gene order conservation and a greater number of orthologous genes on the large chromosome, suggesting that the two replicons have distinct evolutionary origins. A striking feature of the genome was the presence of 16 genomic islands (GIs) that together made up 6.1% of the genome. Further analysis revealed these islands to be variably present in a collection of invasive and soil isolates but entirely absent from the clonally related organism B. mallei. We propose that variable horizontal gene acquisition by B. pseudomallei is an important feature of recent genetic evolution and that this has resulted in a genetically diverse pathogenic species.
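The genome-size figures above can be combined in a short back-of-the-envelope calculation. The sketch below uses only the rounded values quoted in the abstract (4.07 Mb, 3.17 Mb, 16 islands, 6.1%), so the results are approximate; the variable and function names are illustrative rather than taken from the paper.

```python
# Rounded figures quoted in the abstract.
LARGE_CHROMOSOME_MB = 4.07
SMALL_CHROMOSOME_MB = 3.17
N_GENOMIC_ISLANDS = 16
ISLAND_FRACTION = 0.061  # 6.1% of the genome


def island_summary() -> dict:
    """Approximate total and mean genomic-island size from the rounded figures."""
    genome_mb = LARGE_CHROMOSOME_MB + SMALL_CHROMOSOME_MB
    islands_mb = genome_mb * ISLAND_FRACTION
    return {
        "genome_mb": genome_mb,
        "islands_mb": islands_mb,
        # Mean only; the abstract does not give individual island sizes.
        "mean_island_kb": islands_mb * 1000 / N_GENOMIC_ISLANDS,
    }


if __name__ == "__main__":
    for name, value in island_summary().items():
        print(f"{name}: {value:.2f}")
```

On these figures the two chromosomes total roughly 7.24 Mb, and the islands together span a little over 0.4 Mb.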