Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin, and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics, and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analyzed the largest cohort and set of distinct, clinically relevant body habitats to date. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families, and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology, and translational applications of the human microbiome.
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
Whole-genome duplication (WGD), or polyploidy, followed by gene loss and diploidization has long been recognized as an important evolutionary force in animals, fungi and other organisms, especially plants. The success of angiosperms has been attributed, in part, to innovations associated with gene or whole-genome duplications, but evidence for proposed ancient genome duplications pre-dating the divergence of monocots and eudicots remains equivocal in analyses of conserved gene order. Here we use comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages to elucidate two groups of ancient gene duplications: one in the common ancestor of extant seed plants and the other in the common ancestor of extant angiosperms. Gene duplication events were intensely concentrated around 319 and 192 million years ago, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms, respectively. Significantly, these ancestral WGDs resulted in the diversification of regulatory genes important to seed and flower development, suggesting that they were involved in major innovations that ultimately contributed to the rise and eventual dominance of seed plants and angiosperms.
this diversity is located in discrete gene clusters that are spread throughout the different genomes. In contrast to this diversity, these enteric microorganisms exhibit marked synteny in their large-scale genomic organization, bearing in mind that E. coli and S. enterica diverged about 100 Myr ago [28]. The conserved genes may be a reflection of the basic lifestyle of the bacteria, requiring intestine colonization, environmental survival and transmission. The unique gene clusters probably contribute to adaptation to environmental niches and to pathogenicity. The pseudogene complement of S. typhi has implications for our understanding of the tight host restriction of this organism, and raises the question of whether it may be possible to eradicate S. typhi and typhoid fever altogether.
Methods
Salmonella typhi CT18 was isolated in December 1993, at the Mekong Delta region of Vietnam, from a 9-year-old girl who was suffering from typhoid. The strain was isolated from blood using routine culture methods [23], and after serological and metabolic confirmation of the strain as S. typhi it was immediately frozen in glycerol at −70 °C. The genome sequence was obtained from 97,000 end sequences (giving 7.9× coverage) derived from several pUC18 genomic shotgun libraries (with insert sizes ranging from 1.4 to 4.0 kb) using dye terminator chemistry on ABI377 automated sequencers. This was supplemented with 0.7× sequence coverage from M13mp18 libraries with similar insert sizes. End sequences from a larger insert plasmid (pSP64; 1.9× clone coverage, 10–14-kb insert size) and lambda (lambda-FIX-II; 0.4× clone coverage, 20–22-kb insert size) libraries were used as a scaffold, and the final assembly was verified by comparison with restriction-enzyme digest patterns using pulsed-field gel electrophoresis (data not shown). Total sequence coverage was 9.1×. The sequence was assembled, finished and annotated as described [29], using Artemis [30] to collate data and facilitate annotation.
In addition we used a gene finder that was trained specifically for S. typhi, which uses a hidden Markov model with modules for the coding region, start and stop codons, and the ribosome-binding site (T.S.L. and A.K., unpublished data). The genome and proteome sequences of S. typhi and S. typhimurium or E. coli were compared in parallel to identify deletions and insertions using the Artemis Comparison Tool (ACT) (K. Rutherford, unpublished data; see also http://www.sanger.ac.uk/Software/ACT/). Pseudogenes had one or more mutations that would ablate expression, and were identified by direct comparison with S. typhimurium; each of the inactivating mutations was subsequently checked against the original sequencing data.
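The coverage figures quoted in the Methods above relate read count, read length and genome size through the standard relation coverage = (number of reads × mean read length) / genome size. The sketch below illustrates that arithmetic only; it is not the authors' pipeline. The genome size of roughly 4.8 Mb and the 400-bp read length are assumptions for illustration and are not stated in this excerpt, and the function names are hypothetical. The uncovered-fraction estimate uses the Lander-Waterman idealization of uniformly random reads.

```python
import math


def sequence_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def implied_read_length(coverage: float, n_reads: int, genome_size: int) -> float:
    """Mean read length implied by a reported coverage and read count."""
    return coverage * genome_size / n_reads


def expected_uncovered_fraction(coverage: float) -> float:
    """Fraction of bases expected to receive no read under the
    Lander-Waterman model (reads placed uniformly at random)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # ASSUMPTION: genome size is not given in the excerpt; ~4.8 Mb is used
    # purely as an illustrative value.
    assumed_genome_size = 4_800_000

    # Figures taken from the Methods text: 97,000 end sequences, 7.9x coverage.
    n_reads = 97_000
    reported_coverage = 7.9

    read_len = implied_read_length(reported_coverage, n_reads, assumed_genome_size)
    print(f"Implied mean read length: {read_len:.0f} bp")

    # ASSUMPTION: a hypothetical 400-bp mean read length, for comparison.
    depth = sequence_coverage(n_reads, 400, assumed_genome_size)
    print(f"Coverage at 400 bp per read: {depth:.2f}x")

    gap = expected_uncovered_fraction(reported_coverage)
    print(f"Expected uncovered fraction at 7.9x: {gap:.2e}")
```

Under these assumptions the reported 7.9× coverage implies reads of a few hundred bases each, and the idealized model predicts that only a very small fraction of the genome would be left unsampled before finishing.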
A variety of microbial communities and their genes (the microbiome) exist throughout the human body, playing fundamental roles in human health and disease. The NIH-funded Human Microbiome Project (HMP) Consortium has established a population-scale framework that catalyzed significant development of metagenomic protocols, resulting in a broad range of quality-controlled resources and data, including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data, available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 to 18 body sites up to three times, which, to date, have generated 5,177 microbial taxonomic profiles from 16S rRNA genes and over 3.5 Tb of metagenomic sequence. In parallel, approximately 800 human-associated reference genomes have been sequenced. Collectively, these data represent the largest resource to date describing the abundance and variety of the human microbiome, while providing a platform for current and future studies.
Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.
The 1,852,442-bp sequence of an M1 strain of Streptococcus pyogenes, a Gram-positive pathogen, has been determined and contains 1,752 predicted protein-encoding genes. Approximately one-third of these genes have no identifiable function, with the remainder falling into previously characterized categories of known microbial function. Consistent with the observation that S. pyogenes is responsible for a wider variety of human disease than any other bacterial species, more than 40 putative virulence-associated genes have been identified. Additional genes have been identified that encode proteins likely associated with microbial "molecular mimicry" of host characteristics and involved in rheumatic fever or acute glomerulonephritis. The complete or partial sequence of four different bacteriophage genomes is also present, with each containing genes for one or more previously undiscovered superantigen-like proteins. These prophage-associated genes encode at least six potential virulence factors, emphasizing the importance of bacteriophages in horizontal gene transfer and a possible mechanism for generating new strains with increased pathogenic potential.