A conceptual framework for integrating diverse functional genomics data was developed by reinterpreting experiments to provide numerical likelihoods that genes are functionally linked. This allows direct comparison and integration of different classes of data. The resulting probabilistic gene network estimates the functional coupling between genes. Within this framework, we reconstructed an extensive, high-quality functional gene network for Saccharomyces cerevisiae, consisting of 4681 (approximately 81%) of the known yeast genes linked by approximately 34,000 probabilistic linkages comparable in accuracy to small-scale interaction assays. The integrated linkages distinguish true from false-positive interactions in earlier data sets; new interactions emerge from genes' network contexts, as shown for genes in chromatin modification and ribosome biogenesis.
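One way to picture scoring heterogeneous evidence on a common likelihood scale is a log-likelihood ratio measured against a reference set of known linked and unlinked gene pairs. The sketch below is only an illustration under stated assumptions: the function names, the counts, and the naive summation across data sets (which assumes independent evidence) are hypothetical and are not taken from the abstract.

```python
from math import log

def log_likelihood_score(linked_hits, unlinked_hits, linked_total, unlinked_total):
    """Log ratio of the odds of linkage given the evidence to the prior odds.

    All four arguments are counts of gene pairs from a hypothetical reference
    set; a score above 0 means the evidence enriches for true linkages.
    """
    evidence_odds = linked_hits / unlinked_hits
    prior_odds = linked_total / unlinked_total
    return log(evidence_odds / prior_odds)

def integrate(scores):
    """Naive integration: sum the scores, assuming independent data sets."""
    return sum(scores)

# Hypothetical counts for two evidence types scored against the same reference.
coexpression = log_likelihood_score(80, 20, 1000, 9000)
two_hybrid = log_likelihood_score(30, 30, 1000, 9000)
combined = integrate([coexpression, two_hybrid])
```

Because every data set is expressed in the same units, a pair supported by weak evidence from several sources can be compared directly with a pair supported by one strong source.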
We report the complete sequence of the 4,274,642-bp genome of Haloarcula marismortui, a halophilic archaeal isolate from the Dead Sea. The genome is organized into nine circular replicons of varying G+C compositions ranging from 54% to 62%. Comparison of the genome architectures of Halobacterium sp. NRC-1 and H. marismortui suggests a common ancestor for the two organisms and a genome of significantly reduced size in the former. Both of these halophilic archaea use the same strategy of high surface negative charge of folded proteins as a means to circumvent the salting-out phenomenon in a hypersaline cytoplasm. A multitiered annotation approach, including primary sequence similarities, protein family signatures, structure prediction, and a protein function association network, has assigned putative functions for at least 58% of the 4242 predicted proteins, a far larger number than is usually achieved in most newly sequenced microorganisms. Among these assigned functions were genes encoding six opsins, 19 MCP and/or HAMP domain signal transducers, and an unusually large number of environmental response regulators (nearly five times as many as those encoded in Halobacterium sp. NRC-1), suggesting that H. marismortui is significantly more physiologically capable of exploiting diverse environments. In comparing the physiologies of the two halophilic archaea, in addition to the expected extensive similarity, we discovered several differences in their metabolic strategies and physiological responses, such as distinct pathways for arginine breakdown in each halophile. Finally, as expected from the larger genome, H. marismortui encodes many more functions and seems to have fewer nutritional requirements for survival than does Halobacterium sp. NRC-1.
We introduce a general computational method, applicable on a genome-wide scale, for the systematic discovery of uncharacterized cellular systems. Quantitative analysis of the coinheritance of pairs of genes among different organisms, calculated using phylogenetic profiles, allows the prediction of thousands of functional linkages between the corresponding proteins. A comparison of these functional linkages to known pathways reveals that calculated linkages are comparable in accuracy to genome-wide yeast two-hybrid screens or mass spectrometry interaction assays. In aggregate, these linkages describe the structure of large-scale networks, with the resulting yeast network composed of 3,875 linkages among 804 proteins, and the resulting pathogenic Escherichia coli network composed of 2,043 linkages among 828 proteins. The search of such networks for groups of uncharacterized, linked proteins led to the identification of 27 novel cellular systems from one nonpathogenic and three pathogenic bacterial genomes.
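The coinheritance idea can be sketched with binary phylogenetic profiles, where each gene is a vector marking its presence or absence across a set of genomes. The example below scores pairs by mutual information; the choice of that measure, the threshold, the gene names, and the profiles are all assumptions made for illustration, since the abstract does not state which statistic was used.

```python
from itertools import combinations
from math import log2

def mutual_information(a, b):
    """Mutual information (in bits) between two binary presence/absence profiles."""
    n = len(a)
    assert n == len(b) and n > 0
    mi = 0.0
    for x in (0, 1):
        for y in (0, 1):
            p_xy = sum(1 for i, j in zip(a, b) if i == x and j == y) / n
            p_x = sum(1 for i in a if i == x) / n
            p_y = sum(1 for j in b if j == y) / n
            if p_xy > 0:
                mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi

def linked_pairs(profiles, threshold=0.5):
    """Return gene pairs whose profiles share at least `threshold` bits."""
    return sorted(
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if mutual_information(profiles[g1], profiles[g2]) >= threshold
    )

# Hypothetical profiles over eight genomes (1 = ortholog present).
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneC": [1, 0, 1, 0, 1, 0, 1, 0],
}
links = linked_pairs(profiles)
```

Here geneA and geneB are inherited together in every genome and are therefore reported as linked, while geneC shares too little information with either to pass the threshold.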
Hepatitis C virus (HCV) infection is a major cause of liver disease and hepatocellular carcinoma. Glycan shielding has been proposed to be a mechanism by which HCV masks broadly neutralizing epitopes on its viral glycoproteins. However, the role of altered glycosylation in HCV resistance to broadly neutralizing antibodies is not fully understood. Here, we have generated potent HCV neutralizing antibodies hu5B3.v3 and MRCT10.v362 that, similar to the previously described AP33 and HCV1, bind to a highly conserved linear epitope on E2. We utilize a combination of in vitro resistance selections using the cell culture infectious HCV and structural analyses to identify mechanisms of HCV resistance to hu5B3.v3 and MRCT10.v362. Ultra-deep sequencing from in vitro HCV resistance selection studies identified resistance mutations at asparagine N417 (N417S, N417T and N417G) as early as 5 days post treatment. Comparison of the glycosylation status of soluble versions of the E2 glycoprotein containing the respective resistance mutations revealed a glycosylation shift from N417 to N415 in the N417S and N417T E2 proteins. The N417G E2 variant was glycosylated neither at residue 415 nor at residue 417 and remained sensitive to MRCT10.v362. Structural analyses of the E2 epitope bound to hu5B3.v3 Fab and MRCT10.v362 Fab using X-ray crystallography confirmed that residue N415 is buried within the antibody-peptide interface. Thus, in addition to previously described mutations at N415 that abrogate the β-hairpin structure of this E2 linear epitope, we identify a second escape mechanism, termed glycan shifting, that decreases the efficacy of broadly neutralizing HCV antibodies.
Many thousands of proteins encoded by the genome of Plasmodium falciparum, the causal organism of the deadliest form of human malaria, are of unknown function. It is of utmost importance that these proteins be characterized if we are to develop combative strategies against malaria based on the biology of the parasite. In an attempt to infer protein function on a genome-wide scale, we computationally modeled the P. falciparum interactome, elucidating local and global functional relationships between gene products. The resulting interaction network, reconstructed by integrating in silico and experimental functional genomics data within a Bayesian framework, covers ∼68% of the parasite genome and provides functional inferences for more than 2000 uncharacterized proteins, based on their associations. Network reconstruction involved the use of a novel strategy, where we incorporated continuously updated, uniform reference priors in our Bayesian model. This method for generating interaction maps is thus also well suited for application to other genomes, where pre-existing interactome knowledge is sparse. Additionally, we superimposed this map on genomes of three apicomplexan pathogens (Plasmodium yoelii, Toxoplasma gondii, and Cryptosporidium parvum), describing relationships between these organisms based on retained functional linkages. This comparison provided a glimpse of the highly evolved nature of P. falciparum; for instance, a deficit of nearly 26% in terms of predicted interactions is observed against P. yoelii, because of missing ortholog partners in pairs of functionally linked proteins.
[Supplemental material is available online at www.genome.org and results from this study are available for download from http://cbil.upenn.edu/plasmoMAP/.]
The genome sequence of Plasmodium falciparum, the causative organism of the deadliest form of human malaria, has revealed many surprising details about the parasite, including the novel nature of its many genes. More than 60% of the genome is as yet uncharacterized; a majority of the genes bear no acceptable sequence homology with known genes in other organisms (Gardner et al. 2002). If we are to develop effective control strategies against malaria based on parasite biology, it is essential that we characterize these many unknown genes and their products and understand the interactions between them, both locally and on a genome-wide scale.
Here we describe computational modeling of the Plasmodium falciparum interactome, which reveals local and genome-wide functional relationships between proteins in the parasite genome and permits functional assignments based on associations between characterized and uncharacterized proteins. The interactome, as captured by a network of pairwise functional linkages, was reconstructed by integrating data from publicly available P. falciparum transcriptome profiling studies and linkages generated in silico within a Bayesian framework. Data integration for inferring protein associations proves advantageous for two well-known reasons: first, combinin...
Little is known about the expression of methicillin-resistant Staphylococcus aureus (MRSA) genes during infection conditions. Here, we describe the transcriptome of the clinical MRSA strain USA300 derived from human cutaneous abscesses, and compare it with USA300 bacteria derived from infected kidneys in a mouse model. Remarkable similarity between the transcriptomes allowed us to identify genes encoding multiple proteases and toxins, and iron- and peptide-transporter molecules, which are upregulated in both infections and are likely important for establishment of infection. We also show that disruption of the global transcriptional regulators agr and sae prevents in vivo upregulation of many toxins and proteases, protecting mice from a lethal infection dose and hinting at the role of these transcriptional regulators in the pathology of MRSA infection.
In this report we describe the genomic sequence of guinea pig cytomegalovirus (GPCMV) assembled from a tissue culture-derived bacterial artificial chromosome clone, plasmid clones of viral restriction fragments, and direct PCR sequencing of viral DNA. The GPCMV genome is 232,678 bp, excluding the terminal repeats, and has a GC content of 55%. A total of 105 open reading frames (ORFs) of >100 amino acids with sequence and/or positional homology to other CMV ORFs were annotated. Positional and sequence homologs of human cytomegalovirus open reading frames UL23 through UL122 were identified. Homology with other cytomegaloviruses was most prominent in the central ~60% of the genome, with divergence of sequence and lack of conserved homologs at the respective genomic termini. Of interest, the GPCMV genome was found in many cases to bear stronger phylogenetic similarity to primate CMVs than to rodent CMVs. The sequence of GPCMV should facilitate vaccine and pathogenesis studies in this model of congenital CMV infection.
Findings
Guinea pig cytomegalovirus (GPCMV) serves as a useful model of congenital infection, due to the ability of the virus to cross the placenta and infect the fetus in utero [1][2][3]. This model is well-suited to vaccine studies for prevention of congenital cytomegalovirus (CMV) infection, a major public health problem and a high-priority area for new vaccine development [4]. However, an impediment to studies in this model has been the lack of detailed DNA sequence data. Although a number of reports have identified specific gene products or clusters of genes [5][6][7][8][9][10][11], to date a full genomic sequence has not been available.
We recently reported the construction and preliminary sequence map of a GPCMV bacterial artificial chromosome (BAC) clone maintained in E. coli [12,13], and this clone was used as an initial template for sequence analysis of the full GPCMV genome. BAC DNA was purified using Clontech's NucleoBond® Plasmid Kits as described previously [14] and both strands were sequenced using an ABI PRISM® 377 DNA Sequencer, with primers synthesized, as needed, to 'primer-walk' the nucleotide sequence. In parallel, HindIII- and EcoRI-digested fragments were gel-purified and cloned into pUC and pBR322-based vectors as previously described [15]. Plasmid sequences were