Natalia Maltsev scite author profile

The use of gene clusters to infer functional coupling

Previously, we presented evidence that it is possible to predict functional coupling between genes based on conservation of gene clusters between genomes. With the rapid increase in the availability of prokaryotic sequence data, it has become possible to verify and apply the technique. In this paper, we extend our characterization of the parameters that determine the utility of the approach, and we generalize the approach in a way that supports detection of common classes of functionally coupled genes (e.g., transport and signal transduction clusters). Now that the analysis includes over 30 complete or nearly complete genomes, it has become clear that this approach will play a significant role in supporting efforts to assign functionality to the remaining uncharacterized genes in sequenced genomes.

Gene clusters are known to be prominent features of bacterial chromosomes. Demerec and Hartman (1) postulated in 1959 that "regardless of how the gene clusters originated, natural selection must act to prevent their separation" and the "mere existence of such arrangements shows that they must be beneficial, conferring an evolutionary advantage on individuals and populations which exhibit them." One of the most striking features of prokaryotic gene clusters is that typically they are composed of functionally related genes. For the past 40 years, there has been vigorous, ongoing discussion on the functional significance of gene arrangement on the chromosome, as well as the origin and mechanisms of maintenance of gene clusters (see, for example, refs. 2-5).

Here, we present a method that uses conserved gene clusters from a large number of genomes to predict functional coupling between genes in those genomes. This article further develops the approach that we previously reported (6) and uses this method to reconstruct several major metabolic and functional subsystems.
Methodology

The data presented below are computed via the WIT system (http://wit.mcs.anl.gov/WIT2/), developed by Overbeek et al. (7) at Argonne National Laboratory. WIT was designed and implemented to support genetic sequence analysis, metabolic reconstructions, and comparative analysis of sequenced genomes; it currently contains data from over 30 genomes, although a few of them are incomplete.

Our approach to detection of conserved clusters of genes is based on the following definitions: a set of genes occurring on a prokaryotic chromosome will be called a "run" if and only if they all occur on the same strand and the gaps between adjacent genes are 300 bp or less. Any pair of genes occurring within a single run is called "close." Given two genes X_a and X_b from two genomes G_a and G_b, X_a and X_b are called a "bidirectional best hit (BBH)" if and only if recognizable similarity exists between them (in our case, we required FASTA3 scores lower than 1.0 × 10^-5), there is no gene Z_b in G_b that is more similar than X_b is to X_a, and there is no gene Z_a in G_a that is more similar than X_a is to Computation of PCBBHs for 31 complete or ne...
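The definitions of "run", "close", and "bidirectional best hit" above can be illustrated with a minimal Python sketch. It is not the WIT implementation. The `Gene` record, the function names, and the score dictionary are hypothetical, introduced only for illustration. The sketch assumes that an intervening gene on the opposite strand ends a run, that lower scores mean greater similarity, that ties go to the first hit seen, and that the BBH condition is symmetric where the excerpt is cut off.

```python
from dataclasses import dataclass
from itertools import combinations

MAX_GAP_BP = 300          # gap limit between adjacent genes, from the text
SCORE_THRESHOLD = 1.0e-5  # FASTA3 score cutoff, from the text


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: identifier, strand ('+' or '-'), coordinates."""
    gene_id: str
    strand: str
    start: int
    end: int


def find_runs(genes, max_gap=MAX_GAP_BP):
    """Group genes into runs: same strand, adjacent gaps of max_gap bp or less.

    Assumption: a gene on the opposite strand ends the current run.
    """
    runs, current = [], []
    for gene in sorted(genes, key=lambda g: g.start):
        same_run = (
            current
            and gene.strand == current[-1].strand
            and gene.start - current[-1].end <= max_gap
        )
        if same_run:
            current.append(gene)
        else:
            if current:
                runs.append(current)
            current = [gene]
    if current:
        runs.append(current)
    return runs


def close_pairs(runs):
    """Return every pair of gene ids that share a run (the 'close' pairs)."""
    return {
        frozenset((g1.gene_id, g2.gene_id))
        for run in runs
        for g1, g2 in combinations(run, 2)
    }


def bidirectional_best_hits(scores, threshold=SCORE_THRESHOLD):
    """Return (a, b) pairs that are each other's best hit below the threshold.

    `scores` maps (gene in G_a, gene in G_b) to a similarity score,
    where a lower score means a more similar pair.
    """
    best_for_a, best_for_b = {}, {}
    for (a, b), score in scores.items():
        if score >= threshold:
            continue  # no recognizable similarity
        if a not in best_for_a or score < scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or score < scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    return {(a, b) for a, b in best_for_a.items() if best_for_b.get(b) == a}
```

Keeping run detection separate from the similarity step mirrors the way the definitions are stated: closeness is a property of one genome, while a BBH relates genes across two genomes.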


The minimum information about a genome sequence (MIGS) specification

Field et al. 2008


The BioPAX community standard for pathway data sharing

Demir, Cary, Paley et al. 2010, Nat Biotechnol



BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data (http://www.biopax.org). Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery.


WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction

Overbeek, Larsen, Pusch et al. 2000



The WIT (What Is There) (http://wit.mcs.anl.gov/WIT2/) system has been designed to support comparative analysis of sequenced genomes and to generate metabolic reconstructions based on chromosomal sequences and metabolic modules from the EMP/MPW family of databases. This system contains data derived from about 40 completed or nearly completed genomes. Sequence homologies, various ORF-clustering algorithms, relative gene positions on the chromosome and placement of gene products in metabolic pathways (metabolic reconstruction) can be used for the assignment of gene functions and for development of overviews of genomes within WIT. The integration of a large number of phylogenetically diverse genomes in WIT facilitates the understanding of the physiology of different organisms.


Identification of Francisella tularensis Himar1-Based Transposon Mutants Defective for Replication in Macrophages

et al. 2007


Francisella tularensis, the etiologic agent of tularemia in humans, is a potential biological threat due to its low infectious dose and multiple routes of entry. F. tularensis replicates within several cell types, eventually causing cell death by inducing apoptosis. In this study, a modified Himar1 transposon (HimarFT) was used to mutagenize F. tularensis LVS. Approximately 7,000 Km^r clones were screened using J774A.1 macrophages for reduction in cytopathogenicity based on retention of the cell monolayer. A total of 441 candidates with significant host cell retention compared to the parent were identified following screening in a high-throughput format. Retesting at a defined multiplicity of infection followed by in vitro growth analyses resulted in identification of approximately 70 candidates representing 26 unique loci involved in macrophage replication and/or cytotoxicity. Mutants carrying insertions in seven hypothetical genes were screened in a mouse model of infection, and all strains tested appeared to be attenuated, which validated the initial in vitro results obtained with cultured macrophages. Complementation and reverse transcription-PCR experiments suggested that the expression of genes adjacent to the HimarFT insertion may be affected depending on the orientation of the constitutive groEL promoter region used to ensure transcription of the selective marker in the transposon. A hypothetical gene, FTL_0706, postulated to be important for lipopolysaccharide biosynthesis, was confirmed to be a gene involved in O-antigen expression in F. tularensis LVS and Schu S4. These and other studies demonstrate that therapeutic targets, vaccine candidates, or virulence-related genes may be discovered utilizing classical genetic approaches in Francisella.

Francisella tularensis is a gram-negative intracellular pathogen and the etiologic agent of human tularemia. The CDC has classified F. tularensis as a category A select agent due to its highly infectious nature and ease of dissemination. Four subspecies of F. tularensis have been recognized, including (i) the virulent type A F. tularensis subsp. tularensis, (ii) the less virulent type B F. tularensis subsp. holarctica, (iii) F. tularensis subsp. mediasiatica, and (iv) F. tularensis subsp. novicida. The F. tularensis LVS (live vaccine strain) is derived from F. tularensis subsp. holarctica and is used as a model system to identify Francisella virulence factors since it is attenuated in humans but virulent in mice (8, 21). The limited genetic variation (2 to 4%) between the subspecies of Francisella suggests that there is potential overlap among genes related to pathogenesis (7, 54, 59). In fact, F. tularensis LVS and Schu S4 vary in genomic sequence by less than 1% (59). Regardless of the high sequence similarity at the genomic level, genome rearrangement and variation at the functional or regulatory level among the subspecies clearly result in phenotypes that impact virulence and pathogenesis (12, 15, 29, 54, 73, 74).

The life cycle of F. tularensis inside the macrophage ...


