Knowledge of toxins, virulence factors and antibiotic resistance genes is essential for bio-defense applications aimed at identifying ‘functional’ signatures for characterizing emerging or engineered pathogens. Whereas genetic signatures identify a pathogen, functional signatures identify what a pathogen is capable of. To facilitate rapid identification of sequences and characterization of genes for signature discovery, we have collected and organized all sequences publicly available as of this writing that represent known toxins, virulence factors, and antibiotic resistance genes in one convenient database, which we believe will be of use to the bio-defense research community. MvirDB integrates DNA and protein sequence information from Tox-Prot, SCORPION, the PRINTS virulence factors, VFDB, TVFac, Islander, ARGO and a subset of VIDA. Entries in MvirDB are hyperlinked back to their original sources. A blast tool allows the user to blast against all DNA or protein sequences in MvirDB, and a browser tool allows the user to search the database to retrieve virulence factor descriptions, sequences, and classifications, and to download sequences of interest. MvirDB has an automated weekly update mechanism. Each protein sequence in MvirDB is annotated using our fully automated protein annotation system and is linked to that system's browser tool. MvirDB can be accessed at .
To illuminate the function and evolutionary history of both genomes, we sequenced mouse DNA related to human chromosome 19. Comparative sequence alignments yielded confirmatory evidence for hypothetical genes and identified exons, regulatory elements, and candidate genes that were missed by other predictive methods. Chromosome-wide comparisons revealed a difference between single-copy HSA19 genes, which are overwhelmingly conserved in mouse, and genes residing in tandem familial clusters, which differ extensively in number, coding capacity, and organization between the two species. Finally, we sequenced breakpoints of all 15 evolutionary rearrangements, providing a view of the forces that drive chromosome evolution in mammals.
Motivation: Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes gene prediction difficult for the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths, where open reading frames are favorable, and overlaps and gaps are less favorable, but still possible. We represent this network of connections as a weighted graph, and use dynamic programming to find the optimal path. Results: We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank’s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes. Availability and implementation: https://github.com/deprekate/PHANOTATE. Supplementary information: Supplementary data are available at Bioinformatics online.
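The weighted-graph idea in the abstract above can be illustrated with a minimal Python sketch. It is not PHANOTATE's implementation or scoring: the `Orf` class, the three weights, and the `best_path` function are hypothetical names introduced here, the sketch considers a single strand only, and it does not allow a gene nested entirely inside another. Each open reading frame contributes a negative (favorable) cost, gaps and overlaps between consecutive ORFs contribute positive costs, and dynamic programming over the start-sorted ORFs finds the lowest-cost path from one end of the genome to the other.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical weights, chosen only for illustration (not PHANOTATE's values).
ORF_REWARD = 1.0       # reward per base covered by a chosen ORF
GAP_PENALTY = 0.5      # cost per base left uncovered between ORFs
OVERLAP_PENALTY = 2.0  # cost per base shared by two consecutive ORFs


@dataclass(frozen=True)
class Orf:
    start: int  # 0-based, inclusive
    end: int    # exclusive

    @property
    def length(self) -> int:
        return self.end - self.start


def edge_cost(prev_end: int, next_start: int) -> float:
    """Cost of the connection between two positions: a gap or an overlap."""
    distance = next_start - prev_end
    if distance >= 0:
        return GAP_PENALTY * distance
    return OVERLAP_PENALTY * -distance


def best_path(orfs: List[Orf], genome_length: int) -> Tuple[List[Orf], float]:
    """Return the lowest-cost chain of ORFs and its total cost."""
    ordered = sorted(orfs, key=lambda o: (o.start, o.end))
    best: List[float] = []          # best[i]: cheapest path ending at ordered[i]
    back: List[Optional[int]] = []  # back[i]: predecessor index on that path

    for i, cur in enumerate(ordered):
        # Option 1: cur is the first ORF on the path.
        cost = edge_cost(0, cur.start) - ORF_REWARD * cur.length
        prev_index: Optional[int] = None
        # Option 2: cur follows an earlier ORF (gap or partial overlap).
        for j in range(i):
            prev = ordered[j]
            if prev.start < cur.start and prev.end < cur.end:
                cand = (best[j] + edge_cost(prev.end, cur.start)
                        - ORF_REWARD * cur.length)
                if cand < cost:
                    cost, prev_index = cand, j
        best.append(cost)
        back.append(prev_index)

    # Close the path at the genome end; the empty path is one long gap.
    total = GAP_PENALTY * genome_length
    last: Optional[int] = None
    for i, cur in enumerate(ordered):
        cand = best[i] + edge_cost(cur.end, genome_length)
        if cand < total:
            total, last = cand, i

    path: List[Orf] = []
    while last is not None:
        path.append(ordered[last])
        last = back[last]
    path.reverse()
    return path, total


if __name__ == "__main__":
    candidates = [Orf(0, 30), Orf(10, 20), Orf(25, 60), Orf(35, 60), Orf(58, 100)]
    chosen, cost = best_path(candidates, genome_length=100)
    print(chosen, cost)
```

With these illustrative weights, a longer ORF that slightly overlaps its neighbor can still beat a shorter ORF that leaves a gap, which is the trade-off the abstract describes. Changing the three constants changes which path wins.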
Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale effort of pathogen detection in early 2000 when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Lab (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies. Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application. 
It is hoped that a discussion of a real-life practical application of comparative genomics algorithms may help spur algorithm developers to tackle some of the many remaining problems that need to be addressed. Solutions to these problems will advance a wide range of biological disciplines, only one of which is pathogen detection. For example, exploration in evolution and phylogenetics, annotating gene coding regions, predicting and understanding gene function and regulation, and untangling gene networks all rely on tools for aligning multiple sequences, detecting gene rearrangements and duplications, and visualising genomic data. Two key problems currently needing improved solutions are: (1) aligning incomplete, fragmentary sequence (eg draft genome contigs or arbitrary genome regions) with both complete genomes and other fragmentary sequences; and (2) ordering, aligning and visualising non-colinear gene rearrangements and inversions in addition to the colinear alignments handled by current tools.
Nucleotide sequence analysis of the genome of the baculovirus Bombyx mori nuclear polyhedrosis virus (BmNPV) identified 18 homologues of the Autographa californica NPV (AcNPV) lefs (late expression factor genes). These BmNPV lefs showed high (73-98%) amino acid sequence identities to AcNPV lefs and were localized to similar positions in the genome. One lef, p35, was previously characterized in AcNPV and BmNPV deletion experiments. Functional deletion of each of the BmNPV lef homologues was attempted here by insertion of a beta-galactosidase gene cassette into the coding region of each lef. Deletion mutants were successfully isolated for four of the 18 BmNPV lefs (39K, ie-2, lef-7, and p35), indicating that the other 14 BmNPV lefs were likely essential for viral replication in cell culture. Further analysis showed that deletion of lef-7, p35, and ie-2 resulted in lower levels of viral DNA replication, indicating that the BmNPV lef-7, p35, and ie-2 products have stimulating effects on DNA replication. Deletion of 39K resulted in a significantly lower level of late gene transcription and extremely low (over 10²-fold less at 48-80 hr p.i.) production of progeny budded virus in BmN cells. In contrast, the deletion did not affect viral DNA replication, indicating that BmNPV 39K is involved in late gene transcription. Reduced late gene expression presumably affected production and/or release of progeny budded virus particles. This was corroborated by transmission electron microscopy, which showed that virus replication was abnormal in BmN cells infected with a BmNPV mutant lacking 39K and virion production was low. Even though 39K deletion resulted in a loss of oral infectivity, the 39K deletion mutant replicated in silkworm larvae when injected into the body cavity, as did the ie-2, lef-7, and p35 deletion mutants.
In addition, a BmNPV homologue of the baculovirus very late expression factor gene (vlf-1) found in AcNPV was essential, implying an essential function of the BmNPV vlf-1 homologue at a step before the onset of very late gene expression.
Deuterium-labeled cocaine (cocaine-d5) was administered intravenously and/or intranasally in doses of 0.6-4.2 mg/kg to 25 human volunteers under laboratory clinical conditions. Sequential blood samples were collected for up to 3 days, and hair samples were collected for up to 10 months. Samples were analyzed by gas chromatography-mass spectrometry (GC-MS) for cocaine-d5 and its major metabolite, benzoylecgonine-d5 (BZE-d5). The parent drug, cocaine-d5, was the predominant analyte in hair, whereas BZE-d5 was the major analyte in blood, especially at later time periods. The amount of cocaine-d5 incorporated into hair ranged from 0.1 to 5 ng/mg hair, whereas the amount of BZE-d5 was approximately one-sixth of that concentration. The threshold dose for detection was estimated to be 25-35 mg of drug administered intravenously. A single dose could be detected for 2-6 months. Subjects receiving the same dose differed (from two to 12 times as much depending upon how it was measured) in the amount of cocaine-d5 incorporated into their hair. Non-Caucasians, in particular, incorporated more cocaine-d5 in hair than did Caucasians. Also, segmental analysis of the samples revealed considerable intersubject variability in the time drug first appeared in hair and the rate at which the drug moved along the hair shaft with time. These interindividual differences could not be explained by differences in plasma pharmacokinetics. Considered together, these results suggest that cocaine incorporation into hair may occur by way of multiple mechanisms--by way of sweat and sebum, for example--and at various times during the hair growth cycle. Thus, hair analysis using GC-MS appears to be a very sensitive method for detecting cocaine ingestion. However, within the range of doses used in the present study, hair does not provide a particularly accurate record of either the amount, time, or duration of drug use.
RNA-dependent RNA polymerase (RdRp) is essential to viral replication and is therefore one of the primary targets of countermeasures against these dangerous infectious agents. Development of broad-spectrum therapeutics targeting polymerases has been hampered by the extreme sequence variability of these enzymes. RdRps range in length from 400 to 800 residues, yet contain only ∼20 residues that are conserved in most species. In this study, we made structure-based comparisons that are independent of sequence composition using a recently developed algorithm. We identified residue-to-residue correspondences of multiple protein structures and created (two-dimensional) structure-based alignment maps of 37 polymerase structures that provide both sequence and structure details. Using these maps, we determined that ∼75% of each polymerase species consists of seven protein segments, each of which has high structural similarity to segments in other species, though they are widely divergent in sequence composition and order. We define each of these segments as a ‘homomorph’, and each includes (though most are much larger than) the well-known conserved polymerase motifs. All homomorphs contact the template tunnel or nucleoside triphosphate (NTP) entry tunnel and the exterior of the protein, suggesting they constitute a structural and functional skeleton common among the polymerases.
Mutagenesis experiments on the baculovirus Bombyx mori nucleopolyhedrovirus (BmNPV) using 5-bromo-2'-deoxyuridine generated five mutants with a 'few polyhedra' (FP) phenotype. Sequence analysis of the 25K gene homologue of the BmNPV FP mutants revealed nucleotide substitutions in the coding region. Rescue experiments indicated that the FP phenotype of the BmNPV mutants resulted from mutations in the 25K coding region. Effects of infection by these FP mutants were analysed following injection of the viruses into silkworm (B. mori) larvae. Compared to infection with wild-type virus, infection with each FP mutant resulted in reduced host degradation (liquefaction). The degree to which liquefaction was blocked corresponded to the degree of functional disruption of the 25K gene product and to the extent to which polyhedron production was reduced. Electron microscopy revealed that (1) polyhedron production was reduced, (2) very few virions were occluded and those that were lacked envelopes, and (3) the basal lamina of fat body tissue was not destroyed by infection and accumulations of virions occurred along the membrane. Typical NPV-induced liquefaction was observed following infection with a polyhedrin deletion mutant, indicating that host degradation was not related to polyhedron production. These results suggest that (1) the 25K gene product is involved in the host degradation process caused by virus infection and (2) the FP phenotype is an indirect result of disruption of the 25K gene; activation or suppression of a specific host or viral gene related to tissue degradation and polyhedron formation may be involved.