Streptococcus mutans is the leading cause of dental caries (tooth decay) worldwide and is considered to be the most cariogenic of all of the oral streptococci. The genome of S. mutans UA159, a serotype c strain, has been completely sequenced and is composed of 2,030,936 base pairs. It contains 1,963 ORFs, 63% of which have been assigned putative functions. The genome analysis provides further insight into how S. mutans has adapted to surviving the oral environment through resource acquisition, defense against host factors, and use of gene products that maintain its niche against microbial competitors. S. mutans metabolizes a wide variety of carbohydrates via nonoxidative pathways, and all of these pathways have been identified, along with the associated transport systems whose genes account for almost 15% of the genome. Virulence genes associated with extracellular adherent glucan production, adhesins, acid tolerance, proteases, and putative hemolysins have been identified. Strain UA159 is naturally competent and contains all of the genes essential for competence and quorum sensing. Mobile genetic elements in the form of IS elements and transposons are prominent in the genome and include a previously uncharacterized conjugative transposon and a composite transposon containing genes for the synthesis of antibiotics of the gramicidin/bacitracin family; however, no bacteriophage genomes are present.
In 1995, the Institute for Genomic Research completed the genome sequence of a rough derivative of Haemophilus influenzae serotype d, strain KW20. Although extremely useful in understanding the basic biology of H. influenzae, these data have not provided significant insight into disease caused by nontypeable H. influenzae, as serotype d strains are not pathogens. In contrast, strains of nontypeable H. influenzae are the primary pathogens of chronic and recurrent otitis media in children. In addition, these organisms have an important role in acute otitis media in children as well as other respiratory diseases. Such strains must therefore contain a gene repertoire that differs from that of strain Rd. Elucidation of the differences between these genomes will thus provide insight into the pathogenic mechanisms of nontypeable H. influenzae. The genome of a representative nontypeable H. influenzae strain, 86-028NP, isolated from a patient with chronic otitis media was therefore sequenced and annotated. Despite large regions of synteny with the strain Rd genome, there are large rearrangements in strain 86-028NP's genome architecture relative to the strain Rd genome. A genomic island similar to an island originally identified in H. influenzae type b is present in the strain 86-028NP genome, while the mu-like phage present in the strain Rd genome is absent from the strain 86-028NP genome. Two hundred eighty open reading frames were identified in the strain 86-028NP genome that were absent from the strain Rd genome. These data provide new insight that complements and extends the ongoing analysis of nontypeable H. influenzae virulence determinants.
In 1995 Haemophilus influenzae strain Rd, a rough derivative of H. influenzae serotype d strain KW20 (strain Rd hereafter), became the first free-living organism to have its genome sequenced to completion (34).
Importantly, this also helped establish the large-scale shotgun approach, mated with the utilization of a scaffolding library and computer-assisted assembly, as a rational and expeditious approach for the sequencing of small bacterial genomes. Strain Rd was chosen as the prototypic bacterium for complete genome sequencing as it has a genome size representative of other bacteria and a G+C content close to that of the human genome. Additionally, at the time of sequencing, a physical map of the strain Rd genome did not exist, so this genome was a good test for the approach of shotgun sequencing, scaffolding, and assembly (34).
Although strain Rd is the exemplar organism for the current small-genome sequencing rationale and an important model organism for studying H. influenzae biology, strain Rd is a poor model for the study of pathogenicity caused by members of the genus Haemophilus. Serotype b strains of H. influenzae cause invasive diseases, for example, meningitis, and nontypeable H. influenzae (NTHi) strains principally have a role in localized respiratory disease, particularly in otitis media, acute sinusitis, and community-acquired pneumonia and have important conseque...
To ensure survival, most bacteria must acquire iron, a resource that is sequestered by mammalian hosts. Pathogenic bacteria have therefore evolved intricate systems to sense iron limitation and regulate gene expression appropriately. We used a pan-Neisseria microarray to examine genes regulated in Neisseria gonorrhoeae in response to iron availability in defined medium. Overall, 203 genes varied in expression, 109 up-regulated and 94 down-regulated by iron deprivation. In iron-replete medium, genes essential to rapid bacterial growth were preferentially expressed, while iron transport functions, and predominantly genes of unknown function, were expressed in low-iron medium. Of those TonB-dependent proteins encoded in the FA1090 genome with unknown ligand specificity, expression of three was not controlled by iron availability, suggesting that these receptors may not be high-affinity transporters for iron-containing ligands. Approximately 30% of the operons regulated by iron appeared to be directly under control of Fur. Our data suggest a regulatory cascade where Fur indirectly controls gene expression by affecting the transcription of three secondary regulators. Our data also suggest that a second MerR-like regulator may be directly responding to iron availability and controlling transcription independent of the Fur protein. Comparison of our data with those recently published for Neisseria meningitidis revealed that only a small portion of genes were found to be similarly regulated in these closely related pathogens, while a large number of genes derepressed during iron starvation were unique to each organism.
The goal of pharmacovigilance is to detect, monitor, characterize and prevent adverse drug events (ADEs) with pharmaceutical products. This article is a comprehensive structured review of recent advances in applying natural language processing (NLP) to electronic health record (EHR) narratives for pharmacovigilance. We review methods of varying complexity and problem focus, summarize the current state-of-the-art in methodology advancement, discuss limitations and point out several promising future directions. The ability to accurately capture both semantic and syntactic structures in clinical narratives becomes increasingly critical to enable efficient and accurate ADE detection. Significant progress has been made in algorithm development and resource construction since 2000. Since 2012, statistical analysis and machine learning methods have gained traction in automation of ADE mining from EHR narratives. Current state-of-the-art methods for NLP-based ADE detection from EHRs show promise regarding their integration into production pharmacovigilance systems. In addition, integrating multifaceted, heterogeneous data sources has shown promise in improving ADE detection and has become increasingly adopted. On the other hand, challenges and opportunities remain across the frontier of NLP application to EHR-based pharmacovigilance, including proper characterization of ADE context, differentiation between off- and on-label drug-use ADEs, recognition of the importance of polypharmacy-induced ADEs, better integration of heterogeneous data sources, creation of shared corpora, and organization of shared-task challenges to advance the state-of-the-art.
Nucleic acid-binding proteins are involved in a great number of cellular processes. Understanding the mechanisms underlying these proteins first requires the identification of specific residues involved in nucleic acid binding. Prediction of NA-binding residues can provide practical assistance in the functional annotation of NA-binding proteins. Predictions can also be used to expedite mutagenesis experiments, guiding researchers to the correct binding residues in these proteins. Here, we present a method for the identification of amino acid residues involved in DNA- and RNA-binding using sequence-based attributes. The method used in this work combines the C4.5 algorithm with bootstrap aggregation and cost-sensitive learning. Our DNA-binding model achieved 79.1% accuracy, while the RNA-binding model reached an accuracy of 73.2%. The NAPS web server is freely available at http://proteomics.bioengr.uic.edu/NAPS.
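The combination of a decision-tree learner, bootstrap aggregation, and cost-sensitive learning named in the abstract above can be illustrated with a minimal sketch. This is not the NAPS implementation: a depth-one decision stump chosen by cost-weighted information gain stands in for C4.5 (which grows and prunes full trees and uses gain ratio), and the two numeric features, the toy residues, and the cost values are all hypothetical. The costs enter through split selection and leaf labeling, while the final vote across bootstrapped stumps is unweighted.

```python
import math
import random
from collections import Counter

def class_weights(labels, cost):
    """Sum the misclassification cost carried by each class in `labels`."""
    totals = Counter()
    for label in labels:
        totals[label] += cost[label]
    return totals

def weighted_entropy(labels, cost):
    """Entropy in bits, with every example weighted by the cost of its class."""
    totals = class_weights(labels, cost)
    total = sum(totals.values())
    return -sum((w / total) * math.log2(w / total) for w in totals.values())

def weighted_majority(labels, cost):
    """Class with the largest total cost (ties go to the larger summed weight first seen in sorted order)."""
    totals = class_weights(labels, cost)
    return max(sorted(totals), key=totals.get)

def fit_stump(X, y, cost):
    """Return (feature, threshold, left_label, right_label) maximizing weighted gain."""
    base = weighted_entropy(y, cost)
    total_w = sum(cost[label] for label in y)
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            wl = sum(cost[label] for label in left)
            wr = total_w - wl
            gain = (base
                    - (wl / total_w) * weighted_entropy(left, cost)
                    - (wr / total_w) * weighted_entropy(right, cost))
            if best is None or gain > best[0]:
                best = (gain, f, t,
                        weighted_majority(left, cost),
                        weighted_majority(right, cost))
    if best is None:  # no usable split, e.g. a one-class bootstrap sample
        label = weighted_majority(y, cost)
        return (0, math.inf, label, label)
    return best[1:]

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_bagged(X, y, cost, n_trees=25, seed=0):
    """Bootstrap aggregation: fit one stump per resample drawn with replacement."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], cost))
    return stumps

def predict_bagged(stumps, row):
    """Unweighted majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two made-up sequence-derived features per residue;
# label 1 = binding residue, 0 = non-binding residue.
X = [[0.10, 0.4], [0.15, 0.9], [0.20, 0.1], [0.25, 0.6], [0.30, 0.3],
     [0.22, 0.8], [0.70, 0.5], [0.80, 0.2], [0.85, 0.7], [0.90, 0.4]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# Assumed costs: missing a binding residue counts three times as much.
COST = {0: 1.0, 1: 3.0}

model = fit_bagged(X, y, COST)
```

Calling `predict_bagged(model, [0.95, 0.5])` classifies a new residue from its feature vector. Raising the cost of the minority (binding) class is one common way to counter class imbalance, which is typical of residue-level binding data.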