Background: Multiple infection sources for enterohemorrhagic Escherichia coli O157:H7 (EHEC) are known, including animal products, fruit and vegetables. The ecology of this pathogen outside its human host is largely unknown and one third of its annotated genes are still hypothetical. To identify genetic determinants expressed under a variety of environmental factors, we applied strand-specific RNA-sequencing, comparing the SOLiD and Illumina systems. Results: Transcriptomes of EHEC were sequenced under 11 different biotic and abiotic conditions: LB medium at pH 4, pH 7, pH 9, or at 15 °C; LB with nitrite or trimethoprim-sulfamethoxazole; LB-agar surface, M9 minimal medium, spinach leaf juice, surface of living radish sprouts, and cattle feces. Of 5379 annotated genes in strain EDL933 (genome and plasmid), a surprising minority of only 144 had null sequencing reads under all conditions. We therefore developed a statistical method to distinguish weakly transcribed genes from background transcription. We find that 96% of all genes and 91.5% of the hypothetical genes exhibit a significant transcriptional signal under at least one condition. Comparing SOLiD and Illumina systems, we find a high correlation between both approaches for fold-changes of the induced or repressed genes. The pathogenicity island LEE showed highest transcriptional activity in LB medium, minimal medium, and after treatment with antibiotics. Unique sets of genes, including many hypothetical genes, are highly up-regulated on radish sprouts, cattle feces, or in the presence of antibiotics. Furthermore, we observed induction of the Shiga toxin-carrying phages by antibiotics and confirmed active biofilm-related genes on radish sprouts, in cattle feces, and on agar plates. Conclusions: Since only a minority of genes (2.7%) were not active under any condition tested (null reads), we suggest that the assumption of significant genome over-annotations is wrong. Environmental transcriptomics uncovered hitherto unknown gene functions and unique regulatory patterns in EHEC. For instance, the environmental function of azoR had been elusive, but this gene is highly active on radish sprouts. Thus, NGS-transcriptomics is an appropriate technique to propose new roles of hypothetical genes and to guide future research. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-353) contains supplementary material, which is available to authorized users.
Background: Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. Results: RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the −2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities, and 5’ rapid amplification of cDNA ends indicated the transcriptional start 40 bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. Conclusions: Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli’s central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists. Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0558-z) contains supplementary material, which is available to authorized users.
Consider a large Boolean network with a feed-forward structure. Given a probability distribution on the inputs, can one find collections of input nodes, possibly small ones, that determine the states of most other nodes in the network? To answer this question, a notion that quantifies the determinative power of an input over the states of the nodes in the network is needed. We argue that the mutual information (MI) between a given subset of the inputs X={X1,...,Xn} of some node i and its associated function fi(X) quantifies the determinative power of this set of inputs over node i. We compare the determinative power of a set of inputs to the sensitivity to perturbations of these inputs, and find that, perhaps surprisingly, an input that has large sensitivity to perturbations does not necessarily have large determinative power. However, for unate functions, which play an important role in genetic regulatory networks, we find a direct relation between MI and sensitivity to perturbations. As an application of our results, we analyze the large-scale regulatory network of Escherichia coli. We identify the most determinative nodes and show that a small subset of those reduces the overall uncertainty of the network state significantly. Furthermore, the network is found to be tolerant to perturbations of its inputs.
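The contrast between determinative power and sensitivity can be illustrated on small functions. The following is a minimal sketch, not the authors' implementation: it assumes independent inputs with a uniform distribution by default, enumerates the full truth table (so it only suits small n), and uses hypothetical names (`mutual_information`, `influence`) introduced here for illustration.

```python
from itertools import product
from math import log2

def mutual_information(f, n, subset, p=None):
    """MI in bits between the inputs indexed by `subset` and f(X).

    Assumption: inputs are independent with P(X_i = 1) = p[i]
    (uniform, p[i] = 0.5, when p is not given).
    """
    p = p or [0.5] * n
    joint = {}  # (values of the chosen inputs, output) -> probability
    for x in product((0, 1), repeat=n):
        prob = 1.0
        for xi, pi in zip(x, p):
            prob *= pi if xi else 1.0 - pi
        key = (tuple(x[i] for i in subset), f(x))
        joint[key] = joint.get(key, 0.0) + prob
    p_a, p_y = {}, {}  # marginals of the chosen inputs and of the output
    for (a, y), pr in joint.items():
        p_a[a] = p_a.get(a, 0.0) + pr
        p_y[y] = p_y.get(y, 0.0) + pr
    return sum(pr * log2(pr / (p_a[a] * p_y[y]))
               for (a, y), pr in joint.items() if pr > 0)

def influence(f, n, i):
    """Probability (uniform inputs) that flipping input i changes f."""
    changed = 0
    for x in product((0, 1), repeat=n):
        flipped = x[:i] + (1 - x[i],) + x[i + 1:]
        changed += f(x) != f(flipped)
    return changed / 2 ** n

xor2 = lambda x: x[0] ^ x[1]
and2 = lambda x: x[0] & x[1]
```

Under uniform inputs, a single input of the two-input XOR has influence 1 yet zero MI with the output, whereas for the unate AND function both quantities are nonzero. This matches the abstract's point that high sensitivity does not by itself imply high determinative power.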
Background: While NGS allows rapid global detection of transcripts, it remains difficult to distinguish ncRNAs from short mRNAs. To detect potentially translated RNAs, we developed an improved protocol for bacterial ribosomal footprinting (RIBOseq). This allowed distinguishing ncRNA from mRNA in EHEC. A high ratio of ribosomal footprints per transcript (ribosomal coverage value, RCV) is expected to indicate a translated RNA, while a low RCV should point to a non-translated RNA. Results: Based on their low RCV, 150 novel non-translated EHEC transcripts were identified as putative ncRNAs, representing both antisense and intergenic transcripts, 74 of which had expressed homologs in E. coli MG1655. Bioinformatics analysis predicted statistically significant target regulons for 15 of the intergenic transcripts; experimental analysis revealed 4-fold or higher differential expression of 46 novel ncRNAs in different growth media. Out of 329 annotated EHEC ncRNAs, 52 showed an RCV similar to protein-coding genes. Of those, 16 had RIBOseq patterns matching annotated genes in other Enterobacteriaceae, and 11 seem to possess a Shine-Dalgarno sequence, suggesting that such ncRNAs may encode small proteins instead of being solely non-coding. To support that the RIBOseq signals reflect translation, we tested the ribosomal-footprint covered ORF of ryhB and found a phenotype for the encoded peptide under iron-limiting conditions. Conclusion: Determination of the RCV is a useful approach for a rapid first-step differentiation between bacterial ncRNAs and small mRNAs. Further, many known ncRNAs may encode proteins as well. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3586-9) contains supplementary material, which is available to authorized users.
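The RCV idea can be sketched as a simple ratio. This is an illustrative sketch only: the abstract does not state how reads are normalized or where the cutoff lies, so the use of length- and depth-normalized values (e.g. RPKM) and the `threshold` parameter are assumptions, and the function names are hypothetical.

```python
def ribosomal_coverage_value(footprint_level, transcript_level):
    """Ratio of ribosomal footprint signal to transcript signal.

    Assumption: both arguments are normalized read densities
    (for example RPKM) for the same genomic region.
    Returns None when there is no transcript signal to divide by.
    """
    if transcript_level <= 0:
        return None
    return footprint_level / transcript_level

def looks_translated(footprint_level, transcript_level, threshold):
    """True if the RCV reaches `threshold` (a hypothetical cutoff
    that would have to be calibrated on annotated coding genes)."""
    rcv = ribosomal_coverage_value(footprint_level, transcript_level)
    return rcv is not None and rcv >= threshold
```

A region with many RNA-seq reads but few footprints yields a low ratio and would be a candidate ncRNA; a ratio comparable to that of known protein-coding genes would instead suggest translation.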
Transcriptional regulation networks are often modeled as Boolean networks. We discuss certain properties of Boolean functions (BFs) that are considered important in such networks, namely, membership in the classes of unate or canalizing functions. Of further interest is the average sensitivity (AS) of functions. In this article, we discuss several algorithms to test the properties of interest. To test canalizing properties of functions, we apply spectral techniques, which can also be used to characterize the AS of functions as well as the influences of variables in unate BFs. Further, we provide and review upper and lower bounds on the AS of unate BFs based on the spectral representation. Finally, we apply these methods to a transcriptional regulation network of Escherichia coli, which controls central parts of the E. coli metabolism. We find that all functions are unate. In addition, the analysis of the AS of the network reveals an exceptional robustness against transient fluctuations of the binary variables.
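The three properties named above can be checked directly on small functions. The sketch below is a brute-force truth-table version rather than the spectral techniques the abstract describes; it assumes uniformly distributed inputs for the AS, and all function names are hypothetical. It is exponential in n and meant only to make the definitions concrete.

```python
from itertools import product

def truth_table(f, n):
    """Map every input tuple in {0,1}^n to the 0/1 output f(x)."""
    return {x: f(x) for x in product((0, 1), repeat=n)}

def flip(x, i):
    """Return x with bit i inverted."""
    return x[:i] + (1 - x[i],) + x[i + 1:]

def average_sensitivity(f, n):
    """Expected number of single-bit flips that change the output,
    for uniformly distributed inputs."""
    tt = truth_table(f, n)
    return sum(tt[x] != tt[flip(x, i)] for x in tt for i in range(n)) / 2 ** n

def is_unate(f, n):
    """True if f is monotone (increasing or decreasing) in each variable."""
    tt = truth_table(f, n)
    for i in range(n):
        # Output change when x_i goes from 0 to 1, over all other settings.
        diffs = {tt[flip(x, i)] - tt[x] for x in tt if x[i] == 0}
        if 1 in diffs and -1 in diffs:
            return False
    return True

def is_canalizing(f, n):
    """True if fixing some variable to some value forces the output.
    Note: constant functions also pass this check; some definitions
    exclude them."""
    tt = truth_table(f, n)
    return any(len({tt[x] for x in tt if x[i] == a}) == 1
               for i in range(n) for a in (0, 1))

and2 = lambda x: x[0] & x[1]
xor2 = lambda x: x[0] ^ x[1]
```

For example, the two-input AND is both unate and canalizing with an AS of 1, while the two-input XOR is neither and has the maximal AS of 2.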