The secretion of biomolecules into the extracellular milieu is a common and well-conserved phenomenon in biology. In bacteria, secreted biomolecules are not only involved in intra-species communication but they also play roles in inter-kingdom exchanges and pathogenicity. To date, released products, such as small molecules, DNA, peptides, and proteins, have been well studied in bacteria. However, the bacterial extracellular RNA complement has so far not been comprehensively characterized. Here, we have analyzed, using a combination of physical characterization and high-throughput sequencing, the extracellular RNA complement of both outer membrane vesicle (OMV)-associated and OMV-free RNA of the enteric Gram-negative model bacterium Escherichia coli K-12 substrain MG1655 and have compared it to its intracellular RNA complement. Our results demonstrate that a large part of the extracellular RNA complement is in the size range between 15 and 40 nucleotides and is derived from specific intracellular RNAs. Furthermore, RNA is associated with OMVs and the relative abundances of RNA biotypes in the intracellular, OMV and OMV-free fractions are distinct. Apart from rRNA fragments, a significant portion of the extracellular RNA complement is composed of specific cleavage products of functionally important structural noncoding RNAs, including tRNAs, 4.5S RNA, 6S RNA, and tmRNA. In addition, the extracellular RNA pool includes RNA biotypes from cryptic prophages, intergenic, and coding regions, of which some are so far uncharacterised, for example, transcripts mapping to the fimA-fimL and ves-spy intergenic regions. Our study provides the first detailed characterization of the extracellular RNA complement of the enteric model bacterium E. coli. Analogous to findings in eukaryotes, our results suggest the selective export of specific RNA biotypes by E. coli, which in turn indicates a potential role for extracellular bacterial RNAs in intercellular communication.
The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved collection of tutorials. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at https://training.galaxyproject.org.
A new assessment criterion for docking poses is proposed in which experimental electron density is taken into account when evaluating the ability of docking programs to reproduce experimentally observed binding modes. Three docking programs (Gold, Glide, and Fred) were used to generate poses for a set of 88 protein-ligand complexes for which the crystal structure is known. The new criterion is based on the real space R-factor (RSR), which measures how well a group of atoms (in our case, the ligand) fits the experimental electron density by comparing that density to the expected density, calculated from the model (i.e., the predicted ligand pose). The RSR-based measure is compared to the traditional criterion, the root-mean-square distance (RMSD) between the docking pose and the binding configuration in the crystallographic model. The results highlight several shortcomings of the RMSD criterion that do not affect the RSR-based measure. Examples illustrate that the RSR-derived approach allows a more meaningful a posteriori assessment of docking methods and results. Practical implications for docking evaluations and for methodological development work in this field are discussed.
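The two criteria contrasted in the abstract above can be illustrated with a minimal sketch. The abstract does not state either formula, so the code assumes the commonly used definitions: RMSD over matched atom coordinates, and RSR as the sum of absolute differences between observed and calculated density divided by the sum of their absolute sums, taken over grid points around the ligand. The function names `rmsd` and `real_space_r` are hypothetical, and the density values are assumed to be already sampled on a common grid; how the paper derives its RSR-based measure from these quantities is not shown here.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square distance between two poses.

    Both arguments are sequences of (x, y, z) tuples with atoms in the
    same order (an assumption; no symmetry correction is attempted).
    """
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, r in zip(pose, reference)
        for a, b in zip(p, r)
    )
    return math.sqrt(squared / len(pose))


def real_space_r(rho_obs, rho_calc):
    """Real space R-factor over density values sampled on the same grid.

    Assumed definition: sum|obs - calc| / sum|obs + calc|.
    Lower values mean the model density fits the observed density better.
    """
    if len(rho_obs) != len(rho_calc) or not rho_obs:
        raise ValueError("density samples must be non-empty and of equal length")
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    if denominator == 0:
        raise ValueError("density sums to zero; RSR is undefined")
    return numerator / denominator


# Toy values, invented purely for illustration.
shifted_pose = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(shifted_pose, crystal_pose))
print(real_space_r([2.0, 2.0], [1.0, 3.0]))
```

RMSD compares a pose only against the deposited model coordinates, whereas RSR compares it against the density samples themselves, which is the distinction the abstract draws.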
DNA methylation is one of the main epigenetic modifications in the eukaryotic genome; it has been shown to play a role in cell-type specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation and differential methylation methods. Finally, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and we discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis.
Myeloid cells such as resident retinal microglia (MG) or infiltrating blood‐derived macrophages (Mϕ) accumulate in areas of retinal ischemia and neovascularization (RNV) and modulate neovascular eye disease. Their temporospatial distribution and biological function in this process, however, remain unclear. Using state‐of‐the‐art methods, including cell‐specific reporter mice and high‐throughput RNA sequencing (RNA Seq), this study determined the extent of MG proliferation and Mϕ infiltration in areas with retinal ischemia and RNV in Cx3cr1CreERT2:Rosa26‐tdTomato mice and examined the transcriptional profile of MG in the mouse model of oxygen‐induced retinopathy (OIR). For RNA Seq, tdTomato‐positive retinal MG were sorted by flow cytometry, followed by Gene Ontology (GO) cluster analysis. Furthermore, intraperitoneal injections of the cell proliferation marker 5‐ethynyl‐2′‐deoxyuridine (EdU) were performed from postnatal day (p) 12 to p16. We found that MG are the predominant myeloid cell population, whereas Mϕ rarely appear in areas of RNV. Thirty percent of retinal MG in areas of RNV were EdU‐positive, indicating considerable local MG cell expansion. GO cluster analysis revealed an enrichment of clusters related to cell division, tubulin binding, ATPase activity, protein kinase regulatory activity, and chemokine receptor binding in MG in the OIR model compared to untreated controls. In conclusion, activated retinal MG alter their transcriptional profile, exhibit considerable proliferative ability, and are by far the most frequent myeloid cell population in areas of ischemia and RNV in the OIR model, thus presenting a potential target for future therapeutic approaches.
Background: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq. Results: Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints. Conclusions: We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq. Electronic supplementary material: The online version of this article (10.1186/s13059-019-1654-y) contains supplementary material, which is available to authorized users.
RNA-binding proteins (RBPs) control and coordinate each stage in the life cycle of RNAs. Although in vivo binding sites of RBPs can now be determined genome-wide, most studies have focused on individual RBPs. Here, we examined a large compendium of 114 high-quality transcriptome-wide in vivo RBP–RNA cross-linking interaction datasets generated by the same protocol in the same cell line and representing 64 distinct RBPs. A comparative analysis of categories of target RNA binding preference, sequence preference, and transcript region specificity identified potential post-transcriptional regulatory modules, i.e., specific combinations of RBPs that bind to specific sets of RNAs and targeted regions. These regulatory modules represented functionally related proteins and exhibited distinct differences in RNA metabolism, expression variance, and subcellular localization. This integrative investigation of experimental RBP–RNA interaction evidence and RBP regulatory function in a human cell line will be a valuable resource for understanding the complexity of post-transcriptional regulation.
Background: Sequencing-based analyses of low-biomass samples are known to be prone to misinterpretation due to the potential presence of contaminating molecules derived from laboratory reagents and environments. DNA contamination has been previously reported, yet contamination with RNA is usually considered to be very unlikely due to its inherent instability. Small RNAs (sRNAs) identified in tissues and bodily fluids, such as blood plasma, have implications for physiology and pathology, and therefore the potential to act as disease biomarkers. Thus, the possibility for RNA contaminants demands careful evaluation. Results: Herein, we report on the presence of small RNA (sRNA) contaminants in widely used microRNA extraction kits and propose an approach for their depletion. We sequenced sRNAs extracted from human plasma samples and detected important levels of non-human (exogenous) sequences whose source could be traced to the microRNA extraction columns through a careful qPCR-based analysis of several laboratory reagents. Furthermore, we also detected the presence of artefactual sequences related to these contaminants in a range of published datasets, thereby arguing in particular for a re-evaluation of reports suggesting the presence of exogenous RNAs of microbial and dietary origin in blood plasma. To avoid artefacts in future experiments, we also devise several protocols for the removal of contaminant RNAs, define minimal amounts of starting material for artefact-free analyses, and confirm the reduction of contaminant levels for identification of bona fide sequences using ‘ultra-clean’ extraction kits. Conclusion: This is the first report on the presence of RNA molecules as contaminants in RNA extraction kits. The described protocols should be applied in the future to avoid confounding sRNA studies. Electronic supplementary material: The online version of this article (10.1186/s12915-018-0522-7) contains supplementary material, which is available to authorized users.