Summary: Due to the rapidly increasing scale and diversity of epigenomic data, modular and scalable analysis workflows are of wide interest. Here we present snakePipes, a workflow package for processing and downstream analysis of data from common epigenomic assays: ChIP-seq, RNA-seq, Bisulfite-seq, ATAC-seq, Hi-C and single-cell RNA-seq. snakePipes enables users to assemble variants of each workflow and to easily install and upgrade the underlying tools, via its simple command-line wrappers and YAML files.

Availability and implementation: snakePipes can be installed via conda: `conda install -c mpi-ie -c bioconda -c conda-forge snakePipes`. Source code (https://github.com/maxplanck-ie/snakepipes) and documentation (https://snakepipes.readthedocs.io/en/latest/) are available online.

Supplementary information: Supplementary data are available at Bioinformatics online.
Highlights
- Drosophila ELAV regulates all sites of neuronal alternative polyadenylation in vivo
- ELAV directly binds to sites of neuron-specific splicing and 3′ end processing
- ELAV represses inclusion of an fne mini-exon that mediates FNE nuclear localization
- In ELAV's absence, FNE rescues neuronal alternative polyadenylation and splicing
During embryogenesis, the genome shifts from transcriptionally quiescent to extensively active in a process known as Zygotic Genome Activation (ZGA). In Drosophila, the pioneer factor Zelda is known to be essential for the progression of development, yet it regulates the activation of only a small subset of genes at ZGA. Thousands of genes do not require Zelda, suggesting that other mechanisms exist. By conducting GRO-seq, Hi-C and ChIP-seq in Drosophila embryos, we demonstrate that up to 65% of zygotically activated genes are enriched for the histone variant H2A.Z. H2A.Z enrichment precedes ZGA and RNA Polymerase II loading onto chromatin. In vivo knockdown of maternally contributed Domino, a histone chaperone and ATPase, reduces H2A.Z deposition at transcription start sites, causes global downregulation of housekeeping genes at ZGA, and compromises the establishment of the 3D chromatin structure. We infer that H2A.Z is essential for the de novo establishment of transcriptional programs during ZGA via chromatin reorganization.
The scale and diversity of epigenomics data have been rapidly increasing, and ever more studies now present analyses of data from multiple epigenomic techniques. Performing such integrative analysis is time-consuming, especially for exploratory research, since there are currently no pipelines available that allow fast processing of datasets from multiple epigenomic assays while also allowing for flexibility in running or upgrading the workflows. Here we present a solution to this problem: snakePipes, which can process and perform downstream analysis of data from all common epigenomic techniques (ChIP-seq, RNA-seq, Bisulfite-seq, ATAC-seq, Hi-C and single-cell RNA-seq) in a single package. We demonstrate how snakePipes can simplify integrative analysis by reproducing and extending the results from a recently published large-scale epigenomics study with a few simple commands. snakePipes is available under an open-source license at https://github.com/maxplanck-ie/snakepipes.

Main

Epigenomics is a fast-growing field, and due to the consistent fall in the price of sequencing, increase in multiplexing abilities, and multiple innovations in laboratory protocols, it has become increasingly convenient to perform multiple epigenomic assays within a project. However, a major bottleneck in processing and analysing these data reproducibly, particularly for novice analysts, is the limited availability of analysis pipelines. Next-generation sequencing (NGS) analysis pipelines are composed of a series of data processing steps, employ standardised processing parameters, and are usually scalable to large numbers of samples [1]. Due to such properties, most pipelines are currently developed and deployed for settings where standardised, large-scale analysis is required. Examples are RNA-seq variant-calling pipelines deployed in clinical settings [2], or processing pipelines developed for large-scale consortia [3,4].

Figure S1. Analysis of de-repressed genes upon Smchd1 knock-out. A. PCA output from snakePipes suggested that one knock-out sample (replicate 1) behaves differently. This sample was later revealed to be the XO clone, which had lost its inactive X chromosome. The sample was removed for DESeq2 analysis and the workflow was re-run. B. Volcano plot for DESeq2 output from snakePipes (knock-out replicate 1 excluded) shows an increase in up-regulated genes, indicating de-repression upon knock-out. C. Wild-type ATAC-seq signal on UP, DOWN and unchanged (NONE) genes. Gene lists were extracted from the DESeq2 output of the RNA-seq workflow, and depth-normalized bigwigs from the ATAC-seq workflow were used for plotting. D. Wild-type methylation level reported by the WGBS workflow on UP, DOWN and unchanged (NONE) genes. (TSS = Transcription Start Site)