Universal correction of enzymatic sequence bias reveals molecular signatures of protein/DNA interactions

Martins, André L.; Walavalkar, Ninad M.; Anderson, W. T.; Zang, Chongzhi; Guertin, Michael J.

doi:10.1101/104364

Cited by 2 publications

(2 citation statements)

References 35 publications


“…The Tn5 transposase also has a sequence bias, which would need to be corrected before an effective footprint analysis could occur. Several methods exist (45,46) that could correct the sequence bias from an ATAC-seq library. HMMRATAC could be used downstream of such sequence corrections, to reduce bias in peak calling, or upstream of the corrections, to identify the reads within peaks that need to be corrected for footprint or other downstream analysis.…”

Section: Discussion (mentioning)

confidence: 99%

HMMRATAC: a Hidden Markov ModeleR for ATAC-seq

Tarbell

Liu

2018

Preprint


ATAC-seq has been widely adopted to identify accessible chromatin regions across the genome. However, current data analysis still utilizes approaches initially designed for ChIP-seq or DNase-seq, without taking into account the transposase digested DNA fragments that contain additional nucleosome positioning information. We present the first dedicated ATAC-seq analysis tool, a semi-supervised machine learning approach named HMMRATAC. HMMRATAC splits a single ATAC-seq dataset into nucleosome-free and nucleosome-enriched signals, learns the unique chromatin structure around accessible regions, and then predicts accessible regions across the entire genome. We show that HMMRATAC outperforms the popular peak-calling algorithms on published human and mouse ATAC-seq datasets.


“…We strand-separated the aligned BAM files using . We used (Martins et al, 2018) to simultaneously shift the alignments to represent the 3′ end of the RNA and convert the BAM to the bigWig format. We merged the strand-specific bigWig files from all replicates using the UCSC Genome Browser Utilities (Kent et al, 2010).…”

Section: Supplemental (mentioning)

confidence: 99%

Defining data-driven primary transcript annotations with primaryTranscriptAnnotation in R

Anderson

Duarte

Civelek

et al. 2019

Preprint

Self Cite


Nascent transcript measurements derived from run-on sequencing experiments are critical for the investigation of transcriptional mechanisms and regulatory networks. However, conventional gene annotations specify the boundaries of mRNAs, which significantly differ from the boundaries of primary transcripts. Moreover, transcript isoforms with distinct transcription start and end coordinates can vary between cell types. Therefore, new primary transcript annotations are needed to accurately interpret run-on data. We developed the primaryTranscriptAnnotation R package to infer the transcriptional start and termination sites of annotated genes from genomic run-on data. We then used these inferred coordinates to annotate transcriptional units identified de novo. Hence, this package provides the novel utility to integrate data-driven primary transcript annotations with transcriptional unit coordinates identified in an unbiased manner. Our analyses demonstrated that this new methodology increases the sensitivity for detecting differentially expressed transcripts and provides more accurate quantification of RNA polymerase pause indices, consistent with the importance of using accurate primary transcript coordinates for interpreting genomic nascent transcription data.

Availability: https://github.com/WarrenDavidAnderson/genomicsRpackage/tree/master/primaryTranscriptAnnotation

Keywords: PRO-seq/GRO-seq analysis | transcript annotation | RNA Polymerase Pausing

Correspondence: warrena@virginia.edu

