Detection of DNA cytosine modifications such as 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) is essential for understanding the epigenetic changes that guide development, cellular lineage specification, and disease. The wide variety of approaches available to interrogate these modifications has created a need for harmonized materials, methods, and rigorous benchmarking to improve genome-wide methylome sequencing applications in clinical and basic research. We present a multi-platform assessment and a global resource for epigenetics research from the FDA’s Epigenomics Quality Control (EpiQC) Group. The study design leverages seven human cell lines that are publicly available from the National Institute of Standards and Technology (NIST) and the Genome in a Bottle (GIAB) consortium. These genomes were subjected to a variety of genome-wide methylation interrogation approaches across six independent laboratories. Our primary focus was on cytosine modifications found in mammalian genomes (5mC, 5hmC). Each sample was processed in two or more technical replicates by three whole-genome bisulfite sequencing (WGBS) protocols (TruSeq DNA methylation, Accel-NGS, SPLAT), oxidative bisulfite sequencing (oxBS), Enzymatic Methyl-seq (EM-seq), Illumina EPIC targeted-methylation sequencing, and ATAC-seq. Each library was sequenced to high coverage on an Illumina NovaSeq 6000. The data were subjected to rigorous quality assessment and subsequently compared to Illumina EPIC methylation microarrays. We provide a wide range of sequence data for commonly used genomics reference materials, as well as best practices for epigenomics research. These findings can serve as a guide for researchers to enable epigenomic analysis of cellular identity in development, health, and disease.
The Illumina HiSeq X platform has helped to reduce the cost of whole genome sequencing substantially, but its application for bisulphite sequencing is not straightforward. We describe the optimization of a library preparation and sequencing approach that maximizes the yield and quality of sequencing, and the elimination of a previously unrecognized artefact affecting several percent of bisulphite sequencing reads.
While the comprehensive representation of the majority of the genome by whole genome bisulphite sequencing (WGBS) makes it the optimal assay for testing DNA methylation [1], its cost has until now been prohibitive for many projects. The release of the Illumina HiSeq X reduced the cost of whole genome sequencing (WGS) substantially, prompting us to develop a new protocol based on this instrument to reduce the cost of WGBS to a comparable extent.
The cost efficiency of the X system depends in part upon the libraries having large inserts that allow 150 bp paired end sequencing to work effectively without a high fraction of overlapping reads. This is a practical problem when using DNA treated with sodium bisulphite, which has a degradative effect. Illumina provides the TruSeq DNA Methylation Kit for WGBS, which uses post-bisulphite adaptor tagging (PBAT) [2], and recommends using 75 bp paired end sequencing, suitable for the shorter fragments from PBAT libraries. Apart from the insert size issue, the use of patterned flow cells and different base calling software on the X system makes a transition from earlier technologies potentially problematic, requiring the optimization of both library preparation and sequencing, as we describe below.
We developed a new transposase-based approach that we call BS (bisulphite)-tagging, illustrated in Figure 1.
In its use of transposases, the assay resembles prior tagmentation-based bisulphite library preparation assays [3,4], while the incorporation of 5mC for end repair is comparable with the T-WGBS approach used for very low input DNA amounts [5], and creates an initial normal complexity sequence before reading into the (G+C)-depleted bisulphite-converted insert. Unlike T-WGBS, the extra expenses of a modified transposase and pre-methylated oligonucleotides are not required for ...
There are three types of duplicate sequences that can be generated using the X system. The first, specific to the X system because of its patterned flow cells, occurs when a library fragment occupying one well migrates or jumps into an adjacent well, referred to as a proximal duplicate. The second is the PCR duplicate, in which the same library fragment is amplified and sequenced in different wells, while the third is the separate amplification of each of the two complementary strands of DNA (complementary strand duplicate)....
bioRxiv preprint first posted online Sep. 24, 2017; doi: http://dx.doi.org/10.1101/193193. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
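The three duplicate classes described above (proximal, PCR, and complementary strand) can be told apart, in principle, from alignment coordinates, strand of origin, and flow cell position. The following is a minimal Python sketch of that logic, not the authors' pipeline. The `Read` record, the function names, and the `PROXIMAL_MAX_DISTANCE` threshold of 2500 coordinate units are all hypothetical choices made for illustration, and the sketch assumes that reads sharing identical fragment coordinates derive from the same original molecule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical minimal record for one aligned read pair (fragment)."""
    name: str
    chrom: str
    start: int    # leftmost aligned coordinate of the fragment
    end: int      # rightmost aligned coordinate of the fragment
    strand: str   # '+' or '-': genomic strand the fragment derives from
    tile: int     # flow cell tile
    x: int        # well x coordinate on the tile
    y: int        # well y coordinate on the tile


# Assumed threshold; a real analysis would need to tune this per instrument.
PROXIMAL_MAX_DISTANCE = 2500


def is_proximal(a: Read, b: Read, max_dist: int = PROXIMAL_MAX_DISTANCE) -> bool:
    """True if both reads sit on the same tile within max_dist on each axis."""
    return (
        a.tile == b.tile
        and abs(a.x - b.x) <= max_dist
        and abs(a.y - b.y) <= max_dist
    )


def classify_duplicates(reads: list[Read]) -> dict[str, str]:
    """Label every read that duplicates an earlier read at the same coordinates.

    The first read seen at each (chrom, start, end) position is treated as the
    representative and receives no label.
    """
    groups: dict[tuple[str, int, int], list[Read]] = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.start, read.end)].append(read)

    labels: dict[str, str] = {}
    for members in groups.values():
        for i, read in enumerate(members[1:], start=1):
            same_strand = [e for e in members[:i] if e.strand == read.strand]
            if any(is_proximal(read, e) for e in same_strand):
                # Same molecule, neighbouring well on the patterned flow cell.
                labels[read.name] = "proximal"
            elif same_strand:
                # Same molecule and strand, but sequenced in a distant well.
                labels[read.name] = "pcr"
            else:
                # Only opposite-strand copies seen so far at this position.
                labels[read.name] = "complementary_strand"
    return labels
```

Calling `classify_duplicates` on a list of `Read` objects returns a dictionary mapping the name of each duplicate read to its class. Reads that are unique at their position, or that are the first seen at it, are absent from the result.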