2020
DOI: 10.1186/s12859-020-03865-z

A novel computational strategy for DNA methylation imputation using mixture regression model (MRM)

Abstract: Background: DNA methylation is an important heritable epigenetic mark that plays a crucial role in transcriptional regulation and the pathogenesis of various human disorders. The commonly used DNA methylation measurement approaches, e.g., Illumina Infinium HumanMethylation-27 and -450 BeadChip arrays (27 K and 450 K arrays) and reduced representation bisulfite sequencing (RRBS), only cover a small proportion of the total CpG sites in the human genome, which considerably limited the scope of the …

Cited by 13 publications (10 citation statements)
References 57 publications
“…Imputation is a technique where statistical inferences, assuming similar patterns are represented across samples, can be made on unobserved data points, such as CpG sites. The mixture regression model (99) is one imputation method that has been demonstrated to recover methylation data, achieving a correlation rate of 80% when up to 80% of the methylation data points have been deleted. Combining whole-genome bisulfite sequencing data from a subsample with microarray data of the wider sample as an input for the algorithm increases the prediction scope, while the cost of analysis is reduced.…”
Section: Results (mentioning)
confidence: 99%
“…Depending on the type of input, the methods for methylation prediction at individual base resolution can be generally classified into three categories. The first category includes methods that predict from coarse profiles obtained with MeDIP-Seq and Methylation-sensitive Restriction Enzyme sequencing (MRE-Seq) (Stevens et al, 2013), or methylation state of neighboring CpGs and methylation profile of other (related) samples (Ma et al, 2014; Kapourani and Sanguinetti, 2019; Yu et al, 2020; Tang et al, 2021), or additionally with the help from profiles for other epigenetic markers, such as histone modifications (Ernst and Kellis, 2015; Zou et al, 2018). Due to the availability of large amount of data for training, the most popularly used machine learning algorithm by these approaches is ensemble trees, either random forest or gradient boosting machines.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…However, when CpG sites on a chromosome are close in distance, this independence assumption can hardly hold; see, for example, Bell et al (2011), Eckhardt et al (2006). The correlation of CpG sites in DNAm has been utilized in DNAm prediction (Zhang et al (2015)) and imputation (Yu et al (2020)). To take into account correlations between CpG sites, this study employed the multivariate generalized beta distribution to model the blockwise correlation structure among CpG sites.…”
Section: Introductionmentioning
confidence: 99%