2020
DOI: 10.1186/s12859-020-03589-0

imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters

Abstract: Background: The imputation of genotypes increases the power of genome-wide association studies. However, imputation quality should be assessed in each particular case. Not all imputation software controls the error of its output; for example, the latest release of the fastPHASE program (1.4.8) lacks such an option. With this software there is also uncertainty in choosing the model parameters. fastPHASE is based on haplotype clusters, whose size should be set a priori. The parameter inf…

Cited by 11 publications (8 citation statements)
References 18 publications
“…The R software (Version 4.0.5) was applied for data processing and DEGs identification. The Linear Models for Microarray Analysis (LIMMA) package and impute package were used to preprocess the 3 raw datasets, matching probe ID and gene names, removing missing values, normalizing data, and performing log2 conversions (16,17). The LIMMA package was used to merge the 3 datasets into an integrated dataset.…”
Section: Data Processing and Identification of DEGs (mentioning)
confidence: 99%
“…The hapFLK package v1.2 was used to detect selection signatures based on differences in haplotype frequencies between all the Merino-derived breeds included in this study (Fariello et al, 2014). The number of haplotype clusters (K) was calculated using the imputeqc R package and accompanied scripts (Khvorykh and Khrunin, 2020). Using the number of haplotype clusters, the hapFLK values and the kinship matrix were calculated in the fastPHASE model (-K 40).…”
Section: Methods (mentioning)
confidence: 99%
“…hapFLK takes the number of cluster of haplotypes as a parameter (K). To determine the number of clusters of haplotypes K, we ran FASTPHASE v1.4 (Scheet & Stephens 2006) and R package IMPUTEQ (Khvorykh & Khrunin 2020) on polymorphism data for the largest chromosome BCIN01. For each population, we used IMPUTEQ to generate five datasets with 10% of polymorphic positions masked, and for each masked dataset we used FASTPHASE for imputing masked positions assuming clusters of K=2,3…10 haplotypes and the following parameters: -T10 -C25 -H-1 -n -Z.…”
Section: Methods (mentioning)
confidence: 99%
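
The masking step described in the statement above (hide a fraction of known genotypes, impute them, then compare against the hidden truth) can be illustrated with a minimal Python sketch. This is not the imputeqc or fastPHASE implementation: the function names, the `?` missing-value code, the toy data, and the per-site most-common-call imputer are all assumptions made for illustration. In the workflow the statement describes, fastPHASE would be run at this step once for each candidate K.

```python
import random
from collections import Counter

MISSING = "?"  # assumed missing-genotype code for this sketch


def mask_genotypes(genotypes, fraction=0.10, seed=0):
    """Return a masked copy and the list of hidden (sample, site) positions."""
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(genotypes)
             for j, call in enumerate(row) if call != MISSING]
    hidden = rng.sample(known, round(len(known) * fraction))
    masked = [list(row) for row in genotypes]
    for i, j in hidden:
        masked[i][j] = MISSING
    return masked, hidden


def impute_by_site_mode(masked):
    """Stand-in imputer (hypothetical): fill gaps with the most common call per site."""
    filled = [list(row) for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] != MISSING]
        mode = Counter(observed).most_common(1)[0][0] if observed else MISSING
        for row in filled:
            if row[j] == MISSING:
                row[j] = mode
    return filled


def discordance(truth, imputed, hidden):
    """Fraction of hidden positions whose imputed call differs from the truth."""
    if not hidden:
        return 0.0
    wrong = sum(truth[i][j] != imputed[i][j] for i, j in hidden)
    return wrong / len(hidden)


# Toy data: 20 samples x 10 sites, genotypes coded as "0", "1", "2".
data_rng = random.Random(42)
genotypes = [[data_rng.choice("012") for _ in range(10)] for _ in range(20)]

# Five masked replicates with 10% of calls hidden, as in the statement above.
errors = []
for replicate in range(5):
    masked, hidden = mask_genotypes(genotypes, fraction=0.10, seed=replicate)
    imputed = impute_by_site_mode(masked)
    errors.append(discordance(genotypes, imputed, hidden))

mean_error = sum(errors) / len(errors)
print(f"mean discordance over 5 masked replicates: {mean_error:.3f}")
```

Repeating the loop for each candidate K with a real imputer and comparing the mean discordance values is one plausible way to compare settings; the exact error measure used by imputeqc is not stated in the text above.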