2000
DOI: 10.1086/302954

Data Mining Applied to Linkage Disequilibrium Mapping

Abstract: We introduce a new method for linkage disequilibrium mapping: haplotype pattern mining (HPM). The method, inspired by data mining methods, is based on discovery of recurrent patterns. We define a class of useful haplotype patterns in genetic case-control data and use the algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association with the phenotype, and all haplotypes exceeding a given threshold level are used for prediction of disease susceptibility-gene lo…
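
The pattern-ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration only: haplotypes are tuples of allele labels, a gap in a pattern is written as `None`, the association measure is a chi-square statistic on a 2x2 case/control table (the excerpt above does not name the measure), and the function names `matches`, `score_pattern`, and `rank_patterns` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: a pattern has one entry per marker; None marks a gap (wildcard).
Pattern = Tuple[Optional[str], ...]
Haplotype = Sequence[str]


def matches(pattern: Pattern, haplotype: Haplotype) -> bool:
    """True if every non-gap position of the pattern equals the haplotype's allele.

    Assumes pattern and haplotype cover the same markers in the same order.
    """
    return all(p is None or p == allele for p, allele in zip(pattern, haplotype))


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def score_pattern(pattern: Pattern,
                  cases: Sequence[Haplotype],
                  controls: Sequence[Haplotype]) -> float:
    """Association score of one pattern (assumed measure: chi-square)."""
    a = sum(matches(pattern, h) for h in cases)     # cases carrying the pattern
    b = len(cases) - a                              # cases without it
    c = sum(matches(pattern, h) for h in controls)  # controls carrying the pattern
    d = len(controls) - c                           # controls without it
    return chi_square_2x2(a, b, c, d)


def rank_patterns(patterns: Sequence[Pattern],
                  cases: Sequence[Haplotype],
                  controls: Sequence[Haplotype],
                  threshold: float) -> List[Tuple[float, Pattern]]:
    """Keep patterns whose score exceeds the threshold, strongest first."""
    scored = [(score_pattern(p, cases, controls), p) for p in patterns]
    kept = [sp for sp in scored if sp[0] > threshold]
    return sorted(kept, key=lambda sp: sp[0], reverse=True)
```

The sketch covers only the scoring and thresholding of candidate patterns. How candidate patterns are generated, and how the retained patterns are turned into a location prediction, is not shown in the excerpt and is left out here.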

citations
Cited by 105 publications
(103 citation statements)
references
References 26 publications
(25 reference statements)
“…For haplotype analysis we used the HPM method, which is powerful in locating a disease-causing gene when one is known to exist [22]. By allowing gaps in haplotype patterns, it is also more robust for genotyping errors, marker mutations, unrecognized recombinations, and missing data than the haplotype analysis we used in our first study [18,22].…”
Section: Discussion
mentioning
confidence: 99%
“…By allowing gaps in haplotype patterns, it is also more robust for genotyping errors, marker mutations, unrecognized recombinations, and missing data than the haplotype analysis we used in our first study [18,22]. Also, the HPM parameters can be optimized for each data set by studying individual features of genotype data such as marker information, marker density, and missing genotypes.…”
Section: Discussion
mentioning
confidence: 99%
“…The overall time complexity of the algorithm is O(MN^2), where M is the total number of marker loci and N is the sample size, which is around hundreds in many real data sets. Preliminary experimental results on the simulated data given in Reference [185], and on an HLA real data set [190] with known disease gene location for type 1 diabetes, show that this method can predict gene location with high accuracy and its performance increases with denser markers. Phase information can also be obtained from the genotypes of family members, though this does not usually completely resolve phase ambiguity [48,52,191,192].…”
mentioning
confidence: 98%
“…The effect of violations of these assumptions is unpredictable in general. A non-parametric method called HPM (haplotype pattern mining) [185], inspired by data mining methods, has been proposed to identify disease associated haplotype patterns from case-control data. HPM does not require any assumptions about the inheritance pattern and has good localization power, even when the number of phenocopies, i.e.…”
mentioning
confidence: 99%