2009 IEEE International Conference on Bioinformatics and Biomedicine 2009
DOI: 10.1109/bibm.2009.89

Differential Predictive Modeling for Racial Disparities in Breast Cancer

Abstract: The primary objective of disparities research is to model the differences across multiple groups and identify the groups that behave significantly differently from each other. Independently generating decision trees for different subsets of the data does not allow us to study the impact of the various attributes on these different subgroups. We propose a novel technique for inducing similar decision trees for different subpopulations and also develop a new distance metric between two decision trees which…

Citations: cited by 3 publications (4 citation statements)
References: 13 publications
“…The binary datasets are represented by triplet (dataset, attributes, instances). The UCI datasets used are (blood, 5, 748), (liver, 6, 345), (diabetes, 8, 768), (gamma, 11, 19020), and (heart, 22, 267). Synthetic datasets used in our work have 500,000 to 1 million tuples.…”
Section: Results (mentioning)
confidence: 99%
“…(1) Dataset Distribution Differences - Despite the importance of the problem, only a small amount of work is available in describing the differences between two data distributions. Earlier approaches for measuring the deviation between two datasets used simple data statistics after decomposing the feature space into smaller regions using tree based models [22,12]. However, the final result obtained is a data-dependent measure and do not give any understanding about the features responsible for measuring that difference.…”
Section: Related Work (mentioning)
confidence: 99%
“…Haller et al. (2012); Gene expression classification using SVM, Vanitha et al. (2015); Ortholog detection in yeast species using imbalanced classification approaches including SVM, Galpert et al. (2015); Evolutionary feature selection using SVM and other techniques using MapReduce, Peralta et al. (2015); Genomic feature learning using SVM-recursive feature elimination algorithm, Anaissi et al. (2016). Decision trees: Employing decision tree learning for processing of large datasets, Hall et al. (1998); RainForest, a framework supporting construction of fast decision tree for classification of large datasets, Johannes Gehrke et al. (2000); Predictive decision tree model for analysing racial disparities in breast cancer, Palit et al. (2009); A streaming parallel decision tree algorithm for classification of large-scale datasets and streaming data…”
Section: Logistic Regression (mentioning)
confidence: 99%