Background: This paper introduces and applies a genome-wide predictive study to learn a model that predicts, from her SNP profile, whether a new subject will develop breast cancer. Results: We first genotyped 696 female subjects (348 breast cancer cases and 348 apparently healthy controls), predominantly of Caucasian origin from Alberta, Canada, using Affymetrix Human SNP 6.0 arrays. We then applied the EIGENSTRAT population stratification correction method to remove 73 subjects not belonging to the Caucasian population, and filtered out any SNP that had missing calls, whose genotype frequency deviated from Hardy-Weinberg equilibrium, or whose minor allele frequency was below 5%. Finally, we applied a combination of the MeanDiff feature selection method and the KNN learning method to this filtered dataset to produce a breast cancer prediction model. The leave-one-out cross-validation (LOOCV) accuracy of this classifier is 59.55%. Random permutation tests show that this result is significantly better than the baseline accuracy of 51.52%. Sensitivity analysis shows that the classifier is fairly robust to the number of MeanDiff-selected SNPs. External validation on the CGEMS breast cancer dataset, the only other publicly available breast cancer dataset, shows that this combination of MeanDiff and KNN yields a LOOCV accuracy of 60.25%, which is significantly better than its baseline of 50.06%. We then considered a dozen different combinations of feature selection and learning methods, but found that none of them produces a better predictive model than ours.
We also considered various biological feature selection methods, such as selecting SNPs reported in recent genome-wide association studies to be associated with breast cancer, selecting SNPs in genes associated with KEGG cancer pathways, or selecting SNPs associated with breast cancer in the F-SNP database, but again found that none of the resulting models achieved accuracy better than baseline. Conclusions: We anticipate producing more accurate breast cancer prediction models by recruiting more study subjects, providing more accurate labelling of phenotypes (to accommodate the heterogeneity of breast cancer), measuring other genomic alterations such as point mutations and copy number variations, and incorporating non-genetic information about subjects, such as environmental and lifestyle factors.
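For illustration, the MeanDiff-plus-KNN pipeline described above can be sketched as follows. This is a minimal sketch, not the study's implementation: it assumes genotypes encoded as minor-allele counts (0/1/2), Euclidean distance, and a simple majority vote; the encoding, distance metric, and number of neighbors used in the paper are assumptions here.

```python
import numpy as np

def meandiff_rank(X, y, k_features):
    """Rank SNPs by the absolute difference of the mean genotype
    value between cases (y == 1) and controls (y == 0), and return
    the indices of the top k_features SNPs."""
    diff = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return np.argsort(diff)[::-1][:k_features]

def loocv_knn_accuracy(X, y, k_neighbors=3):
    """Leave-one-out cross-validation of a majority-vote KNN:
    each subject is classified by its k nearest remaining subjects."""
    n = len(y)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i
        # Euclidean distance from the held-out subject to all others
        d = np.linalg.norm(X[mask] - X[i], axis=1)
        nearest = np.argsort(d)[:k_neighbors]
        vote = np.round(y[mask][nearest].mean())
        correct += (vote == y[i])
    return correct / n
```

Feature selection would, in a faithful replication, be re-run inside each LOOCV fold to avoid selection bias; the sketch above applies it once for brevity.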
Abstract. The classification of uncertain datasets is an emerging research problem that has recently attracted significant attention. Several attempts to devise classification models from uncertain training data have been proposed, using decision trees, neural networks, and other approaches. Among these, associative classifiers have inspired some of the uncertain classification algorithms, given their promising results on standard datasets. We propose a novel associative classifier for uncertain data. Our method, the Uncertain Associative Classifier (UAC), is efficient and has an effective rule pruning strategy. Our experimental results on real datasets show that, in most cases, UAC achieves better accuracy than state-of-the-art algorithms.
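The abstract does not detail UAC's internals, but associative classification over uncertain data typically builds on the notion of expected support. As a minimal illustration (not UAC itself), assuming each transaction assigns an existence probability to its items and that items are independent:

```python
def expected_support(transactions, itemset):
    """Expected support of an itemset in an uncertain dataset.
    Each transaction is a dict mapping item -> existence probability.
    Under the independence assumption, the probability that a whole
    itemset appears in a transaction is the product of its items'
    probabilities; expected support sums this over all transactions."""
    total = 0.0
    for t in transactions:
        p = 1.0
        for item in itemset:
            p *= t.get(item, 0.0)  # absent item: probability 0
        total += p
    return total
```

Candidate rules of the form itemset → class can then be ranked by expected support and confidence, which is where a pruning strategy such as UAC's would apply.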
Uncertainty in various domains implies the need for data mining techniques and algorithms that can handle uncertain datasets. Many studies on uncertain datasets have focused on modeling, query ranking, frequent pattern discovery, classification, clustering, etc. However, despite the existing need, few studies have considered uncertainty in sequential data. This paper introduces UAprioriAll, a method to mine frequent sequences in the presence of uncertainty in transactions. UAprioriAll scales linearly in time with the size of the dataset.
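To make the setting concrete: a common formulation (not necessarily the one UAprioriAll uses) attaches an existence probability to each item of each transaction and defines a sequence's support as its expected number of occurrences. Assuming independence across transactions, the occurrence probability admits a simple dynamic program; the names and data layout below are illustrative:

```python
def seq_occurrence_prob(pattern, uncertain_seq):
    """Probability that `pattern` (a list of single items) occurs as a
    subsequence of `uncertain_seq` (a list of dicts mapping item ->
    existence probability), with each pattern element matched in a
    distinct transaction, assuming independence across transactions.

    q[j] = P(first j pattern items are embedded in the transactions
    processed so far). Recurrence for transaction t with
    p = P(pattern[j-1] in t):  q[j] <- q[j] + (q[j-1] - q[j]) * p,
    i.e. "already matched" OR "prefix matched earlier AND item in t".
    """
    m = len(pattern)
    q = [1.0] + [0.0] * m          # the empty pattern always matches
    for t in uncertain_seq:
        # update right-to-left so q[j-1] still holds the previous row
        for j in range(m, 0, -1):
            p = t.get(pattern[j - 1], 0.0)
            q[j] = q[j] + (q[j - 1] - q[j]) * p
    return q[m]

def expected_sequence_support(pattern, dataset):
    """Expected support: sum of occurrence probabilities over all
    uncertain sequences in the dataset."""
    return sum(seq_occurrence_prob(pattern, s) for s in dataset)
```

Each transaction is scanned once per pattern element, so evaluating a candidate's support is linear in the total size of the dataset, consistent with the scaling claim above.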
Association rule mining is a particularly well-studied field in data mining, given its importance as a building block in many data analytics tasks. Many studies have focused on efficiency because the data to be mined is typically very large. However, while there are many approaches in the literature, each approach claims to be the fastest for some given dataset. In other words, there is no clear winner. On the other hand, there is a panoply of algorithms and implementations specifically designed for parallel computing. These solutions are typically implementations of sequential algorithms in a multi-processor configuration focusing on load balancing and data partitioning, with each processor running the same implementation on its own partition. The question we ask in this paper is whether there is a means to select the appropriate frequent itemset mining algorithm given a dataset, and whether each processor in a parallel implementation could select its own algorithm for its given partition of the data.