Hyunkyung Kim scite author profile

Objective: Despite existing prognostic markers, breast cancer prognosis remains a difficult subject due to the complex relationships between many contributing factors and survival. This study seeks to integrate multiple clinicopathological and genomic factors with dimensional reduction across machine learning algorithms to compare survival predictions.

Methods: This is a secondary analysis of the data from a prospective cohort study of female patients with breast cancer enrolled in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). We constructed a series of predictive models: ensemble models (Gradient Boosting and Random Forest), support vector machine (SVM), and artificial neural networks (ANN) for 5-year survival based on clinicopathological and gene expression data after K-means clustering with K-nearest-neighbor (KNN) classification. Model performance was evaluated by receiver operating characteristic (ROC) curve, accuracy, and calibration slope (CS). Model stability was assessed over 10 random runs in terms of ROC, accuracy, CS, and variable importance.

Results: The analytic cohort is composed of 1874 patients with breast cancer. Overall, the median age was 62 years; the 5-year survival rate was 75%. ROC and accuracy were not significantly different between models (ROC and accuracy around 0.67 and 0.72 across models, respectively). However, ensemble methods resulted in better fit (CS) with stable measures of variable importance across 10 random training/validation splits. K-means clustering of gene expression profiles on training data points along with KNN classification of validation data points was a robust method of dimensional reduction. Furthermore, the gene expression cluster with the highest mortality risk was an influential factor in model prediction.

Conclusions: Using machine learning methods to construct predictive models for 5-year survival in patients with breast cancer, we demonstrated discrimination ability across models with new insight into the stability and utility of dimensional reduction on genomic features in breast cancer survival prediction.
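
The dimensional reduction step described in this abstract (K-means on training samples, then KNN assignment of validation samples to the learned clusters) might look roughly like the sketch below. It is not the authors' code: the synthetic data, the function names `kmeans` and `knn_predict`, and the choices of two clusters and five neighbors are all assumptions made for illustration.

```python
import numpy as np


def _nearest(X, centroids):
    """Index of the closest centroid (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels) for the rows of X."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _nearest(X, centroids)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, _nearest(X, centroids)


def knn_predict(X_train, y_train, X_new, n_neighbors=5):
    """Assign each row of X_new the majority label of its nearest training rows."""
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :n_neighbors]
    votes = y_train[nearest]  # shape: (n_new, n_neighbors)
    return np.array([np.bincount(v).argmax() for v in votes])


# Synthetic stand-in for gene expression profiles (hypothetical data):
# two well-separated groups of samples with 5 features each.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(10, 1, (40, 5))])
valid = np.vstack([rng.normal(0, 1, (10, 5)), rng.normal(10, 1, (10, 5))])

# Clusters are learned from training samples only.
centroids, train_clusters = kmeans(train, k=2)

# Validation samples are mapped onto those clusters by KNN.
valid_clusters = knn_predict(train, train_clusters, valid)
```

The cluster label can then stand in for the full expression profile as a single categorical feature in a downstream survival model. Learning the clusters from training samples only keeps validation samples out of the clustering step.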

High-throughput genetic clustering of type 2 diabetes loci reveals heterogeneous mechanistic pathways of metabolic disease

Kim, Westerman, Smith, et al. 2022

Diabetologia

Aims/hypothesis: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways.

Methods: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance.

Results: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes.

Conclusions/interpretation: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.
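
To give a feel for the factorisation step, the sketch below factors a small non-negative variant × trait matrix into variant weights and trait weights. It deliberately swaps in plain NMF with multiplicative updates in place of the Bayesian variant (bNMF) named in the abstract, so it does not choose the number of clusters automatically. The toy 30 × 8 matrix, the fixed choice of three components, and the function name `nmf` are assumptions for illustration and are not taken from the authors' software.

```python
import numpy as np


def nmf(V, n_components, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (rows x cols) as W @ H.

    Uses multiplicative updates for the squared-error objective.
    W has shape (rows, n_components); H has shape (n_components, cols).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, n_components)) + eps
    H = rng.random((n_components, n_cols)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy stand-in for a variant x trait association matrix (hypothetical data):
# 30 variants, 8 traits, generated from 3 underlying components.
rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 8))

W, H = nmf(V, n_components=3)

# Relative reconstruction error of the factorisation.
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# One simple way to read off a cluster per variant: its largest weight in W.
cluster_of_variant = W.argmax(axis=1)
```

Rows of `H` show which traits define each component, and rows of `W` show how strongly each variant loads on it. A hard `argmax` assignment is only one convention; weights can also be kept continuous, for example when building cluster-specific scores.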

Anterior Segment Parameters Using Pentacam and Prediction of Corneal Endothelial Cell Loss after Cataract Surgery

Cho, Chang, et al. 2010

Korean J Ophthalmol

Purpose: We evaluated various preoperative anterior segment parameters measured with a Pentacam rotating Scheimpflug camera and compared them with those of conventional methods. We also evaluated the effect of different parameters on corneal endothelial cells after cataract surgery.

Methods: Pentacam examination was performed in 88 eyes from 88 patients to evaluate central anterior chamber depth (ACDpentacam), nuclear density (Densitometrypentacam), anterior chamber volume (ACV), and lens thickness (LTpentacam). We compared values of ACDpentacam with those of ultrasound (ACDsono) and also compared Densitometrypentacam values with those of Lens Opacities Classification System (LOCS III) classification. We evaluated the effect of the following preoperative values measured with Pentacam on postoperative endothelial cell loss: pupil size measured both preoperatively and before capsulorhexis (PupilCCC), amount of viscoelastics, and LT measured by ultrasound (LTsono).

Results: A significant concordance was found between the two grading methods of nuclear opacity: Densitometrypentacam and LOCS III classification (τb = 0.414, p = 0.000). We also found a positive correlation between ACDpentacam and ACDsono (r = 0.823, p = 0.000) and between ACDpentacam and ACV (r = 0.650, p = 0.000). There were significant differences between the results of LTpentacam and LTsono. The final regression model identified Densitometrypentacam, viscoelastics and PupilCCC as independent predictors of decreased postoperative corneal endothelial cell density (CD) at postoperative day 3, and Densitometrypentacam, viscoelastics, and ACV as independent predictors of decreased CD two months postoperatively (p<0.05).

Conclusions: Good agreement was found between all results obtained with the Pentacam and conventional methods except LT. Analyzing anterior chamber parameters preoperatively using Pentacam could be helpful to predict postoperative endothelial cell loss.

Machine Learning With K-Means Dimensional Reduction for Predicting Survival Outcomes in Patients With Breast Cancer
