Most models for predicting malignant pancreatic intraductal papillary mucinous neoplasms were developed based on logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and to compare their performances. This was a multinational, multi-institutional, retrospective study. The clinical variables considered for model development (MD) were age, sex, main duct diameter, cyst size, mural nodule, and tumour location. After division into an MD set and a test set (2:1), the best ML and LR models were developed by training on the MD set with tenfold cross-validation. The test areas under the receiver operating characteristic curves (AUCs) of the two models were calculated using the independent test set. A total of 3,708 patients were included. Over 200 repetitions, the stacked ensemble algorithm was the most frequently chosen ML model, and the variable combination containing all variables was the most frequently chosen LR model. The mean AUCs of the ML and LR models over the 200 repetitions were comparable (0.725 vs. 0.725). The performances of the ML and LR models were therefore comparable, but the LR model was more practical than its ML counterpart because of its convenience in clinical use and simple interpretability.
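As a rough illustration of two steps named in this abstract, the sketch below computes an AUC directly from its rank-based definition and draws a random 2:1 split of patient indices. It is a minimal sketch, not the authors' pipeline: the function names `auc` and `split_2_to_1`, the seed, and the toy labels and scores are all assumptions made for illustration, and no model fitting is shown.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_2_to_1(n, seed):
    """Shuffle indices 0..n-1 and return (development, test) index lists
    in a 2:1 ratio. The seed is a hypothetical choice for reproducibility."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = (2 * n) // 3
    return idx[:cut], idx[cut:]


# Toy example (made-up labels and scores, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)

# Split sizes for a cohort of 3,708 patients.
md_idx, test_idx = split_2_to_1(3708, seed=0)
```

Repeating the split with different seeds and averaging the resulting test AUCs would mirror the repeated evaluation described above, under the assumption that a fitted model supplies the scores.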
Background: Although we previously proposed a nomogram to predict malignancy in intraductal papillary mucinous neoplasms (IPMN) and validated it in an external cohort, its application is challenging without data on tumor markers. Moreover, existing nomograms have not been compared. This study aimed to develop a nomogram based on radiologic findings and to compare its performance with the previously proposed American and Korean/Japanese nomograms. Methods: We recruited 3708 patients who underwent surgical resection at 31 tertiary institutions in eight countries; patients with a main pancreatic duct >10 mm were excluded. To construct the nomogram, 2606 patients were randomly allocated 1:1 into training and internal validation sets, and the area under the receiver operating characteristic curve (AUC) was calculated using 10‐fold cross-validation with an exhaustive search. The nomogram was then validated and compared with the American and Korean/Japanese nomograms using 1102 patients. Results: Among the 2606 patients, 90 had main‐duct type, 900 had branch‐duct type, and 1616 had mixed‐type IPMN. Pathologic results revealed 1628 cases of low‐grade dysplasia, 476 of high‐grade dysplasia, and 502 of invasive carcinoma. Location, cyst size, duct dilatation, and mural nodule were selected to construct the nomogram. The AUC of this nomogram was higher than that of the American nomogram (0.691 vs 0.664, P = .014) and comparable with that of the Korean/Japanese nomogram (0.659 vs 0.653, P = .255). Conclusions: A novel nomogram based on radiologic findings of IPMN is competitive for predicting the risk of malignancy. This nomogram would be clinically helpful in circumstances where tumor markers are not available. The nomogram is freely available at http://statgen.snu.ac.kr/software/nomogramIPMN.
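The "10‐fold cross-validation with an exhaustive search" step can be pictured as scoring every non-empty subset of candidate variables and keeping the best one. The sketch below is only an illustration under stated assumptions: the candidate list reuses the four variables the abstract reports as selected (the full candidate pool is not given), the helper names are hypothetical, and `cv_score` stands in for whatever cross-validated AUC routine is actually used.

```python
from itertools import combinations

# Assumption: illustrative candidates only; the abstract does not list
# the full set of variables that were searched.
CANDIDATES = ["location", "cyst_size", "duct_dilatation", "mural_nodule"]


def all_variable_subsets(variables):
    """Return every non-empty combination of the given variables."""
    subsets = []
    for k in range(1, len(variables) + 1):
        subsets.extend(combinations(variables, k))
    return subsets


def kfold_indices(n, k=10):
    """Partition indices 0..n-1 into k folds of near-equal size.
    (Shuffling is omitted here to keep the sketch short.)"""
    return [list(range(i, n, k)) for i in range(k)]


def exhaustive_search(variables, cv_score):
    """Pick the subset with the highest score. `cv_score` is a
    caller-supplied function (hypothetical here) that maps a tuple of
    variable names to a cross-validated AUC."""
    return max(all_variable_subsets(variables), key=cv_score)


subsets = all_variable_subsets(CANDIDATES)
folds = kfold_indices(1303, k=10)  # 1303 = half of the 2606 patients
```

With four candidates there are 2^4 - 1 = 15 subsets to score, so an exhaustive search is cheap at this scale; the cost doubles with each added variable.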
This paper presents a new statistical method for clustering step data, a popular form of health record data easily obtained from wearable devices. Since step data are high-dimensional and zero-inflated, classical methods such as K-means and partitioning around medoids (PAM) cannot be applied directly. The proposed method combines newly constructed variables that reflect the inherent features of step data, such as quantity, strength, and pattern, with a multivariate functional principal component analysis that integrates all of these features for clustering. The method is implemented by applying a conventional clustering method such as K-means or PAM to the multivariate functional principal component scores obtained from these variables. Simulation studies and real data analysis demonstrate significant improvement in clustering quality.