Background Predictions in pregnancy care are complex because of interactions among multiple factors. Hence, pregnancy outcomes are not easily predicted by a single predictor using only one algorithm or modeling method. Objective This study aims to review and compare the predictive performance of logistic regression (LR) and other machine learning algorithms for developing or validating a multivariable prognostic prediction model for pregnancy care to inform clinicians’ decision making. Methods Research articles from MEDLINE, Scopus, Web of Science, and Google Scholar were reviewed following several guidelines for a prognostic prediction study, including a risk of bias (ROB) assessment. We report the results based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Studies were primarily framed as PICOTS (population, index, comparator, outcomes, timing, and setting): Population: men or women in procreative management, pregnant women, and fetuses or newborns; Index: multivariable prognostic prediction models using non-LR algorithms for risk classification to inform clinicians’ decision making; Comparator: the models applying an LR; Outcomes: pregnancy-related outcomes of procreation or pregnancy outcomes for pregnant women and fetuses or newborns; Timing: pre-, inter-, and peripregnancy periods (predictors), at the pregnancy, delivery, and either puerperal or neonatal period (outcome), and either short- or long-term prognoses (time interval); and Setting: primary care or hospital. The results were synthesized by reporting study characteristics and ROBs and by random effects modeling of the difference of the logit area under the receiver operating characteristic curve (AUROC) of each non-LR model compared with the LR model for the same pregnancy outcomes. We also reported between-study heterogeneity by using τ² and I².
Results Of the 2093 records, we included 142 studies for the systematic review and 62 studies for a meta-analysis. Most prediction models used LR (92/142, 64.8%) and artificial neural networks (20/142, 14.1%) among non-LR algorithms. Only 16.9% (24/142) of studies had a low ROB. A total of 2 non-LR algorithms from low ROB studies significantly outperformed LR. The first algorithm was a random forest for preterm delivery (logit AUROC 2.51, 95% CI 1.49-3.53; I²=86%; τ²=0.77) and pre-eclampsia (logit AUROC 1.2, 95% CI 0.72-1.67; I²=75%; τ²=0.09). The second algorithm was gradient boosting for cesarean section (logit AUROC 2.26, 95% CI 1.39-3.13; I²=75%; τ²=0.43) and gestational diabetes (logit AUROC 1.03, 95% CI 0.69-1.37; I²=83%; τ²=0.07). Conclusions Prediction models with the best performances across studies were not necessarily those that used LR; models that used random forest and gradient boosting also performed well. We recommend reanalyzing existing LR models for several pregnancy outcomes by comparing them with those algorithms while applying standard guidelines. Trial Registration PROSPERO (International Prospective Register of Systematic Reviews) CRD42019136106; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=136106
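The synthesis described in this abstract (random effects pooling of differences in logit AUROC, with τ² and I² for heterogeneity) can be illustrated with a minimal sketch. The abstract does not name the estimator, so the DerSimonian-Laird method used here is an assumption, and the three studies and their variances are hypothetical values, not data from the review.

```python
import math


def logit(p):
    """Log-odds transform, mapping an AUROC in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))


def random_effects(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird estimator (assumed).

    Returns (pooled effect, 95% CI, tau^2, I^2 in percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add the between-study variance to each study
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2


# Hypothetical studies: (AUROC of non-LR model, AUROC of LR model,
# variance of the logit difference). None of these come from the review.
studies = [(0.85, 0.78, 0.04), (0.90, 0.80, 0.06), (0.74, 0.73, 0.03)]
effects = [logit(a) - logit(b) for a, b, _ in studies]
variances = [v for _, _, v in studies]

pooled, ci, tau2, i2 = random_effects(effects, variances)
print(f"pooled logit difference {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
print(f"tau^2 = {tau2:.3f}, I^2 = {i2:.0f}%")
```

A pooled difference whose confidence interval excludes zero would indicate that the non-LR algorithm outperformed LR for that outcome under these assumptions.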
Background: We developed and validated an artificial intelligence (AI)-assisted prediction model for preeclampsia using a nationwide health insurance dataset in Indonesia. Methods: The BPJS Kesehatan dataset was preprocessed using a nested case-control design into women with preeclampsia/eclampsia (n = 3318) and normotensive pregnant women (n = 19,883), drawn from all women with one pregnancy. The dataset provided 95 features consisting of demographic variables and medical histories, covering the period from 24 months before the event up to delivery, which was defined as the event. Six algorithms were compared by area under the receiver operating characteristic curve (AUROC), with a subgroup analysis by time to the event. We compared our model to similar prediction models from systematically reviewed studies. In addition, we conducted a text mining analysis based on natural language processing techniques to interpret our modeling results. Findings: The best model consisted of 17 predictors extracted by a random forest algorithm. The period 9-12 months before the event gave the best AUROC in external validation by either geographical split (0.88, 95% confidence interval (CI) 0.88-0.89) or temporal split (0.86, 95% CI 0.85-0.86). We compared this model to prediction models in seven studies from 869 records in PUBMED, EMBASE, and SCOPUS. This model outperformed the previous models in terms of precision, sensitivity, and specificity in all validation sets. Interpretation: Our low-cost model improved preliminary prediction for deciding which pregnant women should go on to be assessed by models with high specificity and advanced predictors.
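The AUROC values with 95% CIs reported above can be illustrated with a minimal sketch. The abstract does not state how its intervals were obtained, so the percentile bootstrap here is an assumption, and the labels and scores are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation). Labels are 0/1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for the AUROC (an assumed CI method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    if not stats:
        raise ValueError("no valid bootstrap resample")
    stats.sort()
    lo = stats[int(0.025 * (len(stats) - 1))]
    hi = stats[int(0.975 * (len(stats) - 1))]
    return lo, hi


# Hypothetical predicted risks for 4 controls (0) and 4 cases (1)
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.30, 0.35, 0.60, 0.40, 0.55, 0.70, 0.90]
print(auroc(labels, scores), bootstrap_ci(labels, scores))
```

For a geographical or temporal split as described in the abstract, the same function would be applied to the scores of the held-out region or period only.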
Background Preeclampsia and intrauterine growth restriction are placental dysfunction–related disorders (PDDs) that require a referral decision to be made within a certain time period. An appropriate prediction model should be developed for these diseases. However, previous models did not demonstrate robust performances and/or were developed from datasets with highly imbalanced classes. Objective In this study, we developed a predictive model of PDDs by machine learning that uses features at 24-37 weeks’ gestation, including maternal characteristics, uterine artery (UtA) Doppler measures, soluble fms-like tyrosine kinase receptor-1 (sFlt-1), and placental growth factor (PlGF). Methods A public dataset was taken from a prospective cohort study that included pregnant women with PDDs (66/95, 69%) and a control group (29/95, 31%). Preliminary selection of features was based on a statistical analysis using SAS 9.4 (SAS Institute). We used Weka (Waikato Environment for Knowledge Analysis) 3.8.3 (The University of Waikato, Hamilton, NZ) to automatically select the best model using its optimization algorithm. We also manually selected the best of 23 white-box models. Models, including those from recent studies, were also compared by interval estimation of evaluation metrics. We used the Matthews correlation coefficient (MCC) as the main metric because it is not overoptimistic when evaluating the performance of a prediction model developed from a dataset with a class imbalance. Repeated 10-fold cross-validation was applied. Results The classification via regression model was chosen as the best model. Our model had a robust MCC (.93, 95% CI .87-1.00, vs .64, 95% CI .57-.71) and specificity (100%, 95% CI 100-100, vs 90%, 95% CI 90-90) compared to each metric of the best models from recent studies. The sensitivity of this model was not inferior (95%, 95% CI 91-100, vs 100%, 95% CI 92-100).
The area under the receiver operating characteristic curve was also competitive (0.970, 95% CI 0.966-0.974, vs 0.987, 95% CI 0.980-0.994). Features in the best model were maternal weight, BMI, pulsatility index of the UtA, sFlt-1, and PlGF. The most important feature was the sFlt-1/PlGF ratio. This model used an M5P algorithm consisting of a decision tree and four linear models with different thresholds. Our study also compared favorably with the best of the recent studies in terms of class balance and the size of the case class (66/95, 69%, vs 27/239, 11.3%). Conclusions Our model had a robust predictive performance. It was also developed to deal with the problem of a class imbalance. In the context of clinical management, this model may reduce maternal mortality and neonatal morbidity and lower health care costs.
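The reason for preferring the MCC under class imbalance can be seen in a minimal sketch. It uses the 27/239 case ratio quoted above, but the classifier that labels every woman as a control is a hypothetical illustration, not a model from any of the studies.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from a 2x2 confusion matrix.

    Returns 0.0 when a margin of the matrix is empty, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical classifier that predicts "control" for everyone in a
# dataset with 27 cases among 239 women (class ratio taken from the text).
tp, fp, tn, fn = 0, 0, 212, 27
print(f"accuracy = {accuracy(tp, fp, tn, fn):.3f}")  # high despite finding no cases
print(f"MCC      = {mcc(tp, fp, tn, fn):.3f}")       # no better than chance
```

Accuracy rewards the trivial classifier for the size of the majority class, whereas the MCC uses all four cells of the confusion matrix and stays at zero.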
We aimed to provide a framework that organizes the internal properties of a convolutional neural network (CNN) model using non-image data so that they are interpretable by humans. The interface is represented as an ontology map and a network, derived by dimensional reduction and hierarchical clustering techniques, respectively. The framework applies to implementing a prediction model that either classifies a categorical outcome or estimates a numerical outcome, including but not limited to models using data from electronic health records. This pipeline harnesses CNN algorithms for non-image data while improving the depth of interpretability through a data-driven ontology. However, the deep-insight visible neural network (DI-VNN) is intended only for exploration beyond its predictive ability; such exploration requires further explanatory studies and a human user with specific competences in medicine, statistics, and machine learning to explore the DI-VNN with high confidence. The key stages consist of data preprocessing, differential analysis, feature mapping, network architecture construction, model training and validation, and exploratory analysis.
We aimed to provide a resampling protocol for dimensional reduction that yields a few latent variables. Its applicability focuses on, but is not limited to, developing a machine learning prediction model, where it improves the sample size relative to the number of candidate predictors. With this feature representation technique, one can improve generalization by preventing the latent variables from overfitting the data used to conduct the dimensional reduction. However, this technique may require more computational capacity and time. The key stages consist of deriving latent variables from multiple resampling subsets, estimating the parameters of the latent variables in the population, and selecting the latent variables transformed by the estimated parameters.
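One possible reading of the three stages can be sketched as follows. The text does not specify the dimensional reduction method or how the population parameters are estimated, so principal component analysis (first component only, by power iteration), bootstrap resampling, and averaging of loadings are all assumptions made for illustration; the function names and the data are hypothetical.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]


def first_component(rows, rng, iters=200):
    """Loadings of the first principal component, found by power iteration."""
    cov = covariance(rows)
    d = len(cov)
    v = [rng.random() + 0.5 for _ in range(d)]  # arbitrary positive start
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return None  # degenerate subset with no variance
        v = [x / norm for x in w]
    return v


def resampled_loadings(rows, n_resamples=50, seed=0):
    """Stages 1 and 2 (assumed form): derive loadings on bootstrap subsets,
    then estimate one set of loadings by averaging them."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    total, reference = [0.0] * d, None
    for _ in range(n_resamples):
        subset = [rows[rng.randrange(n)] for _ in range(n)]
        v = first_component(subset, rng)
        if v is None:
            continue
        if reference is None:
            reference = v
        elif sum(a * b for a, b in zip(v, reference)) < 0:
            v = [-x for x in v]  # a component's sign is arbitrary; align it
        total = [t + x for t, x in zip(total, v)]
    norm = math.sqrt(sum(t * t for t in total))
    if norm == 0.0:
        raise ValueError("no usable resample")
    return [t / norm for t in total]


def transform(rows, loadings):
    """Stage 3 (assumed form): project centered data onto the loadings."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(loadings))]
    return [sum(l * (r[j] - means[j]) for j, l in enumerate(loadings))
            for r in rows]


# Hypothetical data: the second predictor is twice the first
rows = [(float(i), 2.0 * i) for i in range(1, 11)]
loadings = resampled_loadings(rows, n_resamples=20, seed=1)
print(loadings, transform(rows, loadings)[:3])
```

Because the loadings are estimated across many subsets rather than fitted once to the full development data, the resulting latent variable depends less on any single sample, which is the generalization benefit the text describes; the repeated fitting is also where the extra computation comes from.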
Prognostic prediction of prelabor rupture of membranes (PROM) lacks adequate sample sizes and external validation. We compared a statistical model, machine learning algorithms, and a deep-insight visible neural network (DI-VNN) for predicting PROM and estimating the time of delivery. We retrospectively selected visits, including those with PROM (n=23,791/170,730), from a nationwide health insurance dataset. DI-VNN achieved the best prediction (area under receiver operating characteristic curve [AUROC] 0.73, 95% CI 0.72 to 0.75). Meanwhile, random forest using principal components achieved the best estimation, with root mean squared errors of ±2.2 and ±2.6 weeks for the predicted event and nonevent, respectively. DI-VNN outperformed previous models on an external validation set, including one using a biomarker (AUROC 0.641; n=1,177). We deployed our models as a web application requiring diagnosis/procedure codes and dates. In conclusion, our models may be used on their own in low-resource settings or as a preliminary model to reduce the use of a specific test that requires a high-resource setting.