Treatment planning plays an important role in radiotherapy (RT). The quality of the treatment plan directly and significantly affects patient treatment outcomes. In the past decades, advances in computing and software have promoted the development of RT treatment planning systems with sophisticated dose calculation and optimization algorithms. Treatment planners now have greater flexibility in designing highly complex RT treatment plans that better spare healthy tissues while maximizing radiation dose to tumor targets. Nevertheless, treatment planning is still largely a time-inefficient and labor-intensive process in current clinical practice. Artificial intelligence, including machine learning (ML) and deep learning (DL), has recently been used to automate RT treatment planning and has gained enormous attention in the RT community owing to its great promise for improving treatment planning quality and efficiency. In this article, we review the historical advancement, strengths, and weaknesses of various DL-based automated RT treatment planning techniques. We also discuss the challenges, issues, and potential research directions of DL-based automated RT treatment planning techniques.
Purpose: To investigate the role of different multi-organ omics-based prediction models for pre-treatment prediction of Adaptive Radiotherapy (ART) eligibility in patients with nasopharyngeal carcinoma (NPC). Methods and Materials: Pre-treatment contrast-enhanced computed tomographic and magnetic resonance images, radiotherapy dose, and contour data of 135 NPC patients treated at Hong Kong Queen Elizabeth Hospital were retrospectively analyzed for extraction of multi-omics features, namely Radiomics (R), Morphology (M), Dosiomics (D), and Contouromics (C), from a total of eight organ structures. During model development, the patient cohort was divided into a training set and a hold-out test set in a ratio of 7 to 3 via 20 iterations. Four single-omics models (R, M, D, C) and four multi-omics models (RD, RC, RM, RMDC) were developed on the training data using the Ridge and Multi-Kernel Learning (MKL) algorithms, respectively, under 10-fold cross-validation, and evaluated on hold-out test data using the average area under the receiver-operating-characteristic curve (AUC). The best-performing single-omics model was first determined by comparing the AUC distribution across the 20 iterations among the four single-omics models using a two-sided Student's t-test; it was then retrained using the MKL algorithm for a fair comparison with the four multi-omics models. Results: The R model significantly outperformed the other three single-omics models (all p-values < 0.0001), achieving an average AUC of 0.942 (95%CI: 0.938-0.946) and 0.918 (95%CI: 0.903-0.933) in the training and hold-out test sets, respectively. When trained with MKL, the R model (R_MKL) yielded an increased AUC of 0.984 (95%CI: 0.981-0.988) and 0.927 (95%CI: 0.905-0.948) in the training and hold-out test sets, respectively, while demonstrating no significant difference compared with any of the studied multi-omics models in the hold-out test sets. Intriguingly, Radiomic features accounted for the majority of the final selected features, ranging from 64% to 94%, in all the studied multi-omics models. Conclusions: Among all the studied models, the Radiomic model was found to play a dominant role in predicting ART eligibility in NPC patients, and Radiomic features accounted for the largest proportion of features in all the multi-omics models.
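The evaluation scheme described in this abstract (repeated 7:3 splits, AUC on each hold-out set, then an average with a 95% CI) can be sketched in plain Python. This is a minimal illustration only, not the authors' pipeline: the function names, the unstratified shuffle, and the normal-approximation confidence interval are all assumptions, and the Ridge and MKL model fitting is left out.

```python
import math
import random
import statistics


def auc(labels, scores):
    """Rank-based AUC: chance that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def split_70_30(n_patients, seed):
    """Shuffle patient indices and cut them 70/30.

    Assumption: a plain unstratified shuffle; the abstract does not say
    whether the splits were stratified.
    """
    idx = list(range(n_patients))
    random.Random(seed).shuffle(idx)
    cut = int(0.7 * n_patients)
    return idx[:cut], idx[cut:]


def mean_with_ci(values, z=1.96):
    """Mean and a normal-approximation 95% CI across iterations (assumed form)."""
    m = statistics.mean(values)
    half = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, m - half, m + half
```

In use, one would call `split_70_30(135, seed)` for 20 different seeds, fit a model on each training split, collect `auc(...)` on each hold-out split, and pass the 20 values to `mean_with_ci`.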
Radiomic model reliability is a central premise for its clinical translation. Presently, it is assessed using test–retest or external data, which, unfortunately, are often scarce in reality. Therefore, we aimed to develop a novel image perturbation-based method (IPBM), the first of its kind, toward building a reliable radiomic model. We first developed a radiomic prognostic model for head-and-neck cancer patients on a training cohort (70%) and evaluated it on a testing cohort (30%) using the C-index. Subsequently, we applied the IPBM to CT images of both cohorts (Perturbed-Train and Perturbed-Test cohorts) to generate 60 additional samples for each cohort. Model reliability was assessed using the intra-class correlation coefficient (ICC) to quantify consistency of the C-index among the 60 samples in the Perturbed-Train and Perturbed-Test cohorts. In addition, we re-trained the radiomic model using reliable radiomic features (RFs) exclusively (ICC > 0.75) to validate the IPBM. Results showed moderate model reliability in the Perturbed-Train (ICC: 0.565, 95%CI 0.518–0.615) and Perturbed-Test (ICC: 0.596, 95%CI 0.527–0.670) cohorts. An enhanced reliability of the re-trained model was observed in the Perturbed-Train (ICC: 0.782, 95%CI 0.759–0.815) and Perturbed-Test (ICC: 0.825, 95%CI 0.782–0.867) cohorts, indicating validity of the IPBM. To conclude, we demonstrated the capability of the IPBM toward building reliable radiomic models, providing the community with a novel model reliability assessment strategy prior to prospective evaluation.
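The C-index used above to score the prognostic model can be illustrated with a short sketch of Harrell's concordance index for right-censored data. The abstract does not state which C-index variant or software was used, so the function below is an assumption for illustration; it ignores pairs with tied follow-up times for simplicity.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only when the shorter time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully inverted ranking. Under the scheme in the abstract, this score would be recomputed on each of the 60 perturbed samples before assessing its consistency.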
Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients’ anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches, for feature extraction and selection during feature engineering and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Background: Using highly robust radiomic features in modeling is recommended, yet its impact on the radiomic model is unclear. This study evaluated the radiomic model's robustness and generalizability after screening out low-robust features before radiomic modeling. The results were validated with four datasets and two clinically relevant tasks. Materials and methods: A total of 1,419 head-and-neck cancer patients' computed tomography images, gross tumor volume segmentations, and clinically relevant outcomes (distant metastasis and local-regional recurrence) were collected from four publicly available datasets. The perturbation method was implemented to simulate images, and the radiomic feature robustness was quantified using the intra-class correlation coefficient (ICC). Three radiomic models were built using all features (ICC > 0), good-robust features (ICC > 0.75), and excellent-robust features (ICC > 0.95), respectively. A filter-based feature selection and Ridge classification method were used to construct the radiomic models. Model performance was assessed in terms of both robustness and generalizability. The robustness of the model was evaluated by the ICC, and the generalizability of the model was quantified by the train-test difference of the Area Under the Receiver Operating Characteristic Curve (AUC). Results: The average model robustness ICC improved significantly from 0.65 to 0.78 (P < 0.0001) using good-robust features and to 0.91 (P < 0.0001) using excellent-robust features. Model generalizability also showed a substantial increase, as a closer gap between training and testing AUC was observed: the mean train-test AUC difference was reduced from 0.21 to 0.18 (P < 0.001) with good-robust features and to 0.12 (P < 0.0001) with excellent-robust features. Furthermore, good-robust features yielded the best average AUC in the unseen datasets of 0.58 (P < 0.001) over four datasets and clinical outcomes. Conclusions: Including only robust features in radiomic modeling significantly improves model robustness and generalizability in unseen datasets. Yet, the robustness of the radiomic model has to be verified even when it is built with robust radiomic features, and tightly restricted feature robustness may prevent optimal model performance in the unseen dataset, as it may lower the discrimination power of the model.
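The feature-screening step in this abstract (compute an ICC per feature across perturbed images, then keep features above 0.75 or 0.95) can be sketched as follows. The abstract does not say which ICC form was used, so the one-way random-effects ICC(1,1) here is an assumption, as are the function names and the dictionary layout.

```python
def icc_oneway(matrix):
    """One-way random-effects ICC(1,1) (assumed form).

    matrix: rows are patients, columns are repeated measurements of one
    feature (for example, values extracted from perturbed images).
    """
    n = len(matrix)
    k = len(matrix[0])
    grand = sum(sum(row) for row in matrix) / (n * k)
    row_means = [sum(row) / k for row in matrix]
    # Between-patient and within-patient mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(matrix, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def select_robust(feature_values, threshold):
    """Return names of features whose ICC exceeds the threshold.

    feature_values: dict mapping a (hypothetical) feature name to its
    patients-by-perturbations matrix.
    """
    return [
        name
        for name, mat in feature_values.items()
        if icc_oneway(mat) > threshold
    ]
```

Calling `select_robust(features, 0.75)` or `select_robust(features, 0.95)` would reproduce the good-robust and excellent-robust subsets in spirit. A feature that is constant across all patients and perturbations gives a zero denominator and would need separate handling.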
Purpose: To evaluate the effectiveness of features obtained from our proposed incremental-dose-interval-based lung subregion segmentation (IDLSS) for predicting grade ≥ 2 acute radiation pneumonitis (ARP) in lung cancer patients upon intensity-modulated radiotherapy (IMRT). (1) Materials and Methods: A total of 126 non-small-cell lung cancer patients treated with IMRT were retrospectively analyzed. Five lung subregions (SRs) were generated by the intersection of the whole lung (WL) and five sub-regions receiving incremental dose intervals. A total of 4610 radiomics features (RF) from pre-treatment planning computed tomography (CT) images and 213 dosiomics features (DF) were extracted. Six feature groups, including WL-RF, WL-DF, SR-RF, SR-DF, and the combined feature sets of WL-RDF and SR-RDF, were generated. Features were selected by using a variance threshold, followed by a Student t-test. Pearson's correlation test was applied to remove redundant features. Subsequently, Ridge regression was adopted to develop six models for ARP using the six feature groups. Thirty iterations of resampling were implemented to assess overall model performance by using the area under the Receiver-Operating-Characteristic curve (AUC), accuracy, precision, recall, and F1-score. (2) Results: The SR-RDF model achieved the best classification performance and provided significantly better predictability than the WL-RDF model in the training cohort (Average AUC: 0.98 ± 0.01 vs. 0.90 ± 0.02, p < 0.001) and the testing cohort (Average AUC: 0.88 ± 0.05 vs. 0.80 ± 0.04, p < 0.001). Similarly, predictability of the SR-DF model was significantly stronger than that of the WL-DF model in the training cohort (Average AUC: 0.88 ± 0.03 vs. 0.70 ± 0.03, p < 0.001) and in the testing cohort (Average AUC: 0.74 ± 0.08 vs. 0.65 ± 0.06, p < 0.001). By contrast, the SR-RF model significantly outperformed the WL-RF model only in the training set (Average AUC: 0.93 ± 0.02 vs. 0.85 ± 0.03, p < 0.001), but not in the testing set (Average AUC: 0.79 ± 0.05 vs. 0.77 ± 0.07, p = 0.13). (3) Conclusions: Our results demonstrated that the IDLSS method improved model performance for classifying ARP with grade ≥ 2 when using dosiomics or combined radiomics-dosiomics features.
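The subregion construction described in this abstract, intersecting the whole-lung mask with regions receiving successive dose intervals, can be sketched on flattened voxel lists. The abstract does not give the dose bounds, so the `edges` argument and the example values mentioned below are hypothetical, and the function name is an assumption rather than the authors' IDLSS code.

```python
def dose_subregions(lung_mask, dose, edges):
    """Split a lung mask into subregions by dose interval.

    lung_mask : flat list of booleans, True where the voxel is inside the lung
    dose      : flat list of dose values (Gy), same length as lung_mask
    edges     : ascending interval bounds (hypothetical values);
                subregion i covers edges[i] <= dose < edges[i + 1]
    Returns one boolean mask per interval.
    """
    if len(lung_mask) != len(dose):
        raise ValueError("mask and dose must have the same number of voxels")
    regions = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Intersection of the whole lung with the voxels in this dose interval.
        regions.append(
            [in_lung and lo <= d < hi for in_lung, d in zip(lung_mask, dose)]
        )
    return regions
```

With six bounds such as `[0, 5, 10, 20, 30, float("inf")]` (illustrative only), the function returns five masks, and every lung voxel falls into exactly one of them. Radiomics and dosiomics features would then be extracted per mask.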