Organ volume measurements are a key metric for managing autosomal dominant polycystic kidney disease (ADPKD), the most common inherited renal disease. However, measuring organ volumes is tedious and requires manually contouring organ outlines on multiple cross-sectional MRI or CT images. Automating kidney contouring with deep learning has been proposed because its errors are small compared with manual contouring. Here, a deployed open-source deep learning ADPKD kidney segmentation pipeline is extended to also measure liver and spleen volumes, which are also important. This 2D U-net deep learning approach was developed with radiologist-labeled T2-weighted images from 215 ADPKD subjects (70% training = 151, 30% validation = 64). Additional ADPKD subjects were used for prospective (n = 30) and external (n = 30) validation, for a total of 275 subjects. Image cropping previously optimized for kidneys was included in training but removed for validation and inference to accommodate the liver, which lies closer to the image border. An effective algorithm was developed to adjudicate overlap voxels that are labeled as more than one organ. Left kidney, right kidney, liver and spleen labels had average errors of 3%, 7%, 3% and 1%, respectively, on external validation and 5%, 6%, 5% and 1% on prospective validation. Dice scores also showed that the deep learning model was close to the radiologist contouring, measuring 0.98, 0.96, 0.97 and 0.96 on external validation and 0.96, 0.96, 0.96 and 0.95 on prospective validation for left kidney, right kidney, liver and spleen, respectively. Manual correction of deep learning segmentation errors took only 19:17 min compared with 33:04 min for fully manual segmentation, a 42% time saving (p = 0.004). The standard deviation of model-assisted segmentations was reduced to 7, 5, 11 and 5 mL for right kidney, left kidney, liver and spleen, respectively, from 14, 10, 55 and 14 mL for manual segmentations. Thus, deep learning reduces the radiologist time required to perform multiorgan segmentations in ADPKD and reduces measurement variability.
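The agreement and time-saving figures quoted above can be illustrated with a minimal Python sketch. It assumes binary masks passed as flat sequences of 0/1 values and a percent error taken relative to the reference volume; the abstract does not give its exact definitions, and the function names here are hypothetical, not part of the published pipeline.

```python
def dice_score(pred, ref):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        # Both masks empty: treat as perfect agreement (an assumption).
        return 1.0
    return 2.0 * overlap / total


def percent_volume_error(model_ml, reference_ml):
    """Absolute volume error as a percentage of the reference volume (assumed definition)."""
    return abs(model_ml - reference_ml) / reference_ml * 100.0


def mmss_to_seconds(text):
    """Convert a 'mm:ss' string such as '19:17' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_time_saving(assisted, manual):
    """Relative time saved by the assisted workflow, in percent of the manual time."""
    assisted_s = mmss_to_seconds(assisted)
    manual_s = mmss_to_seconds(manual)
    return (manual_s - assisted_s) / manual_s * 100.0


# Times reported in the abstract: 19:17 min (model-assisted) vs. 33:04 min (manual).
saving = percent_time_saving("19:17", "33:04")
print(round(saving))  # 42
```

With the reported times, the calculation gives (1984 s − 1157 s) / 1984 s ≈ 41.7%, which rounds to the 42% stated in the abstract.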
Background: Total kidney volume (TKV) is an important biomarker for assessing kidney function, especially for autosomal dominant polycystic kidney disease (ADPKD). However, TKV measurements from a single MRI pulse sequence have limited reproducibility, ± ~5%, similar to ADPKD annual kidney growth rates.
Purpose: To improve TKV measurement reproducibility on MRI by extending artificial intelligence algorithms to automatically segment kidneys on T1‐weighted, T2‐weighted, and steady state free precession (SSFP) sequences in axial and coronal planes and averaging measurements.
Study Type: Retrospective training, prospective testing.
Subjects: Three hundred ninety‐seven patients (356 with ADPKD, 41 without), 75% for training and 25% for validation, 40 ADPKD patients for testing and 17 ADPKD patients for assessing reproducibility.
Field Strength/Sequence: T2‐weighted single‐shot fast spin echo (T2), SSFP, and T1‐weighted 3D spoiled gradient echo (T1) at 1.5 and 3T.
Assessment: A 2D U‐net segmentation algorithm was trained on images from all sequences. Five observers independently measured each kidney volume manually on axial T2 and using model‐assisted segmentations on all sequences and image plane orientations for two MRI exams in two sessions separated by 1–3 weeks to assess reproducibility. Manual and model‐assisted segmentation times were recorded.
Statistical Tests: Bland–Altman, Shapiro–Wilk (normality assessment), Pearson's chi‐squared (categorical variables); Dice similarity coefficient, interclass correlation coefficient, and concordance correlation coefficient for analyzing TKV reproducibility. P‐value < 0.05 was considered statistically significant.
Results: In 17 ADPKD subjects, model‐assisted segmentations of axial T2 images were significantly faster than manual segmentations (2:49 minutes vs. 11:34 minutes), with no significant difference in the absolute percent difference in TKV between scans 1 and 2 (5.9% vs. 5.3%, P = 0.88). Absolute percent differences between the two scans for model‐assisted segmentations on other sequences were 5.5% (axial T1), 4.5% (axial SSFP), 4.1% (coronal SSFP), and 3.2% (coronal T2). Averaging measurements from all five model‐assisted segmentations significantly reduced the absolute percent difference to 2.5%, further improving to 2.1% after excluding an outlier.
Data Conclusion: Measuring TKV on multiple MRI pulse sequences in coronal and axial planes is practical with deep learning model‐assisted segmentations and can improve TKV measurement reproducibility more than 2‐fold in ADPKD.
Evidence Level: 2
Technical Efficacy: Stage 1
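The scan-to-scan comparison and the averaging step described above can be sketched as follows. This is an illustration only: the abstract does not state which denominator its absolute percent difference uses, so the mean of the two scans is assumed here, and all volumes and function names are hypothetical.

```python
from statistics import mean


def abs_percent_difference(volume_scan1, volume_scan2):
    """Absolute difference between two scans as a percentage of their mean (assumed definition)."""
    return abs(volume_scan1 - volume_scan2) / mean([volume_scan1, volume_scan2]) * 100.0


def averaged_tkv(volumes_by_sequence):
    """Average TKV over all sequences measured in one exam."""
    return mean(volumes_by_sequence.values())


# Hypothetical TKV values in mL for two exams of one subject (not data from the study).
scan1 = {"axial T2": 1520.0, "coronal T2": 1545.0, "axial SSFP": 1500.0,
         "coronal SSFP": 1480.0, "axial T1": 1470.0}
scan2 = {"axial T2": 1600.0, "coronal T2": 1510.0, "axial SSFP": 1540.0,
         "coronal SSFP": 1450.0, "axial T1": 1505.0}

# Reproducibility of each sequence on its own.
for name in scan1:
    print(name, round(abs_percent_difference(scan1[name], scan2[name]), 1))

# Reproducibility after averaging all five sequences within each exam.
print("averaged", round(abs_percent_difference(averaged_tkv(scan1), averaged_tkv(scan2)), 1))
```

In this made-up example the per-sequence errors partly cancel, so the averaged measurement differs less between exams than most individual sequences do, which is the effect the abstract reports.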
Total kidney volume (TKV) measured on MRI is an important biomarker for assessing the progression of autosomal dominant polycystic kidney disease (ADPKD) and response to treatment. However, we have noticed that there can be substantial differences in the kidney volume measurements obtained from the various pulse sequences commonly included in an MRI exam. Here we examine kidney volume measurement variability among five commonly acquired MRI pulse sequences in abdominal MRI exams in 105 patients with ADPKD. Right and left kidney volumes were independently measured by three expert observers using model-assisted segmentation for axial T2, coronal T2, axial steady state free precession (SSFP), coronal SSFP, and axial 3D T1 images obtained in a single MRI exam from ADPKD patients. Outlier measurements were analyzed for data acquisition errors. Most of the outlier values (88%) were due to breathing during scanning causing slice misregistration with gaps or duplication of imaging slices (n = 35), slice misregistration from using multiple breath holds during acquisition (n = 25), composing of two overlapping acquisitions (n = 17), or kidneys not entirely within the field of view (n = 4). After excluding outlier measurements, the coefficient of variation among the five measurements decreased from 4.6% to 3.2%. Compared to the average of all sequences without errors, TKV measured on axial and coronal T2-weighted imaging was 1.2% and 1.8% greater, axial SSFP was 0.4% greater, coronal SSFP was 1.7% lower and axial T1 was 1.5% lower, indicating intrinsic measurement biases related to the different MRI contrast mechanisms. In conclusion, MRI data acquisition errors are common but can be identified using outlier analysis and excluded to improve organ volume measurement consistency. Bias toward larger volume measurements on T2 sequences and smaller volumes on axial T1 sequences can also be mitigated by averaging data from all error-free sequences acquired.
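The effect of excluding outliers on the coefficient of variation can be illustrated with a short Python sketch. The abstract does not specify its outlier criterion, so the rule below (deviation from the median above a fixed percentage) is an assumption chosen for illustration, as are the 10% threshold, the sample standard deviation, the function names, and the example volumes.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100.0


def flag_outliers(volumes_by_sequence, threshold_percent=10.0):
    """Return sequence names whose volume deviates from the median by more than the threshold.

    This rule is a stand-in; the study's actual outlier analysis is not described here.
    """
    center = median(volumes_by_sequence.values())
    return [name for name, volume in volumes_by_sequence.items()
            if abs(volume - center) / center * 100.0 > threshold_percent]


# Hypothetical kidney volumes in mL from five sequences in one exam (not study data).
volumes = {"axial T2": 1000.0, "coronal T2": 1010.0, "axial SSFP": 995.0,
           "coronal SSFP": 985.0, "axial T1": 800.0}

outliers = flag_outliers(volumes)
kept = [v for name, v in volumes.items() if name not in outliers]

print(outliers)  # ['axial T1']
print(round(coefficient_of_variation(list(volumes.values())), 1))
print(round(coefficient_of_variation(kept), 1))
```

In practice a flagged measurement would be reviewed for an acquisition error such as slice misregistration or an incomplete field of view before being excluded, as the abstract describes.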