An important need exists for strategies to perform rigorous, objective, clinical-task-based evaluation of artificial intelligence (AI) algorithms for nuclear medicine. To address this need, we propose a four-class framework to evaluate AI algorithms for promise, technical task-specific efficacy, clinical decision making, and postdeployment efficacy. We provide best practices for evaluating AI algorithms in each of these classes. Each class of evaluation yields a claim that descriptively characterizes the performance of the AI algorithm. Key best practices are tabulated as the RELAINCE (Recommendations for EvaLuation of AI for NuClear medicinE) guidelines. The report was prepared by the Society of Nuclear Medicine and Molecular Imaging AI Task Force Evaluation team, which consisted of nuclear-medicine physicians, physicists, computational imaging scientists, and representatives from industry and regulatory agencies.
Attenuation compensation (AC) is a prerequisite for reliable quantification and is beneficial for visual-interpretation tasks in single-photon emission computed tomography (SPECT). Typical AC methods require the availability of an attenuation map, which is obtained using a transmission scan, such as a CT scan. This approach has several disadvantages, including increased radiation dose, higher costs, and possible misalignment between the SPECT and CT scans. Moreover, a CT scan is often unavailable. In this context, we and others have shown that scattered photons in SPECT contain information to estimate the attenuation distribution. To exploit this observation, we propose a physics- and learning-based method that uses the SPECT emission data in the photopeak and scatter windows to perform transmission-less AC in SPECT. The proposed method uses data acquired in the scatter window to reconstruct an initial estimate of the attenuation map using a physics-based approach. A convolutional neural network is then trained to segment this initial estimate into different regions. Predefined attenuation coefficients are assigned to these regions, yielding the reconstructed attenuation map, which is then used to reconstruct the activity distribution using an ordered-subsets expectation-maximization (OSEM)-based reconstruction approach. We objectively evaluated the performance of this method using highly realistic simulation studies conducted on the clinically relevant task of detecting perfusion defects in myocardial perfusion SPECT. Our results showed no statistically significant differences between the performance achieved using the proposed method and that with the true attenuation maps. Visually, the images reconstructed using the proposed method looked similar to those reconstructed with the true attenuation maps. Overall, these results provide evidence of the capability of the proposed method to perform transmission-less AC and motivate further evaluation.
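The final two steps of the pipeline above (segmenting the initial scatter-window estimate and assigning predefined attenuation coefficients) can be sketched as follows. This is an illustrative toy, not the paper's implementation: a simple threshold stands in for the trained CNN segmenter, and the coefficient values are assumed placeholders.

```python
import numpy as np

# Assumed, illustrative attenuation coefficients (cm^-1); not the paper's values.
MU = {"background": 0.0, "soft_tissue": 0.15}

def segment_initial_estimate(mu_init: np.ndarray) -> np.ndarray:
    """Label each voxel as background (0) or soft tissue (1).
    In the proposed method, this step is performed by a trained CNN."""
    return (mu_init > 0.5 * mu_init.max()).astype(int)

def build_attenuation_map(mu_init: np.ndarray) -> np.ndarray:
    """Assign a predefined coefficient to each segmented region."""
    labels = segment_initial_estimate(mu_init)
    return np.where(labels == 1, MU["soft_tissue"], MU["background"])

# Stand-in for a noisy initial estimate reconstructed from scatter-window data.
rng = np.random.default_rng(0)
mu_init = np.zeros((8, 8))
mu_init[2:6, 2:6] = 0.15
mu_init += rng.normal(0, 0.02, mu_init.shape)

mu_map = build_attenuation_map(mu_init)
print(np.unique(mu_map))  # only the predefined coefficients remain
```

By construction, the output map contains only the predefined coefficients, which is what makes the subsequent OSEM reconstruction insensitive to voxel-level noise in the initial scatter-window estimate.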
Multiple objective-assessment-of-image-quality (OAIQ)-based studies have reported that several deep-learning (DL)-based denoising methods show limited performance on signal-detection tasks. Our goal was to investigate the reasons for this limited performance. To achieve this goal, we conducted a task-based characterization of a DL-based denoising approach as a function of individual signal properties, in the context of evaluating a DL-based approach for denoising single-photon emission computed tomography (SPECT) images. The training data consisted of signals of different sizes and shapes within a clustered lumpy background, imaged with a 2D parallel-hole-collimator SPECT system. The projections were generated at a normal and a 20% (low) count level, both of which were reconstructed using an ordered-subsets expectation-maximization (OSEM) algorithm. A convolutional neural network (CNN)-based denoiser was trained to process the low-count images. The performance of this CNN was characterized for five signal sizes and four signal-to-background ratios (SBRs) by designing each evaluation as a signal-known-exactly/background-known-statistically (SKE/BKS) signal-detection task. Performance on this task was evaluated using an anthropomorphic channelized Hotelling observer (CHO). Consistent with previous studies, we observed that the DL-based denoising method did not improve performance on the signal-detection task; the observer-study-based characterization further showed that this held for each of the considered signal types. Overall, these results provide new insights on the performance of the DL-based denoising approach as a function of signal size and contrast.
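A channelized Hotelling observer of the kind used for this evaluation can be sketched as follows. This is a generic CHO on synthetic Gaussian data, not the paper's anthropomorphic-channel setup: random orthonormal channels stand in for the usual frequency-selective channels, and the image statistics are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_ch, n_img = 64, 4, 200

# Random orthonormal channels stand in for anthropomorphic channels.
U, _ = np.linalg.qr(rng.normal(size=(n_pix, n_ch)))

# Synthetic signal-absent/present image pairs (flattened 1D "images").
signal = np.zeros(n_pix)
signal[30:34] = 1.0
g_absent = rng.normal(size=(n_img, n_pix))
g_present = g_absent + signal

v_a, v_p = g_absent @ U, g_present @ U              # channel outputs v = U^T g
S = 0.5 * (np.cov(v_a.T) + np.cov(v_p.T))           # mean channel covariance
w = np.linalg.solve(S, v_p.mean(0) - v_a.mean(0))   # Hotelling template
t_a, t_p = v_a @ w, v_p @ w                         # scalar test statistics

# Empirical AUC via the Mann-Whitney statistic on the test statistics.
auc = np.mean(t_p[:, None] > t_a[None, :])
print(round(auc, 3))
```

The same AUC figure of merit, computed once on low-count images and once on their denoised counterparts, is what supports statements such as "denoising did not improve performance on the signal-detection task."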
More generally, the observer-study-based characterization provides a mechanism to evaluate the sensitivity of a method to specific object properties and may be explored as analogous to characterizations such as the modulation transfer function for linear systems. Finally, this work underscores the need for objective task-based evaluation of DL-based denoising approaches.
Deep-learning (DL)-based methods have shown significant promise in denoising myocardial perfusion SPECT images acquired at low dose. For clinical application of these methods, evaluation on clinical tasks is crucial. Typically, these methods are designed to minimize a fidelity-based criterion between the predicted denoised image and some reference normal-dose image. However, studies have shown that, despite this promise, such methods may have limited impact on the performance of clinical tasks in SPECT. To address this issue, we use concepts from the model-observer literature and our understanding of the human visual system to propose a DL-based denoising approach designed to preserve observer-related information for detection tasks. The proposed method was objectively evaluated on the task of detecting perfusion defects in myocardial perfusion SPECT images using a retrospective study with anonymized clinical data. Our results demonstrate that the proposed method yields improved performance on this detection task compared to using low-dose images. These results show that by preserving task-specific information, DL may provide a mechanism to improve observer performance in low-dose myocardial perfusion SPECT.
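One way such a training objective could be structured is a fidelity term plus a penalty on mismatch in observer-relevant channel outputs. The sketch below is an assumption about the general form, not the paper's loss: the channel matrix, the weighting `lam`, and all data are illustrative.

```python
import numpy as np

def task_informed_loss(pred, target, U, lam=1.0):
    """Fidelity (MSE) plus a penalty preserving channel outputs v = U^T g,
    a stand-in for observer-related information. Illustrative only."""
    fidelity = np.mean((pred - target) ** 2)
    task = np.mean((U.T @ pred - U.T @ target) ** 2)
    return fidelity + lam * task

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.normal(size=(64, 4)))  # assumed channel matrix
target = rng.normal(size=64)                   # stand-in normal-dose image
pred = target + 0.1 * rng.normal(size=64)      # stand-in network output

print(round(task_informed_loss(pred, target, U), 4))
```

The design intuition is that a pure MSE loss can smooth away low-contrast structure that a detection-task observer relies on; an explicit channel-output term keeps that information costly to discard.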
Attenuation compensation (AC) is beneficial for visual interpretation tasks in single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI). However, traditional AC methods require the availability of a transmission scan, most often a CT scan. This approach has the disadvantages of increased radiation dose, increased scanner costs, and the possibility of inaccurate diagnosis in cases of misregistration between the SPECT and CT images. Further, many SPECT systems do not include a CT component. To address these issues, we developed a Scatter-window projection and deep Learning-based AC (SLAC) method to perform AC without a separate transmission scan. To investigate the clinical efficacy of this method, we then objectively evaluated its performance on the clinical task of detecting perfusion defects on MPI in a retrospective study with anonymized clinical SPECT/CT stress MPI images. The proposed method was compared with CT-based AC (CTAC) and no-AC (NAC) methods. Our results showed that the SLAC method yielded an almost overlapping receiver operating characteristic (ROC) plot and a similar area under the ROC curve (AUC) to the CTAC method on this task. These results demonstrate the capability of the SLAC method for transmission-less AC in SPECT and motivate further clinical evaluation.
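The empirical AUC used in ROC comparisons of this kind can be computed from case scores via the Mann-Whitney statistic. The scores and labels below are fabricated for illustration; in the study they would come from observer ratings on defect-present and defect-absent cases.

```python
import numpy as np

def empirical_auc(scores, labels):
    """Empirical AUC: fraction of (defect, no-defect) case pairs ranked
    correctly, with ties counted as half."""
    s, l = np.asarray(scores, float), np.asarray(labels)
    pos, neg = s[l == 1], s[l == 0]
    return float(np.mean((pos[:, None] > neg[None, :])
                         + 0.5 * (pos[:, None] == neg[None, :])))

labels = [1, 1, 1, 0, 0, 0]  # 1 = defect present, 0 = defect absent
print(empirical_auc([0.9, 0.8, 0.4, 0.5, 0.3, 0.2], labels))
print(empirical_auc([0.9, 0.7, 0.6, 0.5, 0.3, 0.2], labels))
```

Comparing two AC methods then amounts to comparing the AUCs obtained from the same cases scored under each method, typically with a statistical test appropriate for correlated ROC curves.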
Background Single-photon emission computed tomography (SPECT) provides a mechanism to perform absorbed-dose quantification tasks for α-particle radiopharmaceutical therapies (α-RPTs). However, quantitative SPECT for α-RPT is challenging due to the low number of detected counts, the complex emission spectrum, and other image-degrading processes. To address these challenges, we propose a low-count quantitative SPECT reconstruction method for isotopes with multiple emission peaks. Methods Given the low-count setting, it is important that the reconstruction method extract the maximal possible information from each detected photon. Processing data over multiple energy windows and in list-mode (LM) format provides mechanisms to achieve that objective. Toward this goal, we propose a list-mode multi-energy-window (LM-MEW) ordered-subsets expectation-maximization-based SPECT reconstruction method that uses data from multiple energy windows in LM format and includes the energy attribute of each detected photon. For computational efficiency, we developed a multi-GPU-based implementation of this method. The method was evaluated using 2D SPECT simulation studies in a single-scatter setting conducted in the context of imaging [²²³Ra]RaCl₂, an FDA-approved RPT for metastatic prostate cancer. Results The proposed method yielded improved performance on the task of estimating activity uptake within known regions of interest in comparison to approaches that use a single energy window or use binned data. The improved performance was observed in terms of both accuracy and precision and for different sizes of the region of interest. Conclusions Results of our studies show that the use of multiple energy windows and the processing of data in LM format with the proposed LM-MEW method led to improved quantification performance in low-count SPECT of isotopes with multiple emission peaks.
These results motivate further development and validation of the LM-MEW method for such imaging applications, including α-RPT SPECT.
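The core idea of a list-mode expectation-maximization update, on which the LM-MEW method builds, can be shown in a toy example: the update sums one backprojection term per detected event rather than per binned measurement. The 2-voxel, 3-bin system matrix and event list below are made up for illustration, and each event's energy attribute is assumed to be folded into its system-matrix row.

```python
import numpy as np

# Toy system matrix: A[i, j] = p(detect in bin i | decay in voxel j).
A = np.array([[0.6, 0.1],
              [0.3, 0.3],
              [0.1, 0.6]])
events = [0, 0, 1, 2, 2, 2]     # list-mode data: one detector index per event
sens = A.sum(axis=0)            # per-voxel sensitivity s_j

lam = np.ones(2)                # initial activity estimate
for _ in range(50):
    back = np.zeros(2)
    for i in events:            # list-mode loop: one term per detected event
        back += A[i] / (A[i] @ lam)
    lam = lam / sens * back     # MLEM multiplicative update

print(lam.round(2))
```

A useful sanity check is that the update conserves the expected total counts: the sensitivity-weighted activity sums to the number of detected events after each iteration.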
Synthetic images generated by simulation studies have a well-recognized role in developing and evaluating imaging systems and methods. For clinically relevant development and evaluation, synthetic images must be clinically realistic and, ideally, have the same distribution as that of clinical images. Thus, mechanisms that can quantitatively evaluate this clinical realism and, ideally, similarity in distributions of real and synthetic images, are much needed.

We investigated two observer-study-based approaches to quantitatively evaluate the clinical realism of synthetic images. First, we presented a theoretical formalism for using an ideal observer to quantitatively evaluate the similarity in distributions between real and synthetic images. This formalism provides a direct relationship between the ideal-observer AUC and the distributions of real and synthetic images. The second approach uses human observers to quantitatively evaluate clinical realism. We developed web-based software to conduct two-alternative forced-choice (2-AFC) experiments with expert human readers. The usability of this software was evaluated by conducting a system usability scale (SUS) survey with seven expert readers and five observer-study designers. Further, we demonstrated the application of this software to evaluate a stochastic and physics-based image-synthesis technique for oncologic PET, where the 2-AFC study was performed by six expert readers highly experienced in reading PET scans.
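Scoring a 2-AFC session reduces to counting the trials on which the reader identified the real image. The minimal sketch below uses fabricated trial data; if readers cannot distinguish real from synthetic images, the proportion correct approaches 0.5.

```python
# Each trial: (side holding the real image, side the reader chose).
# Trial data are fabricated for illustration.
trials = [
    ("left", "left"), ("right", "left"), ("left", "right"),
    ("right", "right"), ("left", "left"), ("right", "left"),
]

correct = sum(truth == choice for truth, choice in trials)
pc = correct / len(trials)
print(f"proportion correct: {pc:.2f}")  # 0.50 here: consistent with realism
```

A convenient property of the 2-AFC design is that the proportion correct is an estimate of the AUC for the discrimination task, which connects the human-observer approach to the ideal-observer formalism.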

In the first approach, we theoretically demonstrated that the ideal-observer AUC can be expressed in terms of the Bhattacharyya distance between the distributions of real and synthetic images. We showed that a decrease in the ideal-observer AUC indicates a decrease in the distance between the two image distributions, and that the AUC attains its lower bound of 0.5 precisely when the distributions of synthetic and real images exactly match. In the second approach, results from the SUS survey demonstrated that the developed software is highly usable. As a secondary finding, evaluation of the PET image-synthesis technique using our software showed that expert readers were generally unable to distinguish the real and synthetic images.
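For concreteness, the Bhattacharyya distance between two discrete distributions can be computed as below. The two example distributions are arbitrary stand-ins for distributions of real and synthetic images; the distance vanishes exactly when the distributions match, consistent with the AUC lower bound described above.

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """D_B(p, q) = -ln(sum_i sqrt(p_i * q_i)) for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient, in (0, 1]
    return -np.log(bc)

p = np.array([0.1, 0.4, 0.5])  # arbitrary example distributions
q = np.array([0.3, 0.3, 0.4])

print(round(bhattacharyya_distance(p, q), 4))  # > 0: distributions differ
print(abs(bhattacharyya_distance(p, p)))       # ~0: distributions match
```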

This work addresses the important need for mechanisms to quantitatively evaluate the clinical realism of synthetic images. Our mathematical treatment shows that quantifying the similarity in distributions of real and synthetic images is theoretically possible with an ideal-observer-study-based approach. Our developed software provides a platform for designing and performing 2-AFC experiments with human observers in a highly accessible, efficient, and secure manner. Additionally, results on evaluation of the PET image-synthesis technique motivate the application of this technique to develop and evaluate a wide array of PET imaging methods.