The ability to automatically learn task-specific feature representations has driven much of the success of deep learning. When large training datasets are scarce, as in medical imaging, transfer learning has proven very effective. In this paper, we systematically investigate the process of transferring a Convolutional Neural Network (CNN), trained to classify ImageNet images, to the problem of kidney detection in ultrasound images. We study how detection performance depends on the extent of transfer. We show that a transferred and tuned CNN can outperform a state-of-the-art feature-engineered pipeline, and that a hybrid of the two techniques achieves 20% higher performance. We also investigate the evolution of intermediate response images from our network. Finally, we compare these responses to state-of-the-art image-processing filters to gain insight into how transfer learning effectively handles widely varying imaging regimes.
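The transfer process described above, reusing pretrained layers as feature extractors and fine-tuning only the later, task-specific layers, can be sketched in miniature. The network, data, and layer split below are illustrative toys (a two-layer dense network standing in for a CNN), not the paper's actual architecture or dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pretrained" network: two dense layers standing in for a CNN.
# W1 plays the role of transferred ImageNet features (kept frozen);
# W2 is the task-specific head that is fine-tuned on the new problem.
W1 = rng.normal(size=(8, 4))           # frozen, "transferred" layer
W2 = rng.normal(size=(4, 1)) * 0.1     # trainable detection head

def forward(X):
    h = np.maximum(X @ W1, 0.0)        # ReLU features from the frozen layer
    p = 1.0 / (1.0 + np.exp(-h @ W2))  # sigmoid "detection" score
    return p, h

# Toy binary detection data (e.g. target present / absent).
X = rng.normal(size=(64, 8))
y = (X[:, 0] > 0).astype(float).reshape(-1, 1)

# Fine-tune only W2; the "extent of transfer" controls which layers thaw.
lr = 0.5
for _ in range(200):
    p, h = forward(X)
    grad_W2 = h.T @ (p - y) / len(X)   # cross-entropy gradient w.r.t. W2
    W2 -= lr * grad_W2

accuracy = float(np.mean((forward(X)[0] > 0.5) == y))
```

Thawing more layers (here, also updating `W1`) corresponds to a greater extent of transfer; the trade-off studied in the paper is how much of the pretrained representation to keep fixed versus adapt.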
Several computational visual saliency models have been proposed in the context of viewing natural scenes. We investigate the relevance of computational saliency models to abnormality detection in medical images, reporting on two studies aimed at understanding the role of visual saliency in such images. Diffuse lesions in chest X-ray images, characteristic of pneumoconiosis, and high-contrast lesions such as 'hard exudates' in retinal images were chosen for the study; these approximately correspond to conjunctive and disjunctive targets in a visual search task. Saliency maps were computed using three popular models, namely Itti-Koch [7], GBVS [3], and SR [4], and the resulting maps were evaluated against gaze maps and ground truth from medical experts. Our results show that GBVS performs best (Mdn. ROC area = 0.77) for chest X-ray images while SR performs best (ROC area = 0.73) for retinal images, suggesting that searching for conjunctive targets calls for a more local examination of an image while disjunctive targets call for a global examination. Based on these results, we propose extensions to the two best-performing models. The first extension incorporates top-down knowledge such as lung segmentation, which improves the performance of GBVS to some extent. The second incorporates multiscale information, which significantly (by 28.76%) improves abnormality detection. The key insight from these studies is that bottom-up saliency continues to play a predominant role in examining medical images.
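Scoring a saliency map against a binary ground-truth lesion mask via ROC area, as in the evaluation above, can be sketched as follows. The maps here are synthetic stand-ins, and the Mann-Whitney formulation of AUC is a generic evaluation choice, not necessarily the authors' exact pipeline:

```python
import numpy as np

def roc_area(saliency, truth):
    """ROC area of a real-valued saliency map against a binary mask.

    Uses the Mann-Whitney formulation: the probability that a randomly
    chosen lesion pixel receives a higher saliency value than a randomly
    chosen background pixel (ties counted as half).
    """
    s = saliency.ravel()
    t = truth.ravel().astype(bool)
    pos, neg = s[t], s[~t]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Synthetic example: a square "lesion" that the saliency map highlights.
rng = np.random.default_rng(1)
truth = np.zeros((32, 32))
truth[10:16, 10:16] = 1.0
saliency = truth + 0.3 * rng.normal(size=(32, 32))  # signal plus noise
auc = roc_area(saliency, truth)
```

An ROC area of 0.5 corresponds to chance-level localization and 1.0 to perfect separation of lesion from background, which is the scale on which the reported values (0.77 and 0.73) sit.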