The availability of medical imaging data from clinical archives, research literature, and clinical manuals, coupled with recent advances in computer vision, offers the opportunity for image-based diagnosis, teaching, and biomedical research. However, the content and semantics of an image can vary depending on its modality, and so identifying the modality of an image is an important preliminary step. The key challenge in automatically classifying the modality of a medical image arises from the visual characteristics of the different modalities: some are visually distinct while others differ only subtly. This challenge is compounded by variations in image appearance across the depicted diseases and by a lack of sufficient training data for some modalities. In this paper, we introduce a new method for classifying medical images that uses an ensemble of different convolutional neural network (CNN) architectures. CNNs are a state-of-the-art image classification technique that learns the optimal image features for a given classification task. We hypothesize that different CNN architectures learn different levels of semantic image representation, so an ensemble of CNNs enables higher-quality features to be extracted. Our method builds a new feature extractor by fine-tuning CNNs that have been initialized on a large dataset of natural images. The fine-tuning process leverages the generic image features learned from natural images, which are fundamental to all images, and optimizes them for the variety of medical imaging modalities. These features are used to train several multiclass classifiers whose posterior probabilities are fused to predict the modalities of unseen images. Our experiments on the ImageCLEF 2016 medical image public dataset (30 modalities; 6776 training images and 4166 test images) show that our ensemble of fine-tuned CNNs achieves a higher accuracy than established CNNs. Our ensemble also achieves a higher accuracy than methods in the literature evaluated on the same benchmark dataset and is outperformed only by methods that source additional training data.
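A minimal Python sketch of the posterior-probability fusion step described above: each trained multiclass classifier yields a posterior distribution over the 30 modalities, and the distributions are averaged before taking the argmax. The uniform weighting and the function name are illustrative assumptions, not details from the paper.

```python
import numpy as np

def fuse_posteriors(probas, weights=None):
    """Fuse per-classifier posterior probabilities by (weighted) averaging.

    probas : list of (n_samples, n_classes) arrays, one per classifier.
    weights: optional per-classifier weights; uniform if omitted.
    """
    stacked = np.stack(probas, axis=0)  # (n_classifiers, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(probas)) / len(probas)
    fused = np.tensordot(weights, stacked, axes=1)  # weighted sum over classifiers
    return fused.argmax(axis=1), fused  # predicted modality index + fused posteriors
```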
Medical imaging is fundamental to modern healthcare, and its widespread use has resulted in the creation of image databases, as well as picture archiving and communication systems. These repositories now contain images from a diverse range of modalities, multidimensional (three-dimensional or time-varying) images, as well as co-aligned multimodality images. These image collections offer the opportunity for evidence-based diagnosis, teaching, and research; for these applications, there is a requirement for appropriate methods to search the collections for images that have characteristics similar to the case(s) of interest. Content-based image retrieval (CBIR) is an image search technique that complements the conventional text-based retrieval of images by using visual features, such as color, texture, and shape, as search criteria. Medical CBIR is an established field of study that is beginning to realize its promise when applied to multidimensional and multimodality medical data. In this paper, we present a review of state-of-the-art medical CBIR approaches in five main categories: two-dimensional image retrieval, retrieval of images with three or more dimensions, the use of nonimage data to enhance the retrieval, multimodality image retrieval, and retrieval from diverse datasets. We use these categories as a framework for discussing the state of the art, focusing on the characteristics and modalities of the information used during medical image retrieval.
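To make the CBIR principle surveyed here concrete, the following toy sketch indexes images by a simple colour histogram and ranks them by feature distance. The histogram feature and Euclidean distance are placeholder choices; real medical CBIR systems use far richer descriptors.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Simple visual feature: a joint RGB histogram, L1-normalised.

    image: (H, W, 3) array with values in [0, 256).
    """
    hist, _ = np.histogramdd(image.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def retrieve(query_feature, index_features, k=5):
    """Return indices of the k database images most similar to the query."""
    dists = np.linalg.norm(index_features - query_feature, axis=1)
    return np.argsort(dists)[:k]
```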
The analysis of multi-modality positron emission tomography and computed tomography (PET-CT) images requires combining the sensitivity of PET in detecting abnormal regions with the anatomical localization provided by CT. However, current methods for PET-CT image analysis either process the modalities separately or fuse information from each modality based on knowledge about the image analysis task. These methods generally do not account for the spatially varying visual characteristics of the modalities, which encode different information and have different priorities at different locations. For example, abnormally high PET uptake in the lungs is more meaningful for tumor detection than physiological PET uptake in the heart. Our aim is to improve the fusion of complementary information in multi-modality PET-CT with a new supervised convolutional neural network (CNN) that learns to fuse this complementary information for multi-modality medical image analysis. Our CNN first encodes modality-specific features and then uses them to derive a spatially varying fusion map that quantifies the relative importance of each modality's features across different spatial locations. These fusion maps are then multiplied with the modality-specific feature maps to obtain a representation of the complementary multi-modality information at different locations, which can then be used for image analysis, e.g., region detection. We evaluated our CNN on a region detection problem using a dataset of PET-CT images of lung cancer. We compared our method to baseline techniques for multi-modality image analysis (pre-fused inputs, multi-branch techniques, and multi-channel techniques) and demonstrated that our approach had a significantly higher accuracy (p < 0.05) than the baselines.
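A schematic sketch of the fusion idea just described, assuming a PyTorch-style implementation: modality-specific encoders produce feature maps, a small fusion branch predicts a spatially varying weight map per modality, and the weights re-scale each modality's features. The layer sizes and the softmax normalization are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class FusionCNN(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        # Modality-specific encoders (PET and CT processed separately).
        self.enc_pet = nn.Sequential(nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU())
        self.enc_ct = nn.Sequential(nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU())
        # Fusion branch: predicts one weight map per modality at every location.
        self.fusion = nn.Conv2d(2 * feat_ch, 2, 3, padding=1)

    def forward(self, pet, ct):
        f_pet, f_ct = self.enc_pet(pet), self.enc_ct(ct)
        # Softmax over the modality axis: spatially varying relative importance.
        w = torch.softmax(self.fusion(torch.cat([f_pet, f_ct], dim=1)), dim=1)
        fused = w[:, 0:1] * f_pet + w[:, 1:2] * f_ct  # re-weighted features
        return fused  # a downstream head (e.g., region detection) consumes this
```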
The segmentation of skin lesions in dermoscopic images is a fundamental step in automated computer-aided diagnosis of melanoma. Conventional segmentation methods, however, have difficulties when the lesion borders are indistinct and when the contrast between the lesion and the surrounding skin is low. They also perform poorly when there is a heterogeneous background or a lesion that touches the image boundaries, which results in under- and over-segmentation of the skin lesion. We suggest that saliency detection, using the reconstruction errors derived from a sparse representation model coupled with a novel background detection method, can more accurately discriminate the lesion from the surrounding regions. We further propose a Bayesian framework that better delineates the shape and boundaries of the lesion. We evaluated our approach on two public datasets comprising 1100 dermoscopic images and compared it to other conventional and state-of-the-art unsupervised (i.e., no training required) lesion segmentation methods, as well as state-of-the-art unsupervised saliency detection methods. Our results show that our approach is more accurate and robust in segmenting lesions than the other methods. We also discuss the general extension of our framework as a saliency optimization algorithm for lesion segmentation.
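The reconstruction-error intuition above can be sketched as follows: pixels that a background-derived basis reconstructs poorly receive high saliency. Using a caller-supplied background mask (e.g., image-border pixels) and a PCA basis in place of the paper's sparse representation model are both simplifications for illustration.

```python
import numpy as np

def saliency_from_reconstruction(features, bg_mask, n_components=4):
    """Score each pixel by how badly a background basis reconstructs it.

    features: (H, W, D) per-pixel feature vectors (e.g., colour values).
    bg_mask : (H, W) boolean mask of pixels assumed to be background
              (e.g., a band along the image borders).
    """
    H, W, D = features.shape
    X = features.reshape(-1, D)
    bg = X[bg_mask.ravel()]
    # Background basis from the top principal directions of background pixels.
    _, _, Vt = np.linalg.svd(bg - bg.mean(axis=0), full_matrices=False)
    basis = Vt[:n_components]
    centred = X - bg.mean(axis=0)
    recon = centred @ basis.T @ basis  # project onto, then back from, the basis
    errors = np.linalg.norm(centred - recon, axis=1)
    return errors.reshape(H, W)  # high reconstruction error ~ salient (lesion-like)
```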
The segmentation of abnormal regions in dermoscopic images is an important step for automated computer-aided diagnosis (CAD) of skin lesions. Recent methods based on fully convolutional networks (FCN) have been very successful for dermoscopic image segmentation. However, they tend to overfit to the visual characteristics present in the dominant non-melanoma studies and, therefore, perform poorly on the complex visual characteristics exhibited by melanoma studies, which usually consist of fuzzy boundaries and heterogeneous textures. In this paper, we propose a new method for automated skin lesion segmentation that overcomes these limitations via a novel deep class-specific learning approach, which learns the important visual characteristics of the skin lesions of each class (melanoma vs. non-melanoma) individually. We also introduce a new probability-based, step-wise integration to combine complementary segmentation results derived from the individual class-specific learning models. We achieved an average Dice coefficient of 85.66% on the ISBI 2017 Skin Lesion Challenge (SLC), 91.77% on the ISBI 2016 SLC, and 92.10% on the PH2 dataset, with corresponding Jaccard indices of 77.73%, 85.92%, and 85.90%, respectively. Our extensive experimental results on these three well-established public benchmark datasets demonstrate that our method is more effective than other state-of-the-art methods for skin lesion segmentation.
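A hedged sketch of the probability-based, step-wise integration: given per-pixel foreground probabilities from the melanoma-specific and non-melanoma-specific models, each pixel takes the value from whichever model is more confident, and the fused map is then thresholded. The confidence rule and threshold are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def stepwise_integration(p_mel, p_nonmel, threshold=0.5):
    """Combine two class-specific probability maps into one segmentation.

    p_mel, p_nonmel: (H, W) foreground probabilities from the melanoma- and
    non-melanoma-specific models, respectively.
    """
    # Confidence = distance of the probability from the undecided value 0.5.
    conf_mel = np.abs(p_mel - 0.5)
    conf_nonmel = np.abs(p_nonmel - 0.5)
    # At each pixel, trust the more confident class-specific model.
    fused = np.where(conf_mel >= conf_nonmel, p_mel, p_nonmel)
    return (fused > threshold).astype(np.uint8)  # binary lesion mask
```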
Positron emission tomography (PET) imaging is widely used for staging and monitoring treatment in a variety of cancers, including the lymphomas and lung cancer. Recently, there has been a marked increase in the accuracy and robustness of machine learning methods and their application to computer-aided diagnosis (CAD) systems, e.g., the automated detection and quantification of abnormalities in medical images. Successful machine learning methods require large amounts of training data; hence, the synthesis of PET images could play an important role in enlarging the training data and ultimately improving the accuracy of PET-based CAD systems. Existing approaches, such as atlas-based methods or methods based on simulated or physical phantoms, have problems synthesizing the low resolution and low signal-to-noise ratio inherent in PET images. In addition, these methods usually have limited capacity to produce a variety of synthetic PET images with large anatomical and functional differences. Hence, we propose a new method to synthesize PET data via multi-channel generative adversarial networks (M-GAN) to address these limitations. Our M-GAN approach, in contrast to existing medical image synthesis methods that rely on low-level features, has the ability to capture feature representations with a high level of semantic information based on the adversarial learning concept. Our M-GAN is also able, in a single framework, to take input from the annotation (label) to synthesize regions of high uptake, e.g., tumors, and from the computed tomography (CT) images to constrain the appearance consistency based on the CT-derived anatomical information, and to output the synthetic PET images directly. Our experimental data from 50 lung cancer PET-CT studies show that our method produces more realistic PET images than conventional GAN methods. Further, a PET tumor detection model trained with our synthetic PET data performed competitively compared with the detection model trained with real PET data (2.79% lower recall). We suggest that our approach, when used to combine real and synthetic images, boosts the training data for machine learning methods.
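A minimal sketch of the multi-channel conditioning idea, assuming a PyTorch-style generator: the label map and the CT slice enter as separate input channels and the network emits a synthetic PET slice. The tiny architecture below, and the omission of the discriminator and training losses, are simplifications for illustration.

```python
import torch
import torch.nn as nn

class MultiChannelGenerator(nn.Module):
    """Generator conditioned on a label map and a CT slice (2 input channels)."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Tanh(),  # synthetic PET in [-1, 1]
        )

    def forward(self, label_map, ct):
        # Stack conditions channel-wise: the label map drives high-uptake
        # regions; the CT slice constrains anatomical appearance.
        return self.net(torch.cat([label_map, ct], dim=1))
```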