Medical Image Data and Datasets in the Era of Machine Learning—Whitepaper from the 2016 C-MIMI Meeting Dataset Session

Kohli, Marc D.; Summers, Ronald M.; Geis, J. Raymond

doi:10.1007/s10278-017-9976-3

Cited by 168 publications

(94 citation statements)

References 26 publications

(20 reference statements)

“…Research studies are generally complex and the resulting data are valuable, not only to the principal investigator but to society as a whole [123,124]. Nonetheless, many researchers remain reluctant to share their data with an expert audience [125,126] beyond describing them as part of peer-reviewed publications.…”

Section: Data Sharing

mentioning

confidence: 99%

Personalizing Medicine Through Hybrid Imaging and Medical Big Data Analysis

et al. 2018

Medical imaging has evolved from a pure visualization tool to representing a primary source of analytic approaches toward in vivo disease characterization. Hybrid imaging is an integral part of this approach, as it provides complementary visual and quantitative information in the form of morphological and functional insights into the living body. As such, non-invasive imaging modalities no longer provide images only, but data, as stated recently by pioneers in the field. Today, such information, together with other, non-imaging medical data creates highly heterogeneous data sets that underpin the concept of medical big data. While the exponential growth of medical big data challenges their processing, they inherently contain information that benefits a patient-centric personalized healthcare. Novel machine learning approaches combined with high-performance distributed cloud computing technologies help explore medical big data. Such exploration and subsequent generation of knowledge require a profound understanding of the technical challenges. These challenges increase in complexity when employing hybrid, aka dual- or even multi-modality image data as input to big data repositories. This paper provides a general insight into medical big data analysis in light of the use of hybrid imaging information. First, hybrid imaging is introduced (see further contributions to this special Research Topic), also in the context of medical big data, then the technological background of machine learning as well as state-of-the-art distributed cloud computing technologies are presented, followed by the discussion of data preservation and data sharing trends. Joint data exploration endeavors in the context of in vivo radiomics and hybrid imaging will be presented. Standardization challenges of imaging protocol, delineation, feature engineering, and machine learning evaluation will be detailed. Last, the paper will provide an outlook into the future role of hybrid imaging in view of personalized medicine, whereby a focus will be given to the derivation of prediction models as part of clinical decision support systems, to which machine learning approaches and hybrid imaging can be anchored.

“…Particularly in the medical domain, this might be a strong assumption for a solution, as annotated data contains strong human bias. Although there has been a huge effort in the community to mitigate this drawback by providing datasets such as ChestX-ray14, the dataset has annotations but is far from being a definite expression of ground truth [14]. Therefore, by using supervised learning techniques one allows the labelling error and uncertainty to adversely affect the classification output of our machine learning framework.…”

mentioning

confidence: 99%

GraphX$^{\mathbf{NET}}$ - Chest X-Ray Classification Under Extreme Minimal Supervision

Avilés-Rivero, Papadakis, et al. 2019

Lecture Notes in Computer Science

The task of classifying X-ray data is a problem of both theoretical and clinical interest. Whilst supervised deep learning methods rely upon huge amounts of labelled data, the critical problem of achieving a good classification accuracy when an extremely small amount of labelled data is available has yet to be tackled. In this work, we introduce a novel semi-supervised framework for X-ray classification which is based on a graph-based optimisation model. To the best of our knowledge, this is the first method that exploits graph-based semi-supervised learning for X-ray data classification. Furthermore, we introduce a new multiclass classification functional with carefully selected class priors which allows for a smooth solution that strengthens the synergy between the limited number of labels and the huge amount of unlabelled data. We demonstrate, through a set of numerical and visual experiments, that our method produces highly competitive results on the ChestX-ray14 data set whilst drastically reducing the need for annotated data.

“…Deep learning models such as CNN require voluminous data to train the model without overfitting. This is the biggest challenge in the biomedical images [38]. The data that is available is limited, and most of them are raw images without annotations.…”

Section: Data Augmentation

mentioning

confidence: 99%

Region-Based Automated Localization of Colonoscopy and Wireless Capsule Endoscopy Polyps

Sornapudi, Meng 2019

Applied Sciences

The early detection of polyps could help prevent colorectal cancer. The automated detection of polyps on the colon walls could reduce the number of false negatives that occur due to manual examination errors or polyps being hidden behind folds, and could also help doctors locate polyps from screening tests such as colonoscopy and wireless capsule endoscopy. Losing polyps may result in lesions evolving badly. In this paper, we propose a modified region-based convolutional neural network (R-CNN) by generating masks around polyps detected from still frames. The locations of the polyps in the image are marked, which assists the doctors examining the polyps. The features from the polyp images are extracted using pre-trained Resnet-50 and Resnet-101 models through feature extraction and fine-tuning techniques. Various publicly available polyp datasets are analyzed with various pre-trained weights. It is interesting to notice that fine-tuning with balloon data (polyp-like natural images) improved the polyp detection rate. The optimum CNN models on colonoscopy datasets including CVC-ColonDB, CVC-PolypHD, and ETIS-Larib produced values (F1 score, F2 score) of (90.73, 91.27), (80.65, 79.11), and (76.43, 78.70) respectively. The best model on the wireless capsule endoscopy dataset gave a performance of (96.67, 96.10). The experimental results indicate the better localization of polyps compared to recent traditional and deep learning methods.
