Imaging mass spectrometry (IMS) is a powerful analytical technique widely used in biology, chemistry, and materials science, and its range of applications continues to expand. IMS provides qualitative compositional analysis and spatial mapping with high chemical specificity; the spatial information can be 2D or 3D depending on the analysis technique employed. Because complex mass spectra are coupled with spatial information, IMS often produces large, high-dimensional (hyperspectral) datasets, making automated computational methods highly beneficial for exploratory analysis. The fast-paced development of artificial intelligence (AI) and machine learning (ML) tools has received significant attention in recent years. In principle, these tools can unify data collection and analysis into a single pipeline that makes sampling and analysis decisions on the fly. A variety of ML approaches have been applied to IMS data over the last decade. In this review, we discuss recent examples of common unsupervised (principal component analysis, non-negative matrix factorization, k-means clustering, uniform manifold approximation and projection), supervised (random forest, logistic regression, XGBoost, support vector machine), and other methods applied to various IMS datasets in the past five years. This review will be useful to specialists in both the IMS and ML fields, as it summarizes current and representative studies of computational ML-based exploratory methods for IMS.
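The unsupervised workflow described above can be illustrated with a minimal sketch: flatten a hyperspectral IMS datacube to a pixels-by-channels matrix and decompose it with non-negative matrix factorization, one of the methods the review covers. The dimensions, component count, and synthetic data here are illustrative assumptions, not taken from any study in the review.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical synthetic IMS "datacube": 20 x 20 pixels, 50 m/z channels,
# flattened to the (pixels x channels) matrix that most matrix-
# decomposition methods expect.
rng = np.random.default_rng(0)
h, w, n_mz = 20, 20, 50
cube = rng.random((h, w, n_mz))
X = cube.reshape(h * w, n_mz)

# Non-negative matrix factorization: X ~ W @ H, where each row of H is a
# candidate spectral signature and each column of W maps back to a
# spatial abundance image for that signature.
model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(X)   # (pixels, components): spatial abundances
H = model.components_        # (components, m/z): spectral signatures

abundance_maps = W.reshape(h, w, 3)  # back to image form for inspection
print(abundance_maps.shape, H.shape)
```

Swapping `NMF` for `PCA` or clustering the rows of `X` with k-means follows the same flatten-then-factorize pattern.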
Deep learning models have recently received much attention for achieving expert-level performance in the accurate automated analysis of chest X-rays. Although publicly available chest X-ray datasets include high-resolution images, most models are trained on reduced-size images due to limits on GPU memory and training time. As compute capability continues to advance, it will become feasible to train large convolutional neural networks on high-resolution images. This study is based on the publicly available MIMIC-CXR-JPG dataset, comprising 377,110 high-resolution chest X-ray images with 14 labels derived from the corresponding free-text radiology reports. Interestingly, we find that tasks requiring a large receptive field are better suited to downscaled input images, and we verify this qualitatively by inspecting effective receptive fields and class activation maps of trained models. Finally, we show that a stacked ensemble across resolutions outperforms each individual learner at every input resolution while providing interpretable scale weights, suggesting that multi-scale features are crucial to extracting information from high-resolution chest X-rays.
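The stacking idea in the abstract above can be sketched in miniature: given per-resolution base-model probabilities for one binary label, a simple meta-learner combines them, and its coefficients play the role of interpretable scale weights. The three resolutions, the noise levels, and the simulated probabilities are assumptions for illustration; the paper's actual base learners are CNNs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: one binary finding, base predictions at three
# input resolutions (stand-ins for CNNs trained at different scales).
rng = np.random.default_rng(1)
n = 500
y = rng.integers(0, 2, n)

def noisy_probs(y, noise):
    # Simulated base-learner probabilities: a noisy view of the label.
    return np.clip(y + rng.normal(0, noise, y.shape), 0, 1)

# Columns = base learners at three (assumed) resolutions, coarse to fine.
P = np.column_stack([noisy_probs(y, s) for s in (0.45, 0.35, 0.30)])

# Meta-learner for stacking: its coefficients act as interpretable
# "scale weights" -- a larger weight means that resolution contributes
# more to the stacked prediction.
meta = LogisticRegression().fit(P, y)
scale_weights = meta.coef_.ravel()
stacked_acc = meta.score(P, y)
print(scale_weights, stacked_acc)
```

In practice the meta-learner would be fit on held-out predictions to avoid leakage; this sketch only shows the mechanics of combining resolutions.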
Prostate cancer is one of the most common cancers globally and the second most common cancer among men in the US. Here we develop a study that correlates hematoxylin and eosin (H&E)-stained biopsy data with MALDI mass spectrometry imaging (MSI) data from the corresponding tissue to determine cancerous regions, their unique chemical signatures, and how the predicted regions vary from the original pathological annotations. We extract features from high-resolution optical micrographs of whole-slide H&E-stained data through deep learning and spatially register them with the MSI data to correlate chemical signatures with tissue anatomy. We then use the learned correlation to predict prostate cancer from observed H&E images using the trained, coregistered MSI data. This multimodal approach predicts cancerous regions with ∼80% accuracy, indicating a correlation between optical H&E features and the chemical information found in MSI. We show that such paired multimodal data can be used to train feature extraction networks on H&E data, bypassing the need to acquire expensive MSI data and eliminating manual annotation, saving valuable time. Two chemical biomarkers were also found to be predictive of the ground-truth cancerous regions. This study shows promise for improving patient treatment trajectories by predicting prostate cancer directly from readily available H&E-stained biopsy images, aided by coregistered MSI data.
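The coregistered-training idea above can be sketched as a toy supervised problem: deep-learning features from H&E patches are paired with cancer/non-cancer labels derived from spatially registered MSI, and the trained classifier then needs only H&E at inference time. All features, labels, and dimensions below are simulated assumptions, not data from the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Hypothetical paired data: rows = H&E patches, columns = deep-learning
# features; labels come from the spatially registered MSI modality.
rng = np.random.default_rng(2)
n_patches, n_features = 400, 16
X_he = rng.normal(size=(n_patches, n_features))  # stand-in H&E features
w_true = rng.normal(size=n_features)             # stand-in MSI-driven signal
labels_msi = (X_he @ w_true + rng.normal(0, 1.0, n_patches) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_he, labels_msi, test_size=0.25, random_state=0
)

# Once trained on the paired data, the classifier predicts cancerous
# regions from H&E features alone -- no MSI acquisition at inference.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

The study's actual pipeline uses deep feature extractors and spatial registration between whole-slide images and MSI pixels; this sketch only shows the train-on-paired, infer-on-H&E-only structure.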