2016
DOI: 10.1002/sia.6042

Rapid multivariate analysis of 3D ToF‐SIMS data: graphical processor units (GPUs) and low‐discrepancy subsampling for large‐scale principal component analysis

Abstract: Principal component analysis (PCA) and other multivariate analysis methods have been used increasingly to analyse and understand depth-profiles in XPS, AES and SIMS. For large images or three-dimensional (3D) imaging depth-profiles, PCA has been difficult to apply until now simply because of the size of the matrices of data involved. In a recent paper, we described two algorithms, random vector 1 (RV1) and random vector 2 (RV2), that improve the speed of PCA and allow datasets of unlimited size, respectively. …

Cited by 19 publications (10 citation statements)
References 26 publications (35 reference statements)
“…The tests involved different ways of data normalization; different data preprocessing, other than Poisson scaling; the use of a variety of NMF algorithms, such as multiplicative update rules or alternating least-squares; removal of the standard samples from the dataset (or only some of them) and playing with different levels of pixel subsampling. The chosen data analysis workflow was as follows: Prior to any MVA, two preprocessing steps are performed: normalization of all maps intensities by total counts per pixel and Poisson scaling of the peak intensities. Selection of a subset of pixels using low discrepancy subsampling as proposed in ref . The computer used was a Dell Optiplex 9020 PC with a core i5 processor and 32 GB of RAM.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
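The two preprocessing steps named in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: it assumes the data are held as a pixels × peaks NumPy array, and it takes Poisson scaling to mean division by the square roots of the mean image and the mean spectrum. The function names `normalize_total_counts` and `poisson_scale` are hypothetical, and the data are synthetic.

```python
import numpy as np

def normalize_total_counts(X):
    """Divide each pixel (row) by its total counts; empty pixels stay zero."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1, keepdims=True)
    return np.divide(X, totals, out=np.zeros_like(X), where=totals > 0)

def poisson_scale(X):
    """Scale rows by sqrt(mean per pixel) and columns by sqrt(mean per peak)."""
    X = np.asarray(X, dtype=float)
    row = np.sqrt(X.mean(axis=1, keepdims=True))
    col = np.sqrt(X.mean(axis=0, keepdims=True))
    row[row == 0] = 1.0  # avoid dividing by zero for empty pixels
    col[col == 0] = 1.0  # avoid dividing by zero for empty peaks
    return X / (row * col)

# Synthetic counts standing in for a peak-picked image (assumption).
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(1000, 20))
normalized = normalize_total_counts(counts)
scaled = poisson_scale(normalized)
```

The order here follows the statement (normalization first, then Poisson scaling); other workflows may apply the steps differently.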
“…The computer used was a Dell Optiplex 9020 PC with a core i5 processor and 32 GB of RAM. Such an amount of memory could easily handle the 393 216 × 669 elements of the dataset, but the use of a training sets enables the analysis to be done in any conventional PC, and it has been shown to be effective for multivariate analysis of ToF-SIMS imaging data, in which neighbor pixels are highly correlated. Furthermore, with reduced subsets, the total processing time is conveniently reduced to a few seconds for the dataset presented in this paper. Determination of number of NMF components by contrast analysis: NMF is performed several times for the subsampled dataset with a varying number of components. For each result, obtained with k components, the contrast G k of matrix W was calculated accordingly to the definition given by Silva et al: where w i are the columns of matrix W , as shown in Figure .…”
Section: Discussion (citation type: mentioning)
confidence: 99%
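The memory remark in the statement above can be checked with a few lines of arithmetic. This sketch assumes a dense array and two common element sizes; the statement does not say which storage format was used, and `dense_size_gib` is a hypothetical helper.

```python
def dense_size_gib(n_rows, n_cols, bytes_per_element):
    """Size of a dense n_rows x n_cols array in GiB (2**30 bytes)."""
    return n_rows * n_cols * bytes_per_element / 2**30

n_pixels, n_peaks = 393_216, 669  # dimensions quoted in the statement
n_elements = n_pixels * n_peaks

for name, nbytes in [("float32", 4), ("float64", 8)]:
    size = dense_size_gib(n_pixels, n_peaks, nbytes)
    print(f"{name}: {n_elements} elements, {size:.2f} GiB")
```

Under these assumptions the full matrix takes roughly 1 GiB in single precision and 2 GiB in double precision, which fits comfortably in 32 GB of RAM.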
“…Data handling can be significantly more problematic with FT data that can contain millions of mass channels. Efficient peak picking algorithms and subsampling have been explored for hyperspectral data sets to address this problem, 278 , 279 along with software developments for image analysis. 280 However, data set sizes will remain a noteworthy issue in MSI as higher resolution imaging methods are developed.…”
Section: Challenges and Future Perspectives (citation type: mentioning)
confidence: 99%
“…The loadings were calculated using a training set consisting of only 6.11% of the total number of pixels, and the IMS data was then projected on the loading vectors to create the (score) images. Cumpson et al () have similarly used a subsampling approach for the PCA analysis of large size 3D SIMS datasets, although here quasirandom Sobol sampling was used to obtain a more even spatial sampling throughout the sample. Graphical processing units (GPUs) were used to speed up the calculation of the PCs, as has also previously been demonstrated by Jones et al () for PCA, pLSA, and NMF (see Section II…”
Section: Factorization (citation type: mentioning)
confidence: 99%
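The training-set idea in the statement above (fit loadings on a quasirandom subset of pixels, then project every pixel onto them) can be sketched as below. This is an illustration under stated assumptions, not the method of any cited paper: a Halton sequence is swapped in for Sobol sampling so the example needs only NumPy (`scipy.stats.qmc.Sobol` would be the closer match), the image is synthetic, and all function names are hypothetical.

```python
import numpy as np

def van_der_corput(n, base):
    """First n terms of the van der Corput sequence in the given base."""
    seq = np.zeros(n)
    for i in range(n):
        k, f = i + 1, 1.0
        while k > 0:
            f /= base
            seq[i] += f * (k % base)
            k //= base
    return seq

def halton_pixel_indices(height, width, n_samples):
    """Unique flat pixel indices from a 2D Halton sequence (bases 2 and 3)."""
    rows = np.floor(van_der_corput(n_samples, 2) * height).astype(int)
    cols = np.floor(van_der_corput(n_samples, 3) * width).astype(int)
    return np.unique(rows * width + cols)

def pca_from_training_set(X, train_idx, n_components):
    """Fit loadings on X[train_idx], then project every row of X onto them."""
    mean = X[train_idx].mean(axis=0)
    _, _, vt = np.linalg.svd(X[train_idx] - mean, full_matrices=False)
    loadings = vt[:n_components].T   # shape (n_peaks, n_components)
    scores = (X - mean) @ loadings   # shape (n_pixels, n_components)
    return loadings, scores

# Synthetic 64 x 64 image with 30 peaks: two spectra mixed along x (assumption).
H, W, P = 64, 64, 30
rng = np.random.default_rng(0)
a = (np.mgrid[0:H, 0:W][1] / W).ravel()
spectra = rng.random((2, P))
X = np.column_stack([a, 1 - a]) @ spectra + 0.01 * rng.normal(size=(H * W, P))

train_idx = halton_pixel_indices(H, W, 256)  # at most 256 of 4096 pixels
loadings, scores = pca_from_training_set(X, train_idx, 2)
full_loadings, _ = pca_from_training_set(X, np.arange(H * W), 2)
alignment = abs(loadings[:, 0] @ full_loadings[:, 0])
```

Here `alignment` is the absolute cosine between the first loading from the subset and the one from all pixels; a value near 1 indicates the subset recovers essentially the same component. The score images follow by reshaping each column of `scores` to `(H, W)`.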