2021
DOI: 10.1016/j.elspec.2021.147094

Box plots: A simple graphical tool for visualizing overfitting in peak fitting as demonstrated with X-ray photoelectron spectroscopy data

Abstract: Peak fitting is frequently performed in X-ray photoelectron spectroscopy (XPS). However, recent reports suggest that the current quality of this peak fitting is often inadequate in the scientific literature. Various statistical methods may be advantageously applied to an XPS peak fit to help determine the quality and validity of a fit. In this paper we describe a new statistical tool, which we believe will be helpful for determining the quality of protocols for fitting XPS data. This tool, box plots of random …

Cited by 23 publications (9 citation statements)
References 35 publications (35 reference statements)
“…As mentioned before, global fitting is important in chemistry, because it allows for the simultaneous fitting of multiple data sets to a single model, which can improve the accuracy and reliability of the model parameters. It is also important for complex models with many parameters, as it can help to reduce the risk of overfitting and consequently misinterpreting. In combination with the goodness of fit statistics and correlation analysis, the user can statistically investigate the reasonability of the fit; in the SI, Advanced Usage III, global fitting is utilized to empirically showcase the effect of covalency on XAS spectra.…”
Section: Results (mentioning)
confidence: 99%
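
The idea of global fitting in the statement above (several data sets fitted at once to one model with shared parameters) can be illustrated with a deliberately small sketch. It is not the cited authors' software or model: the straight-line model, the function name `global_fit_shared_slope`, and the two data sets are all hypothetical, chosen only because a shared-slope line has a closed-form least-squares solution.

```python
def global_fit_shared_slope(datasets):
    """Least-squares fit of y = k*x + c_i, with ONE slope k shared by all
    data sets and a separate offset c_i per data set (hypothetical model)."""
    sxy = 0.0      # pooled sum of (x - x_bar_i) * (y - y_bar_i)
    sxx = 0.0      # pooled sum of (x - x_bar_i) ** 2
    centers = []   # per-data-set means, needed for the offsets
    for xs, ys in datasets:
        if len(xs) != len(ys) or len(xs) < 2:
            raise ValueError("each data set needs >= 2 matching x/y points")
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        centers.append((x_bar, y_bar))
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        sxx += sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("x values must vary within at least one data set")
    k = sxy / sxx  # every data set contributes to the shared slope
    offsets = [y_bar - k * x_bar for x_bar, y_bar in centers]
    return k, offsets


# Made-up, noise-free data: both sets share slope 2 but have different offsets.
set_a = ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])   # y = 2x + 1
set_b = ([0.0, 2.0, 4.0], [-3.0, 1.0, 5.0])            # y = 2x - 3
slope, offsets = global_fit_shared_slope([set_a, set_b])
print(slope, offsets)
```

Because the slope is estimated from the pooled data rather than once per data set, the model has fewer free parameters than independent fits would, which is the mechanism by which global fitting can lower the risk of overfitting.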
“…Whether too few or too many factors are selected, the situation is suboptimal. Overfitting, where too many factors describe the data and the system is “stretched,” may result in models that fit training data too tightly, capture too much noise, and are overly complex 43,44. Underfitting, where the model is incomplete, may omit useful information so that true underlying relationships between data points/spectra may not be identified, that is, the system has been “amputated.” In either case, the model “is incapable of (appropriately) capturing the variability of the data,” 45 conclusions may be distorted, and these solutions might be described as Procrustean.…”
Section: Results (mentioning)
confidence: 99%
“…A linear calibration curve was constructed with the median SERS intensity of the populations with superposed box plots for each detected concentration. Box plots represent the distribution within the data set and enable to visualize the data in a more integrated way. The middle line corresponds to the median value of a data set, the box comprises the 25th to 75th percentiles, and the whiskers represent 5th and 95th percentiles.…”
Section: Results (mentioning)
confidence: 99%
“…41 Box plots represent the distribution within the data set and enable to visualize the data in a more integrated way. 42 The middle line corresponds to the median value of a data set, the box comprises the 25th to 75th percentiles, and the whiskers represent 5th and 95th percentiles. The mean is also shown in Figure 4 as the small squares above the median (see also Figure S3).…”
Section: Population Distribution (mentioning)
confidence: 99%
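
The box-plot convention described in the two statements above (middle line at the median, box from the 25th to the 75th percentile, whiskers at the 5th and 95th percentiles, mean shown separately) can be computed directly. The following is a minimal sketch, not code from any of the cited papers: the helper names `percentile` and `box_plot_summary`, the linear-interpolation percentile rule, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, median


def percentile(sorted_values, q):
    """Percentile q (0..100) of pre-sorted data, using linear interpolation
    between neighbouring points (one common convention among several)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def box_plot_summary(values):
    """Statistics for a box plot whose whiskers sit at the 5th/95th percentiles."""
    data = sorted(values)
    return {
        "whisker_low": percentile(data, 5),
        "q1": percentile(data, 25),      # lower edge of the box
        "median": median(data),          # middle line
        "q3": percentile(data, 75),      # upper edge of the box
        "whisker_high": percentile(data, 95),
        "mean": mean(data),              # drawn as a separate marker
    }


# Hypothetical intensities: the values 1.0, 2.0, ..., 101.0.
intensities = [float(i) for i in range(1, 102)]
summary = box_plot_summary(intensities)
print(summary)
```

Note that percentile whiskers differ from the more common default of 1.5 times the interquartile range, so a plotting library usually has to be told explicitly to use the 5th and 95th percentiles.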