2016
DOI: 10.1021/acscombsci.6b00142

Using Similarity Metrics to Quantify Differences in High-Throughput Data Sets: Application to X-ray Diffraction Patterns

Abstract: The objective of this research is to demonstrate how similarity metrics can be used to quantify differences between sets of diffraction patterns. A set of 49 similarity metrics is implemented to analyze and quantify similarities between different Gaussian-based peak responses, as a surrogate for different characteristics in X-ray diffraction (XRD) patterns. A methodological approach was used to identify and demonstrate how sensitive these metrics are to expected peak features. By performing hierarchical cluste…

Cited by 21 publications (17 citation statements)
References 37 publications
“…Only bin-to-bin similarity metrics are used in the ELSIE algorithm development as they are less computationally demanding for high-throughput datasets. 45 Four commonly used similarity metrics in the literatures are used in the ELSIE algorithm:…”
Section: Feff
Citation type: mentioning
confidence: 99%
“…45 3. Cosine similarity: The cosine similarity measure is the normalized inner product and measures the angle between two spectral vectors.…”
Section: Feff
Citation type: mentioning
confidence: 99%
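
To make the definition in this excerpt concrete, here is a minimal Python sketch of cosine similarity as a normalized inner product. It is not the ELSIE implementation; the helper `gaussian_peak`, the grid `xs`, and the peak parameters are hypothetical and chosen only for illustration.

```python
import math

def gaussian_peak(xs, center, width, height=1.0):
    """Hypothetical helper: sample a Gaussian peak on the grid xs."""
    return [height * math.exp(-0.5 * ((x - center) / width) ** 2) for x in xs]

def cosine_similarity(a, b):
    """Normalized inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Assumed illustration data: one reference peak and two variants of it.
xs = [i * 0.1 for i in range(200)]
reference = gaussian_peak(xs, center=10.0, width=0.5)
scaled = gaussian_peak(xs, center=10.0, width=0.5, height=3.0)
shifted = gaussian_peak(xs, center=12.0, width=0.5)

sim_scaled = cosine_similarity(reference, scaled)
sim_shifted = cosine_similarity(reference, shifted)
```

Because the measure depends only on the angle between the vectors, rescaling a peak's intensity leaves the similarity at 1, while shifting the peak position lowers it.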
“…3 (the upper left inset contour plot) and grouped by various clusters (large contour plot). The clustering in the large contour plot was performed using Euclidean distances (L2 norm) between rows/columns and then clustering based on maximum distances; there are a number of different distance (or similarity) functions that can be applied to cluster these vectors [59]. The diagonal for this plot is each response compared against itself (i.e., R = 1).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
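
The clustering step in this excerpt can be sketched in Python as follows. This assumes that "clustering based on maximum distances" means complete linkage, which is an interpretation rather than something the excerpt confirms. The matrix `rows` and the function names are hypothetical, and the naive loop is meant for small examples only.

```python
import math

def euclidean(a, b):
    """L2 norm of the difference between two equal-length rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage_clusters(rows, n_clusters):
    """Naive agglomerative clustering with complete (maximum-distance) linkage.

    Assumption: this is one reading of the excerpt, not the cited authors' code.
    Returns a list of clusters, each a sorted list of row indices.
    """
    if not 1 <= n_clusters <= len(rows):
        raise ValueError("n_clusters must be between 1 and the number of rows")
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(euclidean(rows[p], rows[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical 4x3 matrix: rows 0 and 1 are alike, rows 2 and 3 are alike.
rows = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
    [0.0, 0.1, 0.9],
]
groups = complete_linkage_clusters(rows, n_clusters=2)
```

For data sets of realistic size, a library routine for hierarchical clustering would be the more practical choice; the loop above only shows the logic.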
“…26,27 The selection of appropriate measures is one of the most important steps when applying unsupervised learning methods to evaluate and analyse materials. [26][27][28][29] The estimation of materials parameters from experimental data via similarity measures has great potential for automated data analysis in materials research. The most important aspect of automated materials parameter estimation is the choice of a similarity measure (kernel functions, 30) which is not trivial and varies with the experimental method.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In this respect, the Euclidean distance (ED) (L2 norm) and Manhattan distance (L1 norm) are widely used in many fields as similarity/distance metrics; however, these metrics may perform poorly as similarity measures between measured data, 26,27,29 and the appropriate measure of similarity is not trivial. 26,29,31 We investigated measures that are robust to noise and peak broadening and are sensitive to changes in the material parameters. We demonstrate that an important material property, such as the crystal field parameter 10Dq, can be estimated automatically and promptly by the constructed regression model based on the similarity measure.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
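
As a rough illustration of the two distances named in this excerpt, the Python sketch below computes the Euclidean (L2) and Manhattan (L1) distances bin by bin. The Gaussian test peaks and all names are hypothetical. The saturation effect shown is one known limitation of bin-to-bin distances and is offered here as an example, not as the cited authors' own argument.

```python
import math

def euclidean_distance(a, b):
    """L2 norm of the bin-by-bin difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """L1 norm of the bin-by-bin difference."""
    return sum(abs(x - y) for x, y in zip(a, b))

def gaussian_peak(xs, center, width):
    """Hypothetical helper: unit-height Gaussian peak sampled on xs."""
    return [math.exp(-0.5 * ((x - center) / width) ** 2) for x in xs]

# Assumed illustration data: one reference peak and two shifted copies.
xs = [i * 0.1 for i in range(400)]
reference = gaussian_peak(xs, center=10.0, width=0.5)
near = gaussian_peak(xs, center=15.0, width=0.5)
far = gaussian_peak(xs, center=25.0, width=0.5)

l2_near = euclidean_distance(reference, near)
l2_far = euclidean_distance(reference, far)
l1_near = manhattan_distance(reference, near)
l1_far = manhattan_distance(reference, far)
```

Once the peaks no longer overlap, both distances stop growing with the shift: `l2_near` and `l2_far` are nearly identical, as are `l1_near` and `l1_far`, even though the second peak is three times farther from the reference.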