High-throughput nanoindentation techniques enable rapid materials screening and property mapping; they can span millimeter length scales and generate up to 10⁶ data points. To sort these data rapidly into similar groups, a necessary task for establishing structure–property relationships, an unsupervised machine learning analysis called clustering has grown in popularity. Here, a method is proposed and tested that evaluates the uncertainty associated with various clustering algorithms for an example high entropy alloy data set and explores the effect of the number of data points in a second, Damascus steel data set. The proposed method uses Efron's bootstrap to resample a modeled probability distribution function based on the original data, which allows the uncertainty related to the clustering to be evaluated, in contrast to classical standard-error-of-the-mean calculations. For the Damascus steel, results from a 10⁴-point subsample were found to be comparable to those from the full 10⁶-point set while representing a significant reduction in data acquisition.
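The resampling idea can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: the modeled probability distribution is taken to be a Gaussian kernel density estimate (a "smoothed" bootstrap), the clustering algorithm is a plain one-dimensional k-means, and the hardness values are synthetic. The function names `kmeans_1d` and `smoothed_bootstrap_centers` are hypothetical.

```python
import random
import statistics


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means in one dimension; returns sorted cluster centers."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation at evenly spaced quantiles.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [
            statistics.fmean(g) if g else centers[i] for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def smoothed_bootstrap_centers(values, k, n_boot=100, bandwidth=None, seed=0):
    """Resample from a Gaussian KDE of the data and re-cluster each replicate.

    Returns one (mean, standard deviation) pair per cluster center, where the
    standard deviation across replicates is the bootstrap uncertainty estimate.
    """
    rng = random.Random(seed)
    n = len(values)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption; it tends to
        # oversmooth multimodal data, so pass a bandwidth if you have one).
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    replicates = []
    for _ in range(n_boot):
        # Drawing from a Gaussian KDE = pick a data point, add kernel noise.
        sample = [rng.choice(values) + rng.gauss(0.0, bandwidth) for _ in range(n)]
        replicates.append(kmeans_1d(sample, k))
    summary = []
    for j in range(k):
        centers_j = [rep[j] for rep in replicates]
        summary.append((statistics.fmean(centers_j), statistics.stdev(centers_j)))
    return summary


# Synthetic two-phase "hardness" data (illustrative values only, not from the paper).
data_rng = random.Random(1)
hardness = [data_rng.gauss(3.0, 0.3) for _ in range(200)]
hardness += [data_rng.gauss(7.0, 0.3) for _ in range(200)]

summary = smoothed_bootstrap_centers(hardness, k=2)
for center_mean, center_std in summary:
    print(f"cluster center {center_mean:.2f} +/- {center_std:.2f}")
```

The spread of each cluster center across replicates reflects how sensitive the clustering itself is to resampling, which is distinct from the standard error of the mean computed within a single fixed cluster assignment. Rerunning the sketch on subsamples of different sizes would be one way to probe the effect of the number of data points.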