A consistent challenge for both new and expert practitioners of small‐angle scattering (SAS) lies in determining how to analyze the data, given the limited information content of the data and the large number of models that can be employed. Machine learning (ML) methods are powerful tools for classifying data and have found diverse applications in many fields of science. Here, ML methods are applied to the problem of classifying SAS data according to the most appropriate model to use for data analysis. The approach is built around the method of weighted k-nearest neighbors (wKNN), and uses a subset of the models implemented in the SasView package (https://www.sasview.org/) to generate a well-defined set of training and testing data. The prediction rate of the wKNN method implemented here is reasonably good for many of the models, but the method has difficulty with others, notably those based on spherical structures. A novel expansion of the wKNN method was also developed, which uses Gaussian processes to produce local surrogate models for the classification, and this significantly improves the classification accuracy. Further, by integrating a stochastic gradient descent method during post‐processing, it is possible to leverage the local surrogate model both to classify the SAS data with high accuracy and to predict the structural parameters that best describe the data. The linking of data classification and model fitting has the potential to facilitate the translation of measured data into results for both novice and expert practitioners of SAS.
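The weighted k-nearest-neighbors step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy curve families, the q grid, the inverse-distance weighting, and all function names are assumptions chosen for illustration, and the Gaussian process surrogate and stochastic gradient descent stages are omitted entirely.

```python
import numpy as np

# Hypothetical q grid; a real study would use the instrument's q range.
Q = np.linspace(0.01, 0.3, 50)

def guinier_curve(rg):
    """log10 I(q) for a toy Guinier-type curve, I(q) = exp(-(q*rg)^2 / 3)."""
    return -(Q * rg) ** 2 / 3.0 * np.log10(np.e)

def power_law_curve(p):
    """log10 I(q) for a toy power law, normalized to 1 at the first q point."""
    return -p * np.log10(Q / Q[0])

def wknn_predict(train_x, train_y, query, k=3, eps=1e-12):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest curves."""
    train_x = np.asarray(train_x, dtype=float)
    dists = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest training curves
    votes = {}
    for i in nearest:
        weight = 1.0 / (dists[i] + eps)       # closer neighbors count for more
        votes[train_y[i]] = votes.get(train_y[i], 0.0) + weight
    return max(votes, key=votes.get)

# Toy training set: nine curves per class (stand-ins for simulated model data).
train_x = [guinier_curve(rg) for rg in np.linspace(10, 50, 9)]
train_y = ["guinier"] * 9
train_x += [power_law_curve(p) for p in np.linspace(1, 4, 9)]
train_y += ["power_law"] * 9

predicted = wknn_predict(train_x, train_y, guinier_curve(32.0))
print(predicted)
```

Working in log intensity keeps the high-q region from being swamped by the much larger low-q values. That is a common choice for scattering curves, though the abstract does not say which representation was used.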
While a large number of deep learning networks that produce outstanding results on natural image datasets have been studied and published, these datasets make up only a fraction of those to which deep learning can be applied. Other datasets include text data, audio data, and arrays of sensors, which have very different characteristics from natural images. As the "best" networks for natural images have largely been discovered through experimentation and cannot be proven optimal on any theoretical basis, there is no reason to believe that they are the optimal networks for these drastically different datasets. Hyperparameter search is thus often a very important process when applying deep learning to a new problem. In this work we present an evolutionary approach to searching the space of possible network hyperparameters and construction that can scale to 18,000 nodes. This approach is applied to datasets of varying types and characteristics, where we demonstrate the ability to rapidly find the best hyperparameters, enabling practitioners to iterate quickly between idea and result.
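As a rough illustration of what an evolutionary hyperparameter search involves, the following sketch evolves a small population of hyperparameter sets with elitist selection and single-gene mutation. It is a generic sketch, not the method of the work described above: the search space, the `toy_fitness` function (a stand-in for training and validating a network), and all names are hypothetical.

```python
import random

# Hypothetical search space; real spaces would be problem-specific.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
    "num_layers": [1, 2, 3, 4, 5],
    "kernel_size": [1, 3, 5, 7],
}

def toy_fitness(genome):
    """Stand-in for validation accuracy: counts matches with an arbitrary target."""
    target = {"learning_rate": 1e-2, "num_layers": 3, "kernel_size": 5}
    return sum(genome[key] == value for key, value in target.items())

def random_genome(rng):
    """Draw one value per hyperparameter uniformly at random."""
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}

def mutate(genome, rng):
    """Copy the parent and resample a single randomly chosen hyperparameter."""
    child = dict(genome)
    key = rng.choice(list(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child

def evolve(fitness, population_size=20, generations=30, keep=5, seed=0):
    """Elitist evolution: keep the top `keep` genomes, refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:keep]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - keep)]
        population = parents + children
    return max(population, key=fitness), history

best, history = evolve(toy_fitness)
print(best, history[-1])
```

Each fitness evaluation within a generation is independent of the others, which is why this style of search distributes naturally across many compute nodes. Because the parents are carried over unchanged, the best fitness never decreases from one generation to the next.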
Let $2^{[n]}$ denote the power set of $[n]$, where $[n] = \{1, 2, \ldots, n\}$. A collection $B \subset 2^{[n]}$ forms a $d$-dimensional Boolean algebra if there exist pairwise disjoint sets $X_0, X_1, \ldots, X_d \subseteq [n]$, all non-empty with perhaps the exception of [...]. Let $b(n, d)$ be the maximum cardinality of a family $\mathcal{F} \subset 2^{[n]}$ that does not contain a $d$-dimensional Boolean algebra. Gunderson, Rödl, and Sidorenko [...]. In this paper, we use the Lubell function as a new measurement for large families instead of cardinality. The Lubell value of a family of sets $\mathcal{F}$ with $\mathcal{F} \subseteq 2^{[n]}$ is defined by $h_n(\mathcal{F}) = \sum_{F \in \mathcal{F}} 1/\binom{n}{|F|}$. We prove the following Turán-type theorem. If $\mathcal{F} \subseteq 2^{[n]}$ contains no $d$-dimensional Boolean algebra, then $h_n(\mathcal{F}) \le 2(n + 1)^{1 - 2^{1-d}}$ for sufficiently large $n$. This result implies $b(n, d) \le C n^{-1/2^d} \cdot 2^n$, where $C$ is an absolute constant independent of $n$ and $d$. With some modification, the ideas in Gunderson, Rödl, and Sidorenko's proof can be used to obtain this result. We apply the new bound on $b(n, d)$ to improve several Ramsey-type bounds on Boolean algebras. We also prove a canonical Ramsey theorem for Boolean algebras.
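To make the Lubell function concrete, the short sketch below evaluates $h_n(\mathcal{F})$ exactly for a few small families. The helper names (`lubell`, `power_set`, `lubell_bound`) are hypothetical. The bound function simply transcribes the right-hand side of the theorem as stated above, which is claimed only for sufficiently large $n$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def lubell(family, n):
    """h_n(F): sum over sets F in the family of 1 / C(n, |F|), as an exact Fraction."""
    return sum((Fraction(1, comb(n, len(s))) for s in family), Fraction(0))

def power_set(n):
    """All subsets of [n] = {1, ..., n} as frozensets."""
    ground = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(ground, k)]

def lubell_bound(n, d):
    """Right-hand side 2 * (n + 1)^(1 - 2^(1 - d)) from the theorem statement."""
    return 2 * (n + 1) ** (1 - 2 ** (1 - d))

n = 4
full = power_set(n)                                       # every subset of [4]
middle_layer = [s for s in full if len(s) == n // 2]      # all 2-element subsets
chain = [frozenset(range(1, k + 1)) for k in range(n + 1)]  # {} < {1} < ... < [4]

print(lubell(full, n))          # each of the n + 1 layers contributes exactly 1
print(lubell(middle_layer, n))  # a single full layer has Lubell value 1
print(lubell(chain, n))         # one set per layer: sum of 1 / C(4, k)
```

The full power set has Lubell value $n + 1$ because every layer contributes exactly 1, so the Lubell value can be read as a count of layers weighted by how much of each layer the family occupies.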