Classification of diffraction patterns using a convolutional neural network in single-particle-imaging experiments performed at X-ray free-electron lasers
Abstract: Single particle imaging (SPI) at X-ray free-electron lasers is particularly well suited to determining the 3D structure of particles at room temperature. For a successful reconstruction, diffraction patterns originating from a single hit must be isolated from a large number of acquired patterns. It is proposed that this task could be formulated as an image-classification problem and solved using convolutional neural network (CNN) architectures. Two CNN configurations are developed: one that maximizes the F1 sc…
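The F1 score named in the abstract is the harmonic mean of precision and recall. As a rough illustration of how it could be computed for a single-hit classifier, the sketch below counts true positives, false positives and false negatives from binary labels (1 = single hit, 0 = anything else). The function names and the example labels are assumptions invented for this illustration; they are not taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical labels: 1 = single hit, 0 = non-hit or multiple hit.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
    tp, fp, fn = confusion_counts(y_true, y_pred)
    print(tp, fp, fn, f1_score(tp, fp, fn))
```

With these invented labels, three single hits are found, one is missed and one non-hit is accepted, so precision, recall and F1 all equal 0.75.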
“…In real SPI experiments, these steps cannot be avoided and have to be performed very carefully to obtain the final particle structure with high resolution. Each of the steps mentioned above have become the study of separate research: single hit classification (Bobkov et al, 2015; Shi et al, 2019; Cruz-Chú et al, 2021; Ignatenko et al, 2021; Assalauova et al, 2022), orientation determination (Loh & Elser, 2009; Ayyer et al, 2016) and background subtraction (Rose et al, 2018; Kurta et al, 2017; Lundholm et al, 2018).…”
Section: Discussion (mentioning)
confidence: 99%
“…The number of simulated diffraction patterns (1000) was based on the existing research (Poudyal et al, 2020) showing the dependence between this number, the experimental parameters and a spatial resolution of several nanometres or less. The number of simulated diffraction patterns can be also estimated from the experimental data (Rose et al, 2018; Assalauova et al, 2020, 2022). This number depends on the studied particle and experimental conditions.…”
Section: Spatial Structure of the TBEV from Simulated Data (mentioning)
confidence: 99%
“…For example, Assalauova et al (2020) used a clustering algorithm based on the maximum-likelihood method, which is actively used in cryo-EM (Dempster et al, 1977). In the work by Assalauova et al (2022), machine learning methods using a convolutional neural network (CNN) were used for the same purpose. The reconstructed electron density of the virus was obtained by the modal decomposition of several reconstructions of the virus under study (Assalauova et al, 2020).…”
The study of virus structures by X-ray free-electron lasers (XFELs) has attracted increased attention in recent decades. Such experiments are based on the collection of 2D diffraction patterns measured at the detector following the application of femtosecond X-ray pulses to biological samples. To prepare an experiment at the European XFEL, the diffraction data for the tick-borne encephalitis virus (TBEV) was simulated with different parameters and the optimal values were identified. Following the necessary steps of a well established data-processing pipeline, the structure of TBEV was obtained. In the structure determination presented, a priori knowledge of the simulated virus orientations was used. The efficiency of the proposed pipeline was demonstrated.
“…Famous examples include recognizing handwritten digits and letters or identifying human faces. In photon science this is also a common use case, and recently we have, for example, seen it being used for femtosecond X-ray imaging patterns (FXI) (Assalauova et al, 2022), X-ray photon correlation spectroscopy (Timmermann et al, 2022) and serial femtosecond crystallography (Rahmani et al, 2023). This is often a convenient way to speed up a researcher's work by automating an otherwise labor-intensive task, by classifying a small set of patterns by hand and then training a machine-learning algorithm to classify a bigger dataset in a similar fashion.…”
Section: Introduction To the Virtual Collection Of Papers On Artifici... (mentioning)
“…Methods based on machine learning (ML) are ideal for automation of repetitive tasks and identification of patterns in data sets, and several applications to data collected at x-ray facilities have been recently published (see, e.g., Ref. 8,9,10). When considering 1D spectral data, numerous classification approaches have been developed, including unsupervised clustering method such as spectral clustering 11 , K-Means 12 , Agglomerative clustering 13 , DBSCAN 14 , and supervised ML methods such as k-nearest neighbors 15 , partial least squares discriminant analysis 16,17 , decision trees 18 , random forests 19 , and extreme learning machines 20,21 .…”
The ability to detect interesting events is instrumental to effectively steer experiments and maximize their scientific efficiency. To address this, here we introduce and validate three frameworks based on self-supervised learning which are capable of classifying 1D spectral data using a limited amount of labeled data. In particular, in this work we focus on the identification of phase transitions in samples investigated by x-ray diffraction. We demonstrate that the three frameworks, based either on relational reasoning, contrastive learning, or a combination of the two, are capable of accurately identifying phase transitions. Furthermore, we discuss in detail the selection of data augmentations, crucial to ensure that scientifically meaningful information is retained.