2017
DOI: 10.48550/arxiv.1710.10629
Preprint
Dimensionality reduction methods for molecular simulations

Abstract: Molecular simulations produce very high-dimensional data sets with millions of data points. As analysis methods are often unable to cope with so many dimensions, it is common to use dimensionality reduction and clustering methods to reach a reduced representation of the data. Yet these methods often fail to capture the most important features necessary for the construction of a Markov model. Here we demonstrate the results of various dimensionality reduction methods on two simulation data sets, one of protein f…

Cited by 7 publications (10 citation statements)
References 23 publications
“…This has prompted the development of nonlinear approaches including variations of tICA and autoencoders. 136,195 To avoid the issues discussed so far, others have opted to skip the dimensionality reduction stage by using structural properties such as root-mean-square deviation (RMSD) 196,197 and contact maps. 195 The stage regarding data representation ends with clustering the conformational snapshots into discrete states using unsupervised ML protocols such as the k-centers and k-means methods.…”
Section: Application of Machine Learning
confidence: 99%
“…136,195 To avoid the issues discussed so far, others have opted to skip the dimensionality reduction stage by using structural properties such as root-mean-square deviation (RMSD) 196,197 and contact maps. 195 The stage regarding data representation ends with clustering the conformational snapshots into discrete states using unsupervised ML protocols such as the k-centers and k-means methods. 198 Given the multiple subjective decisions involved in selecting features and algorithms to represent the database, MSM building must be allied with validation strategies.…”
Section: Application of Machine Learning
confidence: 99%
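The statements above describe the standard pipeline: reduce the dimensionality of the trajectory features, then cluster the reduced snapshots into discrete states before estimating a Markov state model. A minimal illustrative sketch of that two-stage workflow is below; it uses PCA and k-means from scikit-learn on synthetic data purely as stand-ins (tICA or an autoencoder would replace PCA in a kinetically motivated analysis, and all sizes here are invented for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic stand-in for a trajectory feature matrix:
# n_frames snapshots x n_features internal coordinates.
rng = np.random.default_rng(0)
n_frames, n_features = 1000, 30
X = rng.normal(size=(n_frames, n_features))

# Stage 1: dimensionality reduction (PCA here; tICA or an
# autoencoder would be swapped in for kinetic relevance).
reducer = PCA(n_components=2)
Y = reducer.fit_transform(X)

# Stage 2: cluster the reduced snapshots into discrete states,
# the input to Markov state model estimation.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
states = kmeans.fit_predict(Y)

print(Y.shape)       # (1000, 2)
print(states.shape)  # (1000,)
```

The k-centers variant mentioned in the quoted text follows the same shape, differing only in how cluster centers are chosen.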
“…Recently, the autoencoder framework has been extended to model time-series data [23][24][25][26][27][28][29] . Analysis in these applications typically involves mapping time-series data to latent spaces with the same dimensionality as the length of the initial time-series data and has not focused on approximating a propagator for the time-series data; however, there are a couple of notable exceptions.…”
Section: MSM
confidence: 99%
“…Analysis in these applications typically involves mapping time-series data to latent spaces with the same dimensionality as the length of the initial time-series data and has not focused on approximating a propagator for the time-series data; however, there are a couple of notable exceptions. Doerr and De Fabritiis 29 recently compared a simple autoencoder to other methods for dimensionality reduction of biophysical simulation data. Wehmeyer and Noé introduced a time lag into an autoencoder (TAE) framework to describe dynamics 23 .…”
Section: MSM
confidence: 99%
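The time-lag idea referenced above can be illustrated with a deliberately simplified linear sketch: instead of reconstructing a frame from itself, reconstruct the time-shifted frame x(t+τ) from a low-rank encoding of x(t). The numpy-only code below fits that rank-k linear map by least squares plus truncated SVD; it is a toy stand-in for the nonlinear encoder/decoder pair of an actual TAE, with all dimensions invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
T, d, k, tau = 500, 10, 2, 5      # frames, features, bottleneck, lag
X = rng.normal(size=(T, d))       # stand-in trajectory features

# Pair each frame with its time-shifted partner.
X0, Xtau = X[:-tau], X[tau:]

# Fit the full linear map X0 @ W ~ Xtau, then truncate to rank k,
# mimicking the bottleneck of a time-lagged autoencoder.
W, *_ = np.linalg.lstsq(X0, Xtau, rcond=None)
U, s, Vt = np.linalg.svd(W)
W_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def encode(x):
    # k-dimensional latent coordinates of the input frames.
    return x @ U[:, :k]

Z = encode(X0)
print(Z.shape)  # (495, 2)
```

A real TAE replaces the linear map with neural-network encoder and decoder trained on the same lagged reconstruction objective.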
“…As such, the autoencoder aims at discovering a latent space (embedding) that faithfully describes the essential features of the high-dimensional input data. This makes autoencoders well suited for constructing low-dimensional FELs from molecular simulation data [22,23,24].…”
Section: Introduction
confidence: 99%
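Once an autoencoder (or any reduction method) yields low-dimensional latent coordinates for the simulation frames, the free-energy landscape (FEL) mentioned above follows by Boltzmann inversion of their histogram, F(z) = -kT ln p(z), up to an additive constant. A minimal numpy sketch, with synthetic latent coordinates and an assumed temperature of ~298 K:

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.normal(size=(5000, 2))   # stand-in 2-D latent coordinates

kT = 2.479  # kJ/mol at ~298 K (assumed for illustration)

# Probability density over the latent space, then Boltzmann inversion.
p, xedges, yedges = np.histogram2d(z[:, 0], z[:, 1],
                                   bins=25, density=True)
with np.errstate(divide="ignore"):
    F = -kT * np.log(p)          # empty bins become +inf

# Shift so the global free-energy minimum sits at zero.
F -= F[np.isfinite(F)].min()

print(F.shape)  # (25, 25)
```

Empty histogram bins are left as +inf, i.e. unsampled regions of the landscape; in practice the latent coordinates would come from the trained encoder rather than random numbers.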