2023
DOI: 10.1093/bib/bbad242

EnGens: a computational framework for generation and analysis of representative protein conformational ensembles

Abstract: Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as rep…

Cited by 7 publications (5 citation statements)
References 74 publications
“…The clustering method performed on the contact function matrix (1) is based on the combination of a dimensionality reduction technique with an efficient clustering algorithm, similarly to state-of-the-art approaches [30,31]. Here, we opt for UMAP [20] to first embed the data (1) into a 10-dimensional space.…”
Section: Clustering Pipeline and Ensemble Characterization (mentioning)
confidence: 99%
“…In distance-based methods, structural data are featured by Euclidean distances between residue pairs. This metric combined with dimensionality reduction and a clustering algorithm has been previously used to characterize flexible proteins [30,31]. Concretely, we implemented the UMAP + HDBSCAN pipeline on the structural data featured with pairwise Euclidean distances between all Cβ atoms (Cα for glycines) to characterize the CHCHD4 MD ensemble.…”
Section: Comparison Of Wario With Other Clustering Approaches (mentioning)
confidence: 99%
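
The featurization step described in the statement above can be illustrated with a minimal sketch. It assumes each conformation is given as a list of (x, y, z) coordinates, one representative atom per residue; the function name `pairwise_distance_features` and the toy coordinates are hypothetical and are not taken from the cited work. The UMAP and HDBSCAN stages would then run on the resulting feature vectors and are not shown here.

```python
import math
from itertools import combinations


def pairwise_distance_features(coords):
    """Flatten one conformation into a vector of inter-residue distances.

    coords: list of (x, y, z) tuples, one representative atom per residue
    (Cβ, or Cα for glycine, in the quoted setup).
    Returns the distances for all residue pairs (i, j) with i < j.
    """
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


# Hypothetical toy ensemble: two conformations of a three-residue chain.
ensemble = [
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
]

# One feature vector per conformation; N residues give N * (N - 1) / 2 features.
features = [pairwise_distance_features(c) for c in ensemble]
```

Because the features depend only on internal distances, they are unchanged by rotating or translating a conformation, so no structural alignment is needed before the dimensionality reduction step.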
“…Therefore, $D_{c_1 c_2}$ can be interpreted as a distance in an abstract Euclidean conformer space whose dimension is not known a priori. The $D_{c_1 c_2}$ can be arranged in a distance matrix D. Clustering of conformers based on D is similar in spirit to clustering based on other pairwise Euclidean distances proposed earlier (Alston et al., 2023; Conev et al., 2023). Distance geometry methods allow embedding D in the conformer space and thus representing the set of conformers in this abstract space.…”
Section: Ensemble Representation (mentioning)
confidence: 99%
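
The idea of embedding a distance matrix in an abstract Euclidean space can be sketched as follows. The quoted statement does not say how D is computed or which distance geometry method is used, so this example swaps in classical multidimensional scaling as one standard option; the function name `classical_mds` and the toy points are hypothetical.

```python
import numpy as np


def classical_mds(D, dim):
    """Embed points in `dim` dimensions from a symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered coordinates
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dim]  # indices of the `dim` largest
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))  # clip tiny negatives
    return eigvecs[:, order] * scale


# Hypothetical toy data: four "conformers" at the corners of a 3 x 4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# Recover coordinates from distances alone, then recompute the distances.
X = classical_mds(D, dim=2)
D_rec = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

The recovered coordinates `X` match the original points only up to rotation, reflection, and translation, but the pairwise distances are preserved. The number of clearly positive eigenvalues of `B` gives a practical estimate of the embedding dimension when it is not known in advance.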
“…Here, we introduce a single-valued similarity measure that abstracts from the ensemble width. The ensemble analysis module of MMMx complements existing toolkits such as PENSA (Vögele et al., 2022), ProDy (Zhang et al., 2021), or EnGens (Conev et al., 2023), which aim to summarize the information from ensemble structures or molecular dynamics (MD) trajectories into simpler descriptors of the conformational landscape. For example, the interpretation of ensemble models of IDPs and IDRs requires the detection of weak deviations from a polymeric random coil (Alston et al., 2023; Ritsch et al., 2021).…”
Section: Introduction (mentioning)
confidence: 99%