2022
DOI: 10.1101/2022.01.07.475295
Preprint

Covariant Fitness Clusters Reveal Structural Evolution of SARS-CoV-2 Polymerase Across the Human Population

Abstract: Understanding the fitness landscape of viral mutations is crucial for uncovering the evolutionary mechanisms contributing to pandemic behavior. Here, we apply a Gaussian process regression (GPR) based machine learning approach that generates spatial covariance (SCV) relationships to construct stability fitness landscapes for the RNA-dependent RNA polymerase (RdRp) of SARS-CoV-2. GPR-generated fitness scores capture, on a residue-by-residue basis, a covariant fitness cluster centered at the C487-H642-C645-C646 Zn…

Citations: cited by 3 publications (7 citation statements)
References: 112 publications
“…The VSP analysis was performed as previously described 1, 41-43 using gstat package (V2.0) 66, 67 in R. VSP is built on Gaussian process regression (GPR) based machine learning. A special form of GPR machine learning that has been developed in geostatistics, Ordinary Kriging 68, is used to model the spatial dependency as a variogram to interpolate the unmeasured value to construct the phenotype landscape for AAT.…”
Section: Methods
mentioning
confidence: 99%
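The statement above describes Ordinary Kriging: a variogram models spatial dependency, and unmeasured values are interpolated as weighted averages of measured ones. The following is a minimal NumPy sketch of that general technique, not the cited authors' pipeline (which uses the gstat package in R). It assumes an exponential variogram with fixed, hand-picked parameters rather than one fitted to data. The function names and the sample coordinates and values are hypothetical and chosen only for illustration.

```python
import numpy as np


def exponential_variogram(h, sill=1.0, range_=1.0):
    """Exponential variogram with zero nugget: sill * (1 - exp(-h / range_)).

    The sill and range_ defaults are illustrative assumptions, not fitted values.
    """
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / range_))


def ordinary_kriging(coords, values, target, variogram=exponential_variogram):
    """Predict the value at `target` from measured points by ordinary kriging.

    Returns (prediction, kriging variance, weights).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Variogram between every pair of measured points.
    pair_dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    # Kriging system: the extra row and column hold the Lagrange multiplier
    # that forces the weights to sum to one (the unbiasedness constraint).
    system = np.zeros((n + 1, n + 1))
    system[:n, :n] = variogram(pair_dists)
    system[:n, n] = 1.0
    system[n, :n] = 1.0

    # Right-hand side: variogram between each measured point and the target.
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(coords - target, axis=1))

    solution = np.linalg.solve(system, rhs)
    weights, lagrange = solution[:n], solution[n]

    prediction = weights @ values
    variance = weights @ rhs[:n] + lagrange
    return prediction, variance, weights


# Hypothetical measurements: four points in a 2-D coordinate space.
sample_coords = [[0.0, 0.0], [1.0, 0.2], [2.0, 0.9], [3.5, 0.4]]
sample_values = [0.1, 0.8, 0.5, 0.3]

estimate, uncertainty, w = ordinary_kriging(sample_coords, sample_values, [1.5, 0.5])
print(estimate, uncertainty, w.sum())
```

With a zero nugget the interpolation is exact: predicting at a measured location returns the measured value with zero kriging variance. In practice the variogram parameters would be estimated from an empirical variogram of the data instead of being fixed by hand.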
“…To probe the molecular basis for the balance between misfolding and function of AAT, we measured the intracellular monomer and polymer levels, secreted monomer and polymer levels, and NE inhibitory activity for 75 AAT variants distributed across the protein sequence that include both pathogenic and benign variants found in the population. To address the complex folding dynamics likely differentially impacted by these variants, we applied a Gaussian process regression (GPR) based machine learning approach termed variation spatial profiling (VSP) 1, 41-44.…”
mentioning
confidence: 99%
“…To begin to understand the mechanism by which GRP94 modulation corrects the AAT fold on a residue-by-residue basis at atomic resolution, we applied our Gaussian process (GP)-spatial covariance (SCV) (GP-SCV) principled machine learning approach through variation spatial profiling (VSP) 1, 17-21. GP-SCV principled relationships generated through VSP is based on a statistical paradigm used to find value in complex physical landscapes (see "Methods").…”
Section: GRP94 ATPase Modulation Rescues AATD Phenotypes for AAT-Z
mentioning
confidence: 99%
“…To address the central problem of information flow from the genome to the proteome in human biology in the context of the complex folding dynamics impacted by heat shock protein 90 (Hsp90) family members 16 that constitute a central ATP-dependent chaperone system in the cell, we developed a Gaussian process (GP) regression-based machine learning 1, 17-21 approach to learn the basal state of the protein fold and its response to proteostasis management. GP can use a sparse collection of variants across the worldwide population and their associated phenotypes as a collective input to generate spatial covariance (SCV) maps of the response of every residue in the protein sequence and its residue-residue responses to environmental changes, including pharmacological intervention 1, 17-21. GP-SCV principled modeling captures as a collective the variant changes that report on the evolutionary trajectory of the entire wild-type (WT) protein fold at atomic resolution that impacts health and disease of the host in response to the environment and natural selection 1, 17-21.…”
mentioning
confidence: 99%