2018
DOI: 10.1074/jbc.ra117.001052

A statistical model for improved membrane protein expression using sequence-derived features

Abstract: The heterologous expression of integral membrane proteins (IMPs) remains a major bottleneck in the characterization of this important protein class. IMP expression levels are currently unpredictable, which renders the pursuit of IMPs for structural and biophysical characterization challenging and inefficient. Experimental evidence demonstrates that changes within the nucleotide or amino-acid sequence for a given IMP can dramatically affect expression levels; yet these observations have not resulted in generali…

Cited by 14 publications (6 citation statements)
References 87 publications

“…A protein sequence can be encoded by its physical properties or directly by its amino acids (Alipanahi et al, 2015;Bedbrook et al, 2017b;Chang et al, 2016;Fox et al, 2007;Ofer and Linial, 2015;Romero et al, 2013;Saladi et al, 2018). When using physical properties to encode a protein sequence, each individual amino acid is represented by a collection of physical properties, such as its charge or hydrophobicity, and each protein is taken to be a combination of those properties.…”
Section: Introduction
confidence: 99%
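
The two encodings described in the statement above can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function names are hypothetical, the hydropathy values are the Kyte-Doolittle scale, and the charge table is a simplification that assumes neutral pH and treats histidine as uncharged.

```python
# Illustrative sketch: two ways to turn a protein sequence into numbers.
# All function names here are hypothetical, chosen for this example.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge near neutral pH (assumption: His is neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def one_hot_encode(seq):
    """Encode directly by amino acid: one 20-element indicator row per residue."""
    return [[1 if aa == ref else 0 for ref in AMINO_ACIDS] for aa in seq]


def property_encode(seq):
    """Encode by physical properties: (hydropathy, charge) per residue."""
    return [(HYDROPATHY[aa], CHARGE.get(aa, 0)) for aa in seq]


def summarize(seq):
    """Combine per-residue properties into protein-level features."""
    props = property_encode(seq)
    mean_hydropathy = sum(h for h, _ in props) / len(props)
    net_charge = sum(c for _, c in props)
    return mean_hydropathy, net_charge
```

The one-hot form keeps residue identity and position but grows with sequence length, while the property form trades identity for a small number of physically interpretable values.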
“…Decision tree models, such as random forests, have been used to accurately predict enzyme stability. The Kernel method is a pattern analysis algorithm that avoids explicitly mapping input to typically a higher dimensional feature space but takes advantage of kernel, which captures the similarity between two input data points and maps the raw input to the feature space implicitly. Support vector machine (SVM) is one of the best-known kernel methods and has been applied to predict protein stability and enantioselectivity. Gaussian process is a probability-based predictor, which uses a kernel method to predict an unobserved data point from training data. Unlike the other algorithms mentioned above, Gaussian process does not predict the value of the unobserved data but captures uncertainty by Gaussian distribution.…”
Section: Methodologies
confidence: 99%
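
The idea of a kernel that "maps the raw input to the feature space implicitly" can be checked numerically with a small sketch. The example below is an assumption-laden illustration rather than anything taken from the cited work: it uses a homogeneous degree-2 polynomial kernel on 2-D inputs, for which the explicit feature map is known, and shows that the kernel value equals the dot product in that feature space. A radial basis function (RBF) kernel is included as a second common similarity measure. All function names are hypothetical.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map for 2-D input matching poly2_kernel.

    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2), so phi(x) . phi(y) = (x . y)^2.
    """
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]


def rbf_kernel(x, y, length_scale=1.0):
    """RBF kernel: similarity decays with squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


x, y = [1.0, 2.0], [3.0, 4.0]
implicit = poly2_kernel(x, y)                          # never builds phi
explicit = dot(poly2_features(x), poly2_features(y))   # builds phi first
```

Both routes give the same similarity, but the kernel route never constructs the higher-dimensional vectors, which is what makes kernel methods such as SVMs practical when the feature space is very large.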
“…Support vector machines have been used to predict protein thermostability, 33,34,35,36,25,27,26,37 enzyme enantioselectivity, 38 and membrane protein expression and localization. 39 Gaussian process models combine kernel methods with Bayesian learning to produce probabilistic predictions. 40 These models rigorously capture uncertainty, which can provide principled ways to guide experimental design in optimizing protein properties.…”
Section: Choosing a Model
confidence: 99%
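
How a Gaussian process turns a kernel into a probabilistic prediction can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions (1-D inputs, an RBF kernel, zero prior mean, fixed hyperparameters, a small noise term for numerical stability); it is not the model used in any of the cited papers, and `gp_predict` and `solve` are hypothetical names.

```python
import math


def rbf(a, b, length_scale=1.0):
    """RBF kernel on scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6, length_scale=1.0):
    """Return the posterior mean and variance at x_new (zero prior mean)."""
    n = len(x_train)
    # Kernel matrix of the training inputs, with noise on the diagonal.
    K = [[rbf(x_train[i], x_train[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Similarity of the new point to each training point.
    k_star = [rbf(x, x_new, length_scale) for x in x_train]
    # mean = k_star^T K^{-1} y
    alpha = solve(K, y_train)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    # variance = k(x_new, x_new) - k_star^T K^{-1} k_star
    v = solve(K, k_star)
    var = rbf(x_new, x_new, length_scale) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)
```

Calling `gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 0.0], 1.0)` returns a mean close to the observed value with near-zero variance, while a query far from the data, such as `x_new=10.0`, falls back to the prior mean with variance close to 1. That growth in variance away from the training data is the uncertainty signal that can guide which variants to test next.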