2009
DOI: 10.1002/bit.22537

Prediction of protein solubility in Escherichia coli using logistic regression

Abstract: In this article we present a new and more accurate model for the prediction of the solubility of proteins overexpressed in the bacterium Escherichia coli. The model uses the statistical technique of logistic regression. To build this model, 32 parameters that could potentially correlate well with solubility were used. In addition, the protein database was expanded compared to those used previously. We tested several different implementations of logistic regression with varied results. The best implementation, …
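As a rough illustration of the technique the abstract names, the sketch below fits a logistic regression classifier by batch gradient descent. It is not the authors' model: the two feature columns, the toy labels, and the helper names `fit_logistic` and `predict_proba` are all assumptions made for this example, and the paper's 32 parameters and its best implementation are not reproduced here.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(soluble) minus the label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Return the modelled probability that a protein is soluble."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy, made-up data (NOT from the paper): two hypothetical scaled features
# per protein, with label 1 = soluble and 0 = insoluble.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
print(predict_proba(w, b, [0.1, 0.2]))  # probability for a low-feature protein
print(predict_proba(w, b, [0.9, 0.8]))  # probability for a high-feature protein
```

In practice a library implementation with regularization and cross-validation would normally replace the hand-written loop; the plain-Python version is used here only so that the sketch has no dependencies.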

Cited by 77 publications (64 citation statements)
References 32 publications
“…(Table 3) Cysteine fraction, hydrophobicity-related parameters, approximate charge average and fractions of amino acids. The dataset for this study contained 212 protein sequences and the accuracy being reported was 93.9% [19]. Another method was proposed by Samak et al in which dataset contained almost 1600 protein sequences and the features reported were 39 in the count.…”
Section: Prediction Of Solubility and Performance Evaluation (mentioning)
confidence: 96%
“…In previous study [30], it was found that overexpression caused conditions where even soluble proteins would form inclusion bodies because the cell became overly crowded. Accordingly, it was predicted that the inclusion body formation of fusion protein DsbA-trypsin may partly due to overexpression because under normal expression the protein could fold correctly and be soluble.…”
Section: Expression and Detection Of DsbA-trypsin (mentioning)
confidence: 98%
“…molecular weight, pI theoretical isoelectric point, −R number of negative-charged residues (Arg+Lys), +R number of positive-charged residues (Asp+Glu), EC extinction coefficient at 280 nm, Inst. II instability index, AI aliphatic index, GRAVY grand average hydropathicity approach which uses parameters such as molecular weight, amino acid fractions, aliphatic index, alpha-helix propensity, beta-sheet propensity, average pI, approximate charge average, and hydrophilicity index for prediction of solubility of recombinant protein [63]. …”
Section: Evaluation Of the Primary Structure Of Designed Constructs A (mentioning)
confidence: 99%
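Several of the sequence-derived parameters named in the statement above can be computed directly from an amino acid sequence. The sketch below is an illustration under stated assumptions, not the cited papers' code: it uses the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index, and the function names and the example sequence are hypothetical. The charge counts follow the usual biochemical convention (Asp and Glu negative, Arg and Lys positive at neutral pH), which differs from how the quoted legend pairs them.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    seq = seq.upper()
    n = len(seq)

    def pct(aa):
        return 100.0 * seq.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))

def charged_counts(seq):
    """Return (negative, positive) residue counts as (Asp+Glu, Arg+Lys)."""
    seq = seq.upper()
    negative = sum(seq.count(aa) for aa in "DE")
    positive = sum(seq.count(aa) for aa in "RK")
    return negative, positive

# Hypothetical example sequence, chosen only to exercise the functions.
example = "MKTAYIAKQR"
print(gravy(example), aliphatic_index(example), charged_counts(example))
```

Values like these could serve as input columns for a solubility classifier, although which parameters a given study actually used has to be taken from that study.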