2016
DOI: 10.1586/14789450.2016.1155986

Pan-proteomics, a concept for unifying quantitative proteome measurements when comparing closely-related bacterial strains

Abstract: The comparison of proteomes between genetically heterogeneous bacterial strains may offer valuable insights into physiological diversity and function, particularly where such variation aids in the survival and virulence of clinically-relevant strains. However, reports of such comparisons frequently fail to account for underlying genetic variance. As a consequence, the current knowledge regarding bacterial physiological diversity at the protein level may be incomplete or inaccurate. To address this, greater con…

Cited by 20 publications (17 citation statements)
References 53 publications
“…Sequences were exported from ‘.gbk files’ using software Artemis v.16.0.0 (Sanger Institute, Hinxton, Cambridgeshire, UK) (Rutherford et al., 2000). The protein sequences of all CDSs from each genome of L. lactis were pooled and clustered by CD‐HIT software (Li and Godzik, 2006), using a parameter of 100% similarity to create ‘.fasta file’, with a subset of non‐redundant sequences, retrieving a pan‐proteome database (Broadbent et al., 2016).…”
Section: D Nanouplc‐udmse Data Acquisition (mentioning)
confidence: 97%
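
The pooling and 100% similarity clustering described in this statement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: exact-sequence de-duplication stands in for CD-HIT (which at a 100% threshold may also absorb shorter sequences that match part of a longer representative), and the strain names, headers, sequences, and the helper names `parse_fasta`, `build_pan_proteome`, and `to_fasta` are hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_pan_proteome(strain_fastas):
    """Pool proteins from every strain and group entries with identical sequences.

    Returns a dict mapping each unique sequence to the list of
    'strain|header' labels that share it (insertion order is preserved).
    """
    clusters = {}
    for strain, text in strain_fastas.items():
        for header, seq in parse_fasta(text):
            seq = seq.upper().rstrip("*")  # drop a trailing stop-codon marker
            clusters.setdefault(seq, []).append(f"{strain}|{header}")
    return clusters


def to_fasta(clusters):
    """Write one representative (the first member) per unique sequence."""
    lines = []
    for seq, members in clusters.items():
        lines.append(f">{members[0]} n_members={len(members)}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Hypothetical two-strain input; real input would be the exported CDS translations.
strain_fastas = {
    "strainA": ">p1\nMKTAYIAK\n>p2\nMSLLNV\n",
    "strainB": ">q1\nMKTAYIAK\n>q2\nMTTQAPW\n",
}
clusters = build_pan_proteome(strain_fastas)
print(to_fasta(clusters), end="")
```

Keeping the member list for each representative makes it possible to trace an identified protein back to every strain that encodes it, which is the point of searching against a pooled database.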
“…Sequences were exported from '.gbk files' using software Artemis v.16.0.0 (Sanger Institute, Hinxton, Cambridgeshire, UK) (Rutherford et al, 2000). The protein sequences of all CDSs from each genome of L. lactis were pooled and clustered by CD-HIT software (Li and Godzik, 2006), using a parameter of 100% similarity to create '.fasta file', with a subset of non-redundant sequences, retrieving a panproteome database (Broadbent et al, 2016). The search conditions were: maximum allowed missed cleavages by trypsin be up to 1; maximum protein mass = 600 kDa, modifications by carbamidomethyl of cysteine (C) (fixed), acetyl N-terminal (variable), and oxidation of methionine (variable); peptide tolerance of 10 ppm, fragment mass error tolerance of 20 ppm and a default maximum false discovery rate (FDR) value of 4%.…”
Section: Processing Of Mass Spectral Data (mentioning)
confidence: 99%
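
The ppm tolerances quoted in this statement (10 ppm for peptides, 20 ppm for fragments) are relative mass errors. The following sketch shows how such a tolerance check is commonly computed; it is not taken from the cited software, and the function names and the m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True when the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Tolerances as quoted in the statement above.
PEPTIDE_TOL_PPM = 10
FRAGMENT_TOL_PPM = 20

# Hypothetical measurement: about 15 ppm away from the theoretical value.
theoretical = 1000.0
observed = 1000.015

print(within_tolerance(observed, theoretical, PEPTIDE_TOL_PPM))
print(within_tolerance(observed, theoretical, FRAGMENT_TOL_PPM))
```

With these illustrative values the match falls outside the tighter peptide window but inside the wider fragment window.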
“…In this pilot study, we have presented two persuasive applications of lipogram decomposition: the analysis of UniRef50 and the segregation of proteomes. From this it is clear, for collections of protein sequences at the level of the proteome, pan-proteome [18], and above, that the lipogram and the lipogram decomposition provides an interesting, and potentially extremely useful, linguistic construct that adds an additional layer to conventional protein sequence analysis, opening up unprecedented avenues for future exploration.…”
Section: Results (mentioning)
confidence: 99%
“…Choosing any arbitrary set of sequences, we can explore how the constituent sequences of that set distribute into the available lipogram dimensions. Such a set could be a proteome, a pan-proteome [18], a protein family or structural superfamily [19,20], comprising orthologues and paralogues from many … In what follows, we use the lipogram decomposition in combination with other sequence properties, such as sequence complexity [16,17], to produce a multivariate data structure around which we can build more complex and more predictive analysis of sequence sets. Table 1: The Lipogram Decomposition. A protein may lack a single residue (alanine, tryptophan, or any of the other twenty) and this sequence will have a lipogram dimension of 19.…”
Section: Lipogram Terminology and Guided Walk (mentioning)
confidence: 99%
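
Reading the statement above literally, a sequence's lipogram dimension is the number of the twenty standard amino acids that occur in it, so a protein lacking exactly one residue type has dimension 19. The sketch below implements that reading; the definition is inferred from the quoted sentence alone and may differ in detail from the citing paper, and the function names and sequences are hypothetical.

```python
from collections import defaultdict

# One-letter codes of the twenty standard amino acids.
STANDARD_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def lipogram_dimension(sequence):
    """Count how many of the twenty standard residue types occur in the sequence."""
    return len(STANDARD_RESIDUES & set(sequence.upper()))


def missing_residues(sequence):
    """List the standard residue types absent from the sequence."""
    return sorted(STANDARD_RESIDUES - set(sequence.upper()))


def decompose(sequences):
    """Group sequence names by lipogram dimension."""
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[lipogram_dimension(seq)].append(name)
    return dict(groups)


# Hypothetical sequences: one uses all twenty residues, one lacks tryptophan (W).
sequences = {
    "full": "ACDEFGHIKLMNPQRSTVWY",
    "no_trp": "ACDEFGHIKLMNPQRSTVY",
}
groups = decompose(sequences)
print(groups)
print(missing_residues(sequences["no_trp"]))
```

Applied to a proteome or pan-proteome, `decompose` gives the distribution of sequences across dimensions that the statement describes.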
“…better detection of phenotypically related bacteria based on their expressed protein content or more effective searching of conserved genes, orthologue genes and pseudogenes. This leads to more accurate estimation of core or pan genome for diversification of closely-related pathogenic bacteria [39], [40], [41].…”
Section: Introduction (mentioning)
confidence: 99%