2020
DOI: 10.7717/peerj.9852

ProminTools: shedding light on proteins of unknown function in biomineralization with user friendly tools illustrated using mollusc shell matrix protein sequences

Abstract: Biominerals are crucial to the fitness of many organisms, and studies of the mechanisms of biomineralization are driving research into novel materials. Biomineralization is generally controlled by a matrix of organic molecules including proteins, so proteomic studies of biominerals are important for understanding biomineralization mechanisms. Many such studies identify large numbers of proteins of unknown function, which are often of low sequence complexity and biased in their amino acid composition. A lack of u…

Citations: cited by 7 publications (14 citation statements)
References: 36 publications (53 reference statements)
“…Seven of the 92 SSPs, all from T. pseudonana, had previously been identified at the protein level: Tp_p150 (Davis et al, 2005), TpSil1 and TpSil3 (Poulsen & Kröger, 2004), TpSil4 (Sumper & Brunner, 2008), and SiMat3, SiMat4, and SiMat7 (Silacanin-1, Sin1) (Kotzsch et al, 2016). The 85 novel proteins were distinguished by an acronym for the species of origin and a numeric identifier (e.g.…”
Section: Previously Characterized Proteins and Known Domains
mentioning
confidence: 99%
“…It has previously been noted that biomineral-associated proteins tend to be biased in amino acid composition and intrinsically disordered (Evans, 2019). However, such qualitative observations have only rarely been quantitatively analyzed or put into perspective to the whole proteomes of the biomineralizing organisms (Skeffington & Donath, 2020). Here we show that the SSPs differ as a group from their respective background proteomes in (i) being significantly more biased in amino acid composition, (ii) exhibiting significantly lower sequence complexity, and (iii) being predicted to be intrinsically disordered over a larger proportion of their lengths (see Fig.…”
Section: Discussion
mentioning
confidence: 99%
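The two sequence properties named in the statement above, biased amino acid composition and low sequence complexity, can be illustrated with a minimal sketch. This is not the Seg or fLPS method used by ProminTools; it simply uses the fraction of the most common residue and the Shannon entropy of the residue frequencies as crude stand-ins. The function names and both example sequences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def composition(seq):
    """Return the fraction of each residue in the sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def shannon_entropy(seq):
    """Shannon entropy in bits per residue (lower means less complex)."""
    return -sum(p * math.log2(p) for p in composition(seq).values())


def dominant_residue(seq):
    """Return the most common residue and its fraction of the sequence."""
    fractions = composition(seq)
    aa = max(fractions, key=fractions.get)
    return aa, fractions[aa]


# Hypothetical sequences (assumptions, not taken from the cited studies).
serine_rich = "SSSSDSSDSSSDSSSS"
mixed = "MKTAYIAKQRQISFVK"

for name, seq in [("serine_rich", serine_rich), ("mixed", mixed)]:
    aa, frac = dominant_residue(seq)
    print(name, aa, round(frac, 2), round(shannon_entropy(seq), 2))
```

A comparison against a background proteome, as in the cited study, would apply the same measures to every protein in the proteome and test whether the biomineral-associated group differs from that distribution.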
“…The properties of the protein sequences were investigated using ProminTools (Skeffington & Donath, 2020), which relies on various software tools: low complexity regions were identified using Seg (Wootton & Federhen, 1993) with default parameters; predicted intrinsic disorder was calculated using VSL2 (Peng et al, 2006); biases in amino acid content were identified using the fLPS software (Harrison, 2017) with a p-value cut-off of 10⁻⁶ and bias quantified as described in Skeffington and Donath, 2020. Motif finding was carried out using Motif-x (Chou & Schwartz, 2011) via ProteinMotifFinder (Skeffington & Donath, 2020). Clustering based on motif content was based on the Ward.D method, and a distance matrix based on the Distance Correlation (Székely & Rizzo, 2014) measure (further details in Methods S2).…”
Section: Bioinformatic Methods
mentioning
confidence: 99%
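The distance measure mentioned in the statement above can be sketched as follows. This is a minimal implementation of the standard sample distance correlation between two numeric vectors, converted to a dissimilarity as 1 minus the correlation. The exact variant, the input representation (assumed here to be per-protein motif count vectors), and the conversion to a distance are assumptions; the cited Methods S2 is not reproduced here. The Ward.D clustering step itself is not shown.

```python
import math


def _centered_distances(v):
    """Pairwise absolute distances, double-centered by row, column and grand mean."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(r) / n for r in d]  # matrix is symmetric, so column means are equal
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a = _centered_distances(x)
    b = _centered_distances(y)
    n2 = len(x) ** 2
    dcov2 = sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n2
    dvar_x = sum(p * p for ra in a for p in ra) / n2
    dvar_y = sum(q * q for rb in b for q in rb) / n2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # at least one vector is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def distance_matrix(vectors):
    """Dissimilarity matrix using 1 - distance correlation (an assumed conversion)."""
    return [[1.0 - distance_correlation(u, v) for v in vectors] for u in vectors]


# Hypothetical motif count vectors for three proteins (illustration only).
motif_counts = [[4, 0, 1, 7, 2], [8, 0, 2, 14, 4], [0, 5, 3, 1, 6]]
for row in distance_matrix(motif_counts):
    print([round(value, 3) for value in row])
```

A matrix of this shape could then be passed to a hierarchical clustering routine; note that implementations differ in how they define Ward linkage, so results may not match R's `ward.D` option exactly.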
“…Here we set out to identify proteins potentially involved in calcification in E. huxleyi with a suite of proteomics experiments, allowing high confidence identifications based on behaviors in orthogonal datasets. The fact that many biomineral associated proteins have low sequence complexity, with biased composition and repetitive elements [34][35][36], means that their genes also tend to have unusual sequence properties. These pose a challenge to genome annotation, and can lead to incorrect or missing gene models, and limit the discovery space of studies relying on that data.…”
mentioning
confidence: 99%