2020
DOI: 10.1371/journal.pone.0239154

Assessing the low complexity of protein sequences via the low complexity triangle

Abstract: Background: Proteins with low complexity regions (LCRs) have atypical sequence and structural features. Their amino acid composition varies from the expected, determined proteome-wise, and they do not follow the rules of structural folding that prevail in globular regions. One way to characterize these regions is by assessing the repeatability of a sequence, that is, calculating the local propensity of a region to be part of a repeat. Results: We combine two local measures of low complexity, repeatability (usi…

Citations: cited by 7 publications (7 citation statements)
References: 38 publications
“…These LCRs overlap in their definitions [6]. For example, CCs often have reduced types of amino acids and have repetitive properties that make them similar to compositionally biased regions [20]. Both CBRs and homorepeats can be of 20 types, one for each amino acid, while CCs and IDRs are not specifically associated to any amino acid.…”
Section: Prediction of LCRs in Htt-interactors (mentioning)
confidence: 99%
“…However, another prominent phenomenon, exhibiting rates up to 100,000 times higher than point mutations, is the expansion and contractions of low-complexity regions (LCR) (4-6). In the amino acid sequence context, the regions of amino acids (single or multiple) characterized by a reduced diversity of residues below the average unbiased composition are termed low-complexity regions (LCR) (7-9). While LCR within coding regions were previously conjectured to be analogous to "junk" DNA sequences, it is now firmly established that they play pivotal functions in numerous essential biological processes, including recombination (10), protein kinases (11), transcription regulation (12,13), as well as neurogenesis, DNA repair (10), cell adhesion, development, cognition, and immunological responses (14-16), among others.…”
Section: Introduction (mentioning)
confidence: 99%
“…Taxonomic dependence yields species-specific amino acid frequencies in the proteome. The regions containing fewer amino acid types than expected, defined by a set of compositionally biased residues and having lower amino acid residues diversity that differs from proteins' average amino acid composition, are known as low-complexity regions (LCRs) (Golding 1999; Radó-Trilla and Albà 2012; Peng et al 2015; Mier and Andrade-Navarro 2020; Lee et al 2022). LCRs exhibit a range of amino acid compositions ranging from an aperiodic accumulation of limited amino acids to stretches consisting of a single amino acid residue (homopeptide repeat) (Haerty and Brian Golding 2010; Radó-Trilla and Albà 2012; Chaudhry et al 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…The frequency of repeat-containing proteins is much higher in mammals compared to birds, fishes, or amphibians (Faux et al 2005). LCRs have a lower propensity to form structured domains and are much less evolutionarily conserved than globular domains, which are the most common protein domains (Mier and Andrade-Navarro 2020; Kastano et al 2021). Even though the LCRs vary in composition/purity of the stretch, studies are primarily focused on single amino acid (homopeptide) repeats, probably owing to their ease of search in the protein datasets (Albà and Guigó 2004; Faux et al 2005; Radó-Trilla and Albà 2012).…”
Section: Introduction (mentioning)
confidence: 99%
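
The last statement above notes that homopeptide repeats are comparatively easy to search for in protein datasets. A minimal Python sketch of such a search follows, using a regular-expression backreference to find runs of a single residue. The function name `find_homorepeats`, the default threshold `min_len=5`, and the toy sequence are illustrative assumptions; none of them come from the cited papers, and this is not the repeatability measure described in the abstract.

```python
import re


def find_homorepeats(sequence, min_len=5):
    """Return (residue, start, end) for each run of a single residue
    that is at least min_len long. Coordinates are 0-based, end-exclusive.

    min_len is an assumed, illustrative threshold, not a published one.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), m.end())
        for m in pattern.finditer(sequence.upper())
    ]


# Hypothetical toy sequence containing a polyglutamine run.
seq = "MATLEKLMKAFESLKSFQQQQQQQQPPPLAG"
print(find_homorepeats(seq))             # runs of 5 or more identical residues
print(find_homorepeats(seq, min_len=3))  # a lower threshold also reports the PPP run
```

Regions that are compositionally biased but not perfectly repetitive would be missed by this pattern, which is consistent with the statement's point that impure LCRs are harder to search for than homopeptide repeats.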