2016
DOI: 10.7287/peerj.preprints.2390
Preprint

Prediction of amyloidogenicity based on the n-gram analysis

Abstract: Amyloids are proteins associated with a number of clinical disorders (e.g., Alzheimer's, Creutzfeldt-Jakob's and Huntington's diseases). Despite their diversity, all amyloid proteins can undergo aggregation initiated by 6- to 15-residue segments called hot spots. To find the patterns defining the hot spots, we trained predictors of amyloidogenicity, using n-grams and random forest classifiers, based on data collected in the AmyLoad database. Only the most informative n-grams, selected by our Quick Permutation …
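
As a rough illustration of the n-gram features mentioned in the abstract, the sketch below counts contiguous n-grams in peptide sequences. The function name `count_ngrams`, the example sequences and the choice of n = 2 are assumptions made for illustration only; the preprint's own feature encoding, its n-gram selection step and the random forest training are not reproduced here.

```python
from collections import Counter


def count_ngrams(sequence: str, n: int) -> Counter:
    """Count contiguous n-grams (overlapping windows of length n) in a sequence.

    Hypothetical helper for illustration; not the preprint's implementation.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n over the sequence; sequences shorter than n
    # yield an empty Counter.
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


# Arbitrary example sequences (assumed, not taken from AmyLoad).
peptides = ["KLVFFA", "GGVVIA"]
for peptide in peptides:
    print(peptide, dict(count_ngrams(peptide, 2)))
```

Counts of this kind could serve as input features for a classifier such as a random forest, which is the general approach the abstract describes.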

Citations: cited by 2 publications (3 citation statements)
References: 24 publications (32 reference statements)
“…It is not limited to the commercially available platforms and can also be used with experimental systems by importing data through the universal REDF format, which follows the IETF RFC 4180 standard. dpcReport provides users with advanced tools for data quality control and it incorporates statistical tests for comparing multiple reactions in an experiment (Burdukiewicz et al 2016), currently absent in many dPCR-related software tools. dpcReport provides users with advanced tools for data quality control.…”
Section: Integrated Analysis Of Digital PCR Experiments In R (mentioning)
confidence: 99%
“…A more complex approach consists on combining multiple quantitative properties into multi-dimensional sequence descriptors, as implemented in a Python package propy [4]. Quantitative properties of amino acids can also be used to generate reduced alphabets for generative and discriminative models of proteins [5, 6].…”
Section: Introduction (mentioning)
confidence: 99%
“…Interestingly, distribution of amino acid tuples can often be approximated with the power-law distribution (the Zipf’s law) [9]. Most recently, n-gram-based random forests were successfully applied for accurate discrimination between amyloidogenic and non-amyloidogenic peptides [6]. Several tools for analysis of n-grams in proteins were made available, e.g.…”
Section: Introduction (mentioning)
confidence: 99%