2015
DOI: 10.1186/s12859-015-0586-0

ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins

Abstract: Background: The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient a…

Citations: cited by 65 publications (86 citation statements)
References: 55 publications (56 reference statements)
“…These properties are usually selected to provide a quantitative basis for an intuitive picture of the physical chemistry of amino acids. Subsequent analysis is carried out using either a detailed description of the sequences of interest, which involves a consideration of local sequence characteristics, or a set of sequence-averaged property values, which involves discussion of global sequence characteristics (1,2).…”
mentioning
confidence: 99%
“…ProtDCal is a computational package for encoding the sequences and structures of proteins into numerical descriptors. These descriptors are the input to machine‐learning techniques (artificial neural networks, support vector machine, and random forest, among others) used for the development of novel predictors of protein functions and properties.…”
Section: Results
mentioning
confidence: 99%
“…In this context, ProtDCal is a software package that transforms protein sequences or 3D‐structures into general‐purpose numerical descriptors, accounting for both global and local information. Due to its complementary performance with respect to other well‐established tools in the field like PROFEAT and PseAcc (later extended to Pse‐in‐one), ProtDCal has been used in a number of studies.…”
Section: Introduction
mentioning
confidence: 99%
“…However, there is high degree of sequence similarity, especially in the Fc region, and this would mean that appropriate techniques such as benchmarking would have to be incorporated to select relevant descriptor sets (van Westen et al 2013a, b). Descriptor for proteins molecules can be generated by different software such as PseAAC, Protein Recon, PROFEAT and ProtDCal, of which ProtDCal, a freely available tool with a friendly graphical user interface, has the capacity to generate a higher number of non-redundant of molecular descriptors for proteins from FASTA or PDB files (Ruiz-Blanco et al 2015). Another possible concern is that primary sequence-based descriptors do not take into account neither interactions between amino acid residues nor the antibody-antigen and antibody-receptor interaction space.…”
Section: Discussion: Status Quo and Scope For Mab-based Application
mentioning
confidence: 99%