2018
DOI: 10.1093/bioinformatics/bty140

iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences

Abstract: Summary: Structural and physiochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding sc…

Cited by 535 publications (394 citation statements)
References 38 publications (33 reference statements)
“…Accordingly, it remains a significant challenge in the future to identify further useful encoding schemes. There is much promise in this aspect from the availability of some recently developed powerful toolkits and Web servers for extracting a wide range of features, including Pse-Analysis [87], Bio-Seq Analysis [85], Pse-in-One [91], repDNA [81] and iFeature [134]. These toolkits could enable us to consider a much greater combination of different types of feature encoding schemes and explore the possibility of evolving iProt-Sub to a more robust framework while preserving or enhancing its model accuracy.…”
Section: Limitations and Future Work (mentioning)
confidence: 99%
“…Enhanced grouped amino acid composition (EGAAC). EGAAC was first proposed by Chen et al (Chen, et al, 2018) and is the improved version of GAAC features. GAAC divides 20 standard amino acids into five groups based on their physical and chemical properties.…”
Section: Composition-transition-distribution (CTD) (mentioning)
confidence: 99%
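The grouped composition described in the statement above can be sketched as follows. This is a minimal illustration, not iFeature's own code: the quoted text says only that the 20 standard amino acids are split into five groups, so the group names and members used here (a common aliphatic / aromatic / positive / negative / uncharged split) are an assumption, and the function name `gaac` is hypothetical.

```python
from collections import Counter

# ASSUMPTION: this five-way grouping is not given in the quoted statement.
# It is a commonly used physicochemical split covering all 20 residues.
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}


def gaac(sequence):
    """Return the fraction of residues falling into each group.

    Characters outside the 20 standard amino acids are ignored.
    """
    counts = Counter(sequence.upper())
    per_group = {
        name: sum(counts[aa] for aa in members)
        for name, members in GROUPS.items()
    }
    total = sum(per_group.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {name: n / total for name, n in per_group.items()}


features = gaac("GAVFYKDE")
```

Each sequence maps to a five-dimensional vector whose entries sum to 1, regardless of sequence length.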
“…At present, PAAC has shown good properties in proteomics field (Cui, et al, 2019; Qiu, et al, 2018; Yu, et al, 2017a; Yu, et al, 2017b; Yu, et al, 2017c). Auto includes Moran, and Geary (Chen, Zhang, Ma & Yu, 2019; Chen, et al, 2018). It represents the physicochemical, position information, and the seven physicochemical properties in Auto can be obtained in Supplementary Table S1.…”
Section: Physicochemical Information (mentioning)
confidence: 99%
“…In CTD (Chen, et al, 2018), amino acids are grouped into three groups based on hydrophobicity: polar (P), neutral (N), and hydrophobic (H). Using ( […] The T descriptor first converts the original sequence into a replacement sequence, and T includes three characteristics, the dipeptide composition frequency from the polar group to the neutral group and the composition frequency from the neutral group to the polar group.…”
Section: -Dimensional 2-gram Features the Dimension Of MMI Is 119 (mentioning)
confidence: 99%
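The transition (T) descriptor described in the statement above can be sketched as follows. This is a hedged illustration rather than the cited authors' code: the residue-to-group assignment is an assumption (a widely used hydrophobicity grouping; the quoted text names only the three classes), and `ctd_transition` is a hypothetical function name. The sequence is first recoded into the letters P, N and H, and each of the three features is the fraction of adjacent positions that switch between a given pair of groups, in either direction.

```python
from itertools import combinations

# ASSUMPTION: group membership is not given in the quoted statement.
HYDROPHOBICITY = {
    "P": "RKEDQN",   # polar
    "N": "GASTPHY",  # neutral
    "H": "CLVIMFW",  # hydrophobic
}
AA_TO_GROUP = {
    aa: group for group, members in HYDROPHOBICITY.items() for aa in members
}


def ctd_transition(sequence):
    """Return the three transition frequencies (PN, PH, NH)."""
    # Step 1: replace each residue by its group letter, skipping unknowns.
    encoded = [AA_TO_GROUP[aa] for aa in sequence.upper() if aa in AA_TO_GROUP]
    if len(encoded) < 2:
        raise ValueError("need at least two standard residues")
    # Step 2: count adjacent pairs that cross between two different groups.
    pairs = list(zip(encoded, encoded[1:]))
    result = {}
    for a, b in combinations("PNH", 2):
        crossings = sum(1 for x, y in pairs if {x, y} == {a, b})
        result[a + b] = crossings / len(pairs)
    return result


# "RGRLL" recodes to P N P H H, giving adjacent pairs PN, NP, PH, HH.
transitions = ctd_transition("RGRLL")
```

Counting both directions together (P to N and N to P) gives one value per pair of groups, hence three features for three groups.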