“…The residue-level information includes: (a) a single-valued amino acid type (all the information necessary for the correct folding of a protein is encoded in its amino acid sequence [26]); (b) seven physicochemical properties of the amino acid (different types of disordered regions in proteins, short or long, are found to have distinctive physicochemical properties); (c) twenty PSSM (position-specific scoring matrix) values indicating the evolutionary information accumulated at each residue position of a protein sequence; (d) three predicted secondary structure (helix, strand and coil) probabilities from SPINE-X [27], one predicted accessible surface area (ASA) normalized by the ASA of an extended conformation (Ala-X-Ala) [28], and two predicted backbone torsion angle (phi, psi) fluctuations [29], since disordered residues are characterized by a lack of stable secondary structure [30], a highly exposed surface area, and angle fluctuations; (e) one monogram and twenty bigrams computed from the PSSM [31], representing the conserved evolutionary information of the PSSM transformed from the primary-structure level to the three-dimensional-structure level, which are normalized by the median of the normal density distribution of monogram and bigram values in their logarithmic space; (f) one indicator for terminal residues (the five residues from the N-terminal as {−1.0, −0.8, −0.6, −0.4, −0.2} and the five residues from the C-terminal as {+1.0, +0.8, +0.6, +0.4, +0.2}, respectively, with the rest as 0.0). Together these give 1 + 7 + 20 + 6 + 21 + 1 = 56 features per residue. Finally, before the features are fed into the classifier, the neighboring residues' information is aggregated using a sliding window of 21 residues (10 residues on each side of the residue to be predicted), resulting in 21 × 56 = 1176 features per residue.…”
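
The terminal indicator in (f) and the final 21-residue windowing step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `terminal_indicator` and `window_features` are hypothetical, and several details are assumptions because the passage does not state them: window positions that fall beyond either end of the sequence are zero-padded; the listed indicator values are assigned starting from the outermost residue, so the first residue gets −1.0 and the last gets +1.0; and for sequences shorter than ten residues the C-terminal values overwrite the N-terminal ones.

```python
import numpy as np


def terminal_indicator(n_residues):
    """Terminal-residue indicator, feature (f) in the passage.

    Assumption: values are assigned from the outermost residue inward,
    so residue 0 gets -1.0 and the last residue gets +1.0.
    """
    indicator = np.zeros(n_residues)
    ramp = np.array([1.0, 0.8, 0.6, 0.4, 0.2])
    k = min(5, n_residues)
    indicator[:k] = -ramp[:k]                      # N-terminal: -1.0 ... -0.2
    indicator[n_residues - k:] = ramp[:k][::-1]    # C-terminal: ... +0.8, +1.0
    return indicator


def window_features(per_residue, half_width=10):
    """Concatenate each residue's features with those of its neighbours.

    per_residue: array of shape (n_residues, n_features), e.g. (n, 56).
    Returns an array of shape (n_residues, (2 * half_width + 1) * n_features).
    Assumption: positions outside the sequence are filled with zeros.
    """
    n_residues, _ = per_residue.shape
    width = 2 * half_width + 1
    # Pad only along the residue axis; np.pad fills with zeros by default.
    padded = np.pad(per_residue, ((half_width, half_width), (0, 0)))
    rows = [padded[i:i + width].ravel() for i in range(n_residues)]
    return np.stack(rows)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(30, 56))        # placeholder per-residue features
    features[:, -1] = terminal_indicator(30)    # last column used here for (f)
    windowed = window_features(features)
    print(windowed.shape)
```

With 56 features per residue and `half_width=10`, each output row has 21 × 56 = 1176 entries, and the residue being predicted occupies the middle block of 56 values in its row.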