2005
DOI: 10.1007/11532323_20

Sequence Motif Identification and Protein Family Classification Using Probabilistic Trees

Abstract: Efficient family classification of newly discovered protein sequences is a central problem in bioinformatics. We present a new algorithm, using Probabilistic Suffix Trees, which identifies equivalences between the amino acids in different positions of a motif for each family. We also show that better classification can be achieved by identifying representative fingerprints in the amino acid chains.

Citations: cited by 6 publications (5 citation statements)
References: 6 publications
“…The statistical significance of motif prediction was correlated with biological significance, indicating a valuable reason for the motifs analysis as also reported earlier by Zhang et al, (2009). Motifs information can be used for developing resistance genes and makers, to perform clustering (Broin et al, 2015), gene expression analysis study (Jensen et al, 2005, Huber and Bulyk, 2006) and discovery of homology relations (Stewart, 2016), family classification (Blekas et al, 2005; Eser et al, 2013), discovery of sub-families in large protein families (Leonardi and Galves, 2005) and new signalling pathways (Ma et al, 2013).…”
Section: Cis-regulatory Elements (supporting)
confidence: 63%
“…Conserved motifs can provide evidence for further classification into different subgroups as it is possible that proteins within a subgroup, which share identical motifs, are likely to exhibit similar functions. Motifs information can be used to develop resistance genes and markers, perform clustering (Broin et al, 2015), assess the gene expression (Jensen et al, 2005; Huber and Bulyk, 2006), discover the homology relations, classify the families (Blekas et al, 2005; Jensen et al, 2005), discover the sub-families in large protein families (Leonardi and Galves, 2005) and identify new signaling pathways (Ma et al, 2013).…”
Section: Results (mentioning)
confidence: 99%
“…Recently, it was proposed an algorithm to estimate the set of sparse contexts and the transition probabilities given by 2.2 (Leonardi and Galves, 2005; Leonardi, 2006). This algorithm represents internally the set of sparse contexts as a tree, as described above.…”
Section: The Elements In P J (mentioning)
confidence: 99%
“…In this work we propose to use the framework of Sparse Probabilistic Suffix Trees (SPST) to analyze the similarity between sequences and to infer the evolution of protein families. SPST was first introduced in Leonardi and Galves (2005) as a generalization of the PST algorithm, proposed in Ron et al (1996). SPST has shown to be useful in protein modeling and classification, performing better than the PST algorithm (Leonardi, 2006).…”
Section: Introduction (mentioning)
confidence: 99%