2009
DOI: 10.3390/a2020692

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors

Abstract: This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method uses string representations of a macromolecular structure and a hash table that stores n-grams of a certain size for searching. To this end, macromolecular structure-to-string translators were implemented for protein and RNA structures. A query against the index is performed in two hierarchical steps to unite speed and precision. In the first step the query structure is translated i…
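The indexing idea in the abstract (string representations stored as n-grams in a hash table) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-letter H/E/L alphabet, and the ranking by number of shared n-grams are assumptions made for illustration, and only a hash-lookup stage is shown because the abstract is truncated before it describes the second step.

```python
from collections import defaultdict


def ngrams(s, n):
    """Return all overlapping substrings of length n, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def build_index(structures, n):
    """Map each n-gram to the set of structure ids whose string contains it."""
    index = defaultdict(set)
    for sid, s in structures.items():
        for g in ngrams(s, n):
            index[g].add(sid)
    return index


def query(index, q, n):
    """Rank indexed structures by the number of distinct n-grams shared with q."""
    hits = defaultdict(int)
    for g in set(ngrams(q, n)):
        # .get() avoids creating empty entries in the defaultdict index.
        for sid in index.get(g, ()):
            hits[sid] += 1
    # Highest count first; ties broken by id for a deterministic order.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical structure strings over an assumed H/E/L alphabet.
structures = {"A": "HHHHEEEELLHH", "B": "EEEELLLLEEEE", "C": "HHHHLLHHHH"}
index = build_index(structures, 4)
ranked = query(index, "HHHEEEEL", 4)
print(ranked)
```

A lookup of this kind touches only the hash buckets of the query's own n-grams, so structures that share no n-gram with the query (here "C") are never visited.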

Citations: cited by 24 publications (26 citation statements)
References: 46 publications (55 reference statements)
“…Amongst the published methods, only LaJolla (Bauer et al, 2009) could be locally installed for comparison. An all against all comparison was attempted.…”
Section: Results; type: mentioning
confidence: 99%
“…For every alphabet, the inner products of the normalized N-GRAM vectors were used to generate similarity measure matrices. This approach provided a parameter independent classification analysis based on a text modeling technique [1,35].…”
Section: N-gram; type: mentioning
confidence: 99%
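The similarity measure in the statement above (inner products of normalized n-gram vectors) can be sketched as follows. The sketch assumes that "normalized" means scaling each n-gram count vector to unit Euclidean length, which makes the inner product a cosine similarity; the citing paper's exact normalization is not given here, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(s, n):
    """Count the overlapping n-grams of s (a sparse n-gram vector)."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Inner product of two count vectors after L2 normalization (assumed)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a string shorter than n has no n-grams
    return dot / (norm_a * norm_b)


def similarity_matrix(seqs, n):
    """Pairwise similarity matrix over a list of encoded sequences."""
    vecs = [ngram_counts(s, n) for s in seqs]
    return [[cosine(u, v) for v in vecs] for u in vecs]


# Hypothetical encoded sequences, for illustration only.
sim = similarity_matrix(["ABAB", "ABAB", "CCCC"], 2)
```

Identical sequences score close to 1, and sequences with no n-gram in common score 0.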
“…secondary structure elements (SSEs) [28], or statistically derived geometrical local descriptions, such as overlapping fragment units of identical length [4,1,6,7,13,33,37,40,47,51,8]. Such fragment descriptions, or structural alphabets, offer an important technical advantage: they allow one to recode a protein 3D structure into a sequence of characters that can be compared through fast sequence methods.…”
Section: Introduction; type: mentioning
confidence: 99%
“…SARA uses a set of unit vectors derived from consecutive nucleotides to represent each nucleotide, which can be compared with other nucleotide using unit-vector root mean square (URMS) as distance (1,18). LaJolla uses an n-gram model to analyze sequences derived from nucleotide torsion angles (4). Similarly, PRIMOS/AMIGOS (5) and DIAL (6) also represent nucleotides with torsion angles and align the sequences encoded by the torsion angle representation.…”
Section: Introduction; type: mentioning
confidence: 99%
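The statements above describe recoding a structure as a character sequence, for example from nucleotide torsion angles. One way this could look is sketched below, purely as an illustration: the uniform 30-degree binning, the use of a single angle per residue, and the letter assignment are all assumptions, not the encoding used by LaJolla or by any of the other tools mentioned.

```python
import string


def angle_to_letter(angle_deg, bins=12):
    """Map an angle in degrees to a letter via uniform bins (assumed scheme).

    Supports at most 26 bins, one per uppercase letter.
    """
    width = 360.0 / bins
    idx = int((angle_deg % 360.0) // width)
    # Guard against floating-point results that land exactly on 360.0.
    idx = min(idx, bins - 1)
    return string.ascii_uppercase[idx]


def encode(angles, bins=12):
    """Translate a list of torsion angles into a string descriptor."""
    return "".join(angle_to_letter(a, bins) for a in angles)


# Hypothetical torsion angles in degrees, one per residue.
descriptor = encode([-60, 0, 59.9, 180])
print(descriptor)
```

Once a structure is a string, it can be compared with ordinary sequence methods such as n-gram lookups.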