2010
DOI: 10.1016/j.compbiolchem.2010.03.007

An efficient similarity search based on indexing in large DNA databases

Cited by 7 publications (5 citation statements)
References 8 publications
“…Literature [4] studies the hash index structure of the one-way hash function and the index retrieval method to search for specific fragments and similar sequences. Literature [5] also proposed a novel solution for searching for specific DNA sequences. For the construction of the hash index structure, in DNA sequence matching [6], the commonly used fixed sequences are stored in the DNA database, and the similarity is used to evaluate whether the sequences are matched successfully.…”
Section: Introduction (mentioning)
confidence: 99%
“…Increasing N can increase the amount of information stored in the vectors, but it also increases the computational cost of generating the vectors and calculating the similarity. According to a suggestion in [1], and based on our own experiments with N=1,2,3,4, we have found that N=2 gives good results for comparisons and is also computationally efficient. The pseudocode for Algorithm 1 outlines the steps to transform an input DNA subsequence into a 48-dimensional numerical vector based on formulas (1), (2), (3), and (4).…”
Section: N-grams Selection (mentioning)
confidence: 99%
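The trade-off in this statement can be illustrated with a plain N-gram count vector. The following is a minimal sketch, assuming the alphabet A, C, G, T and overlapping N-grams. The function name `ngram_counts` is hypothetical, and this is not the 48-dimensional transform of the citing paper's Algorithm 1, whose formulas (1) to (4) are not reproduced in the statement.

```python
from itertools import product


def ngram_counts(seq, n=2, alphabet="ACGT"):
    """Count overlapping n-grams of seq (hypothetical helper).

    The vector has len(alphabet) ** n entries, ordered as the
    n-grams are generated by itertools.product (AA, AC, AG, ...).
    """
    grams = ["".join(p) for p in product(alphabet, repeat=n)]
    index = {g: i for i, g in enumerate(grams)}
    vec = [0] * len(grams)
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in index:  # skip n-grams with symbols outside the alphabet
            vec[index[gram]] += 1
    return vec
```

With a four-letter alphabet the vector length is 4^N, so it grows from 16 entries at N=2 to 256 at N=4, which is one way to see the cost increase the statement mentions.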
“…The similarity value of two vectors is determined by the distance between the two vectors. To calculate this distance, the algorithm presented in [1] is employed. This algorithm calculates the distance between two vectors by finding the maximum number of operations required to transform vector u into vector v. Algorithm 6 calculates these values for each pair (u,v) and stores the results in the variables posDis and negDis.…”
Section: B. The Combine Algorithm Transforms DNA Sequences Into Vectors (mentioning)
confidence: 99%
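The statement does not spell out how posDis and negDis are computed. One common reading, assumed here, is a frequency-distance style measure: posDis sums the amounts by which components of u exceed those of v, negDis sums the amounts by which they fall short, and the distance is the larger of the two. The sketch below follows that assumption; the function name `frequency_distance` is hypothetical and the code is not taken from the cited paper.

```python
def frequency_distance(u, v):
    """Distance between two count vectors under the assumed definition.

    pos_dis: total surplus of u over v (components where u > v).
    neg_dis: total deficit of u against v (components where u < v).
    The distance is the larger of the two totals.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    pos_dis = sum(a - b for a, b in zip(u, v) if a > b)
    neg_dis = sum(b - a for a, b in zip(u, v) if b > a)
    return max(pos_dis, neg_dis)
```

For example, with u = [3, 1, 0] and v = [1, 1, 3], the surplus is 2 and the deficit is 3, so the function returns 3.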
“…These databases serve as valuable resources for numerous essential bioinformatics tasks, such as DNA similarity search [1], sequence alignments [2], gene annotation [3], [4], gene prediction [5], [6], and motif finding [7], [8]. However, as these databases store vast volumes of sequences, performing these bioinformatics tasks is becoming increasingly challenging and complex.…”
Section: Introduction (mentioning)
confidence: 99%