2016
DOI: 10.1016/j.tcs.2015.11.026

LCSk: A refined similarity measure

Abstract: In this paper we define a new similarity measure, LCSk, which seeks the maximal number of length-k substrings that match in both input strings while preserving their order of appearance; the traditional LCS is the special case k = 1. We examine this generalization in both theory and practice. We first describe its basic solution and give experimental evidence on real data of its ability to differentiate between sequences that are considered similar according to the LCS measure. We then e…
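
The definition in the abstract can be illustrated with a short dynamic-programming sketch. This is not taken from the paper: it is a minimal illustration that assumes the matched length-k substrings must be non-overlapping in both strings, and the function name `lcsk` and its table layout are hypothetical. With k = 1 it reduces to the classic LCS recurrence.

```python
def lcsk(a: str, b: str, k: int) -> int:
    """Illustrative sketch: maximal number of non-overlapping length-k
    substrings that match in a and b in the same order (assumed definition)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    n, m = len(a), len(b)
    # suffix[i][j]: length of the longest common suffix of a[:i] and b[:j]
    suffix = [[0] * (m + 1) for _ in range(n + 1)]
    # dp[i][j]: LCSk value for the prefixes a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                suffix[i][j] = suffix[i - 1][j - 1] + 1
            # Option 1: skip the last character of either prefix.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Option 2: end a matching length-k substring at (i, j).
            if suffix[i][j] >= k:
                best = max(best, dp[i - k][j - k] + 1)
            dp[i][j] = best
    return dp[n][m]
```

For example, `lcsk("abxcd", "abycd", 2)` returns 2 (the pairs "ab" and "cd"), while `lcsk("abcd", "cdab", 2)` returns 1 because the two candidate pairs cannot both be kept in order. The sketch uses O(nm) time and space.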

Cited by 12 publications (19 citation statements)
References 16 publications

“…We showed that both the LCSk+ problem and the op-LCSk+ problem can be solved in O(mn) time. Our result on the LCSk+ problem gives a better worst-case running time than previous algorithms [2,15], while the experimental results showed that the previous algorithms run faster than ours on average. Although the op-LCSk+ problem looks much more challenging than the LCSk+, since the former cannot be solved by a simple dynamic programming due to the properties of order-isomorphisms, the proposed algorithm achieves the same time complexity as the one for the LCSk+.…”
Section: Results (mentioning)
confidence: 49%
“…gov/nuccore/U38845.1, with k = 1, 2, 3, 4, 5. The experimental results under the conditions (1), (2) and (3) The proposed algorithm in Section 3 runs faster than PŽŠ for small k or small alphabets. This is due to that PŽŠ strongly depends on the total number of matching k length substring pairs between input strings, and for small k or small alphabets there are many matching pairs.…”
Section: Results (mentioning)
confidence: 99%
“…Non-metric based similarity approach is an alternative solution to find the similarity indexes of MDSs specifically in presence of outliers. A dynamic programming based LCSS computation algorithm, specifically for the k-length substring problems, was presented in literature to address the aforementioned issue i.e., outliers sensitivity [26], [27]. Zhu et al [28] presented two different approaches to solve the LCSS problem with minimum possible time and space complexities iff n = m, where n and m represent sequence length.…”
Section: Literature Review (mentioning)
confidence: 99%