B. Riva Shalom scite author profile

Generalized LCS

The Longest Common Subsequence (LCS) is a well studied problem, having a wide range of implementations. Its motivation is in comparing strings. It has long been of interest to devise a similar measure for comparing higher dimensional objects and more complex structures. In this paper we study the Longest Common Substructure of two matrices and show that this problem is NP-hard. We also study the Longest Common Subforest problem for multiple trees, including a constrained version. We show NP-hardness for k > 2 unordered trees in the constrained LCS. We also give polynomial time algorithms for ordered trees and prove a lower bound for any decomposition strategy for k trees.
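For orientation, the classical string LCS that this abstract generalizes can be computed with the textbook dynamic program. The sketch below is not taken from the paper; the function name `lcs_length` is a hypothetical choice, and it covers only the one-dimensional string case, not matrices or trees.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b (textbook DP).

    Illustrative only: handles plain strings, not the matrix or tree
    variants studied in the paper.
    """
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Extend the best solution for the two shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of one of the two prefixes.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]
```

The table is filled row by row, so the running time is proportional to the product of the two string lengths and only two rows are kept in memory.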

Weighted LCS

Amir, Gotthilf, Shalom. 2010. Journal of Discrete Algorithms.

The Longest Common Subsequence (LCS) of two strings A, B is a well studied problem having a wide range of applications. When each symbol of the input strings is assigned a positive weight, the problem becomes the Heaviest Common Subsequence (HCS) problem. In this paper we consider a different version of weighted LCS on Position Weight Matrices (PWM). The Position Weight Matrix was introduced as a tool to handle a set of sequences that are not identical, yet have many local similarities. Such a weighted sequence is a 'statistical image' of this set, where we are given the probability of every symbol's occurrence at every text location. We consider two possible definitions of LCS on PWM. For the first, we solve the LCS problem of z sequences in time O(zn^(z+1)). For the second, we consider the log-probability version of the problem, prove NP-hardness and provide an approximation algorithm.
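The Heaviest Common Subsequence variant mentioned above can be illustrated by a small change to the standard LCS recurrence: a match contributes the weight of the matched symbol instead of 1. This is a minimal sketch under the assumption that weights are attached to alphabet symbols through a dictionary; the name `hcs_weight` and the `weight` parameter are hypothetical, and the PWM versions studied in the paper are not covered.

```python
def hcs_weight(a: str, b: str, weight: dict) -> int:
    """Total weight of a heaviest common subsequence of a and b.

    Assumption: weight maps each alphabet symbol to a positive number.
    This sketches the HCS idea only, not the paper's PWM algorithms.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            # Best value when the current pair of symbols is not matched.
            best = max(prev[j], cur[j - 1])
            if ch == bj:
                # Matching adds the symbol's weight rather than 1.
                best = max(best, prev[j - 1] + weight[ch])
            cur.append(best)
        prev = cur
    return prev[-1]
```

With every weight set to 1 the function returns the ordinary LCS length, which is a convenient sanity check.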

LCSk: A refined similarity measure

Benson, Levy, Maimoni, et al. 2016. Theoretical Computer Science.

In this paper we define a new similarity measure, LCSk, which aims at finding the maximal number of k-length substrings matching in both input strings while preserving their order of appearance. The traditional LCS is the special case k = 1. We examine this generalization in both theory and practice. We first describe its basic solution and give experimental evidence on real data for its ability to differentiate between sequences that are considered similar according to the LCS measure. We then examine extensions of the LCSk definition to LCS in at least k-length substrings (LCS≥k) and to 2-dimensional LCSk, and also define the complementary EDk and ED≥k distances.

Longest Common Subsequence in k Length Substrings

Benson, Levy, Shalom. 2013.

In this paper we define a new problem, motivated by computational biology: LCSk, aiming at finding the maximal number of k-length substrings matching in both input strings while preserving their order of appearance. The traditional LCS definition is a special case of our problem, where k = 1. We provide an algorithm solving the general case in O(n^2) time, where n is the length of the input strings, equaling the time required for the special case of k = 1. The space requirement of the algorithm is O(kn). We also define a complementary EDk distance measure and show that EDk(A, B) can be computed in O(nm) time and O(km) space, where m, n are the lengths of the input sequences A and B respectively.
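One way to realize a quadratic-time LCSk computation is a dynamic program that tracks, for every pair of positions, the length of the common run ending there. The sketch below assumes the matched k-length substrings must not overlap; the function name `lcsk` is hypothetical. For readability it stores full tables, so it uses O(nm) space rather than the O(kn) space stated in the abstract.

```python
def lcsk(a: str, b: str, k: int) -> int:
    """Maximal number of non-overlapping k-length substrings that match
    in a and b in the same order (assumed reading of LCSk).

    Sketch only: full tables are kept, so space is O(len(a) * len(b)).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n, m = len(a), len(b)
    # run[i][j]: length of the longest common suffix of a[:i] and b[:j].
    run = [[0] * (m + 1) for _ in range(n + 1)]
    # dp[i][j]: LCSk value for the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1
            best = max(dp[i - 1][j], dp[i][j - 1])
            if run[i][j] >= k:
                # A k-length match ends here; count it and jump back k.
                best = max(best, dp[i - k][j - k] + 1)
            dp[i][j] = best
    return dp[n][m]
```

Setting k = 1 reduces the recurrence to the ordinary LCS one, matching the special case named in the abstract.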

Dictionary Matching with One Gap

Amir, Levy, Porat, et al. 2014.

The dictionary matching with gaps problem is to preprocess a dictionary D of d gapped patterns P_1, ..., P_d over alphabet Σ, where each gapped pattern P_i is a sequence of subpatterns separated by bounded sequences of don't cares. Then, given a query text T of length n over alphabet Σ, the goal is to output all locations in T in which a pattern [...] There is a renewed current interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least α and at most β don't cares, where α and β are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved in either O(d log d + |D|) preprocessing time and O(d log^ε d + |D|) space, with query time O(n(β − α) log log d log^2 min{d, log |D|} + occ), or in preprocessing time and space O(d^2 + |D|), with query time O(n(β − α) + occ), where occ is the number of patterns found. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary.
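To make the single-gap setting concrete, here is a brute-force baseline, not the paper's algorithm and without any of its preprocessing. It assumes each dictionary entry is a pair of non-empty subpatterns `(left, right)` and reports the text index at which `right` ends; the function name `match_one_gap`, the pair representation, and the choice to report end positions are all illustrative assumptions.

```python
def match_one_gap(text: str, dictionary: list, alpha: int, beta: int) -> list:
    """Brute-force matching of one-gap patterns (illustrative baseline).

    dictionary: list of (left, right) pairs of non-empty strings (assumed).
    A pattern occurs when left is followed by between alpha and beta
    arbitrary symbols and then by right.
    Returns sorted (end_index, pattern_index) pairs.
    """
    hits = set()
    for idx, (left, right) in enumerate(dictionary):
        start = text.find(left)
        while start != -1:
            after_left = start + len(left)
            # Try every allowed gap length after this occurrence of left.
            for gap in range(alpha, beta + 1):
                pos = after_left + gap
                if text.startswith(right, pos):
                    hits.add((pos + len(right) - 1, idx))
            # Continue with the next, possibly overlapping, occurrence.
            start = text.find(left, start + 1)
    return sorted(hits)
```

This baseline rescans the text for every pattern, so its cost grows with the dictionary size, which is the overhead the preprocessing bounds in the abstract are designed to avoid.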

