The Longest Common Subsequence (LCS) of two strings A and B is a well-studied problem with a wide range of applications. When each symbol of the input strings is assigned a positive weight, the problem becomes the Heaviest Common Subsequence (HCS) problem. In this paper we consider a different version of weighted LCS, defined on Position Weight Matrices (PWM). The Position Weight Matrix was introduced as a tool for handling a set of sequences that are not identical yet have many local similarities. Such a weighted sequence is a 'statistical image' of this set, in which we are given the probability of every symbol's occurrence at every text location. We consider two possible definitions of LCS on PWM. For the first, we solve the LCS problem of z sequences in time O(zn^{z+1}). For the second, we consider the log-probability version of the problem, prove NP-hardness, and provide an approximation algorithm.
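To make the HCS variant mentioned in the abstract above concrete, here is a minimal sketch of the textbook dynamic program for two strings. It assumes that a weight depends only on the symbol (the abstract does not fix this detail), and the names `heaviest_common_subsequence` and `weight` are illustrative rather than taken from the paper. It does not cover the PWM variants the paper studies.

```python
def heaviest_common_subsequence(a, b, weight):
    """Return the maximum total weight of a common subsequence of a and b.

    `weight` maps each symbol to a positive number (an assumption of this
    sketch: weights depend only on the symbol, not on its position).
    """
    # h[i][j] = best weight using the prefixes a[:i] and b[:j]
    h = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            best = max(h[i - 1][j], h[i][j - 1])  # skip a symbol of a or b
            if x == y:
                # match the two symbols and collect the weight
                best = max(best, h[i - 1][j - 1] + weight[x])
            h[i][j] = best
    return h[len(a)][len(b)]
```

With unit weights this reduces to the ordinary LCS length. With unequal weights the heaviest common subsequence can be shorter than the longest one: for "aab" and "baa" with weights a = 1 and b = 5, the longest common subsequence is "aa" (weight 2), while the heaviest is "b" (weight 5).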
Abstract. The problem of finding the longest common subsequence (LCS) of two given strings A1 and A2 is a well-studied problem. The constrained longest common subsequence (C-LCS) for three strings A1, A2 and B1 is the longest common subsequence of A1 and A2 that contains B1 as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of O(m1 m2 n1), where m1, m2 and n1 are the lengths of A1, A2 and B1 respectively. In this paper we consider two general variants of the C-LCS problem. First, we show that in the case of two input strings and an arbitrary number of constraint strings, it is NP-hard to approximate the C-LCS problem. Moreover, it is easy to see that in the case of an arbitrary number of input strings and a single constraint, the problem of finding the constrained longest common subsequence is NP-hard. Therefore, we propose a linear-time approximation algorithm for this variant; our algorithm yields a 1/√(m_min |Σ|) approximation factor, where m_min is the length of the shortest input string and |Σ| is the size of the alphabet.
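The C-LCS definition above, for two input strings and one constraint, can be illustrated with a standard three-dimensional dynamic program that runs in O(m1 m2 n1) time. This is a hedged sketch of the well-known recurrence, not necessarily the specific algorithm the abstract refers to, and it does not address the general variants studied in the paper. The function name `constrained_lcs_length` is an assumption made for illustration.

```python
def constrained_lcs_length(a1, a2, b1):
    """Length of the longest common subsequence of a1 and a2 that
    contains b1 as a subsequence, or None if no such subsequence exists."""
    NEG = float("-inf")  # marks infeasible states
    m1, m2, n1 = len(a1), len(a2), len(b1)
    # f[i][j][k]: best length for a1[:i], a2[:j] with b1[:k] embedded
    f = [[[NEG] * (n1 + 1) for _ in range(m2 + 1)] for _ in range(m1 + 1)]
    for i in range(m1 + 1):
        for j in range(m2 + 1):
            for k in range(n1 + 1):
                if i == 0 or j == 0:
                    # an empty prefix can only satisfy the empty constraint
                    f[i][j][k] = 0 if k == 0 else NEG
                    continue
                best = max(f[i - 1][j][k], f[i][j - 1][k])
                if a1[i - 1] == a2[j - 1]:
                    # matched symbol not used for the constraint
                    best = max(best, f[i - 1][j - 1][k] + 1)
                    if k > 0 and a1[i - 1] == b1[k - 1]:
                        # matched symbol also covers b1[k-1]
                        best = max(best, f[i - 1][j - 1][k - 1] + 1)
                f[i][j][k] = best
    result = f[m1][m2][n1]
    return None if result == NEG else result
```

With an empty constraint the function returns the plain LCS length. For "abcbdab" and "bdcaba", the unconstrained answer is 4, while requiring "aa" as a subsequence forces the shorter result "aba" of length 3.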