We develop and study the concept of similarity functions for q-ary sequences. For q = 4, these functions can serve as a mathematical model of the DNA duplex energy [1,2], which has a number of applications in molecular biology. Based on these similarity functions, we define the concept of DNA codes [1]. We give brief proofs of some of our unpublished results [3] connected with the well-known deletion similarity function [4][5][6]. This function is the length of the longest common subsequence of two sequences; it is used in the theory of codes that correct insertions and deletions [5]. The principal results of the present paper concern another function, called the similarity of blocks. It differs from the deletion similarity in that the common subsequences under consideration must satisfy an additional, biologically motivated [2] block condition, so that not all common subsequences are admissible. We prove lower bounds on the size of an optimal DNA code for the block similarity function. We also consider a construction of close-to-optimal DNA codes that are subcodes of the parity-check one-error-detecting code in the Hamming metric [7].
0032-9460/05/4104-0349 © 2005 Pleiades Publishing, Inc.
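The deletion similarity mentioned in the abstract, that is, the length of the longest common subsequence of two sequences, can be illustrated with a minimal sketch. The function name `deletion_similarity` and the use of strings over the alphabet {A, C, G, T} for the case q = 4 are assumptions made here for illustration; the dynamic-programming recurrence is the standard textbook one and is not taken from the paper. The block similarity is not sketched, since the block condition is not specified in the abstract.

```python
def deletion_similarity(x: str, y: str) -> int:
    """Return the length of the longest common subsequence of x and y.

    Hypothetical helper for illustration only; uses the standard
    O(len(x) * len(y)) dynamic program, keeping one row at a time.
    """
    # prev[j] holds the LCS length of the already processed prefix of x
    # and the first j symbols of y.
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]  # the LCS with an empty prefix of y is 0
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching symbols extend the best subsequence of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of x or of y, whichever is better.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Example with q = 4 (assumed DNA alphabet A, C, G, T).
    print(deletion_similarity("AGCAT", "GAC"))
```

A sequence compared with itself attains the maximum possible value, its own length, while a sequence with distinct symbols compared with its reversal has similarity 1.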