Strings are a natural representation of biological data such as DNA, RNA and protein sequences. The problem of finding a string that summarizes a set of sequences has direct application in relative compression algorithms for genome and proteome analysis, where reference sequences need to be chosen. Median strings have been used as representatives of a set of strings in different domains. However, several formulations of those problems are NP-Complete. Alternatively, heuristic approaches that iteratively refine an initial coarse solution by applying edit operations have been proposed. Recently, we investigated the selection of the optimal edit operations to speed up convergence without spoiling the quality of the approximated median string. We propose a novel algorithm that outperforms state of the art heuristic approximations to the median string in terms of convergence speed by estimating the effect of a perturbation in the minimization of the expressions that define the median strings. We present corpus of comparative experiments to validate these results.
We study optimization problems for the Euclidean minimum spanning tree (MST) on imprecise data. To model imprecision, we accept a set of disjoint disks in the plane as input. From each member of the set, one point must be selected, and the MST is computed over the set of selected points. We consider both minimizing and maximizing the weight of the MST over the input. The minimum weight version of the problem is known as the minimum spanning tree with neighborhoods (MSTN) problem, and the maximum weight version (max-MSTN) has not been studied previously to our knowledge. We provide deterministic and parameterized approximation algorithms for the max-MSTN problem, and a parameterized algorithm for the MSTN problem. Additionally, we present hardness of approximation proofs for both settings.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.