Finding optimum and near-optimum edit scripts between an XML document and a schema is essential to correcting invalid XML documents. In this paper, we consider finding K optimum edit scripts between an XML document and a regular tree grammar. We first prove that the problem is NP-hard. We then present a pseudopolynomial-time algorithm for solving the problem.
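To make the notion of "K optimum edit scripts" concrete, the following is a minimal sketch on a deliberately simplified version of the problem: a single sequence of child labels checked against one content model given as a DFA, with unit-cost insertions and deletions. It is not the pseudopolynomial-time algorithm of the paper, which works on whole trees and regular tree grammars. The function name `k_best_edit_scripts`, the DFA encoding `delta`, and the `max_cost` cutoff are assumptions made for illustration only.

```python
import heapq
from itertools import count


def k_best_edit_scripts(word, delta, start, accepting, k, max_cost=10):
    """Return up to k cheapest (cost, script) pairs, in non-decreasing cost order.

    word      : sequence of labels (e.g. the children of one element)
    delta     : dict mapping (state, symbol) -> next state (a DFA)
    accepting : set of accepting states
    max_cost  : hypothetical cutoff that guarantees termination
    A script is a list of ("insert", position, symbol) and
    ("delete", position, symbol) operations; kept symbols are not recorded.
    """
    alphabet = sorted({sym for (_, sym) in delta})
    tie = count()  # tie-breaker so the heap never has to compare scripts
    heap = [(0, next(tie), 0, start, ())]
    found = []
    while heap and len(found) < k:
        cost, _, i, state, script = heapq.heappop(heap)
        # The whole word is consumed and the DFA accepts: record this script.
        if i == len(word) and state in accepting:
            found.append((cost, list(script)))
        if i < len(word):
            # Keep word[i] at no cost if the DFA can read it.
            nxt = delta.get((state, word[i]))
            if nxt is not None:
                heapq.heappush(heap, (cost, next(tie), i + 1, nxt, script))
            # Delete word[i] at cost 1.
            if cost + 1 <= max_cost:
                op = ("delete", i, word[i])
                heapq.heappush(heap, (cost + 1, next(tie), i + 1, state, script + (op,)))
        # Insert any readable symbol before position i at cost 1.
        if cost + 1 <= max_cost:
            for sym in alphabet:
                nxt = delta.get((state, sym))
                if nxt is not None:
                    op = ("insert", i, sym)
                    heapq.heappush(heap, (cost + 1, next(tie), i, nxt, script + (op,)))
    return found


# Content model "a b* c" as a DFA: 0 -a-> 1, 1 -b-> 1, 1 -c-> 2 (accepting).
delta = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
for cost, script in k_best_edit_scripts(["a", "b"], delta, 0, {2}, k=3):
    print(cost, script)
```

For the invalid sequence `a b`, the cheapest repair found by this sketch is a single insertion of `c` at the end; the remaining scripts cost more. The search is a best-first enumeration of paths, so two scripts that produce the same repaired sequence through different operations are reported separately.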
SUMMARY: DTDs are continuously updated according to changes in the real world. Let t be an XML document valid against a DTD D, and suppose that D is updated by an update script s. In general, we cannot uniquely "infer" a transformation of t from s, i.e., we cannot uniquely determine the elements in t that should be deleted and/or the positions in t at which new elements should be inserted. In this paper, we consider inferring K optimum transformations of t from s so that a user can find the most desirable transformation more easily. We first show that the problem of inferring K optimum transformations of an XML document from an update script is NP-hard even if K = 1. Then, assuming that an update script is of length one, we show an algorithm for solving the problem that runs in time polynomial in |D|, |t|, and K.
SUMMARY: Finding an appropriate data transformation between two schemas is an important problem. In this paper, assuming that an update script between the original and updated DTDs is available, we consider inferring a transformation algorithm from the original DTD and the update script such that the algorithm transforms each document valid against the original DTD into a document valid against the updated DTD. We first show a transformation algorithm inferred from a DTD and an update script. We next show a sufficient condition under which the transformation algorithm inferred from a DTD d and an update script is unambiguous, i.e., for any document t valid against d, the elements to be deleted or inserted can be determined unambiguously. Finally, we show a polynomial-time algorithm for testing the sufficient condition.
An XML document is usually stored with its schema so that the structural consistency of the document is ensured. In general, schemas are continuously updated according to changes in the real world. Thus, we have to know precisely how a schema has been updated in order to keep the XML documents valid. To know how a schema has been updated, we need to extract the difference between the "old" and "new" schemas. However, schemas have recently become larger and more complex, which makes it more difficult to determine how a schema has been updated. In this paper, we consider the problem of extracting the difference between regular tree grammars, a popular formal model of XML schema languages. We first show that the problem is NP-hard. Then we give a sufficient condition under which the problem can be solved efficiently, and present a polynomial-time algorithm for solving the problem under the sufficient condition. Finally, we show some experimental results.