Materialized XML views are a popular technique for integrating data from possibly distributed and heterogeneous data sources. However, the incremental maintenance of such XML views poses new challenges which to date remain unaddressed. First, XML views not only filter the data, but may radically restructure it to construct new nested XML document structures. Moreover, order is inherent in the XML model, and XML views reflect both the implicit document order of the underlying sources and the order explicitly imposed in the view definition. Therefore, order also has to be preserved at view maintenance time. In this thesis we present an algebraic approach for the incremental maintenance of XQuery views, called VOX (View maintenance for Ordered XML). To the best of our knowledge, this is the first solution to order-preserving XML view maintenance. Our strategy correctly transforms an update to source XML data into sequences of updates that refresh the view. Our technique is based on an algebraic representation of the XQuery view expression using an XML algebra. The XML algebra has ordered bag semantics; hence most of the operators are logically order-preserving. We propose an order-encoding mechanism that migrates the XML algebra to (non-ordered) bag semantics, so that most of the operators no longer need to be order-aware. Furthermore, this allows most of the algebra operators to become distributive over update operations. This transformation brings the problem of maintaining XML views one step closer to the problem of maintaining views in other (unordered) data models. We are thus able to adopt some of the existing (relational) maintenance techniques towards our goal of efficient order-sensitive XQuery view maintenance. In addition, we develop a full set of rules for propagating updates through XML-specific operations. We have proven the correctness of the VOX view maintenance approach.
A full implementation of VOX on top of RAINBOW, the XML data management system developed at WPI, has been completed. Our experimental results, obtained using the data and queries provided by the XMark benchmark, confirm that incremental XML view maintenance is significantly faster than complete recomputation in most cases. Incremental maintenance is shown to outperform recomputation even for large updates.

Acknowledgements

First, I would like to express my sincere appreciation and gratitude to my advisor, Prof. Elke Rundensteiner, for her help, guidance, support and encouragement. Without her feedback, ideas, suggestions, incredible responsiveness and the time she always had for me, this thesis would not have been possible. I would also like to thank her for guiding me throughout my graduate studies. I thank my reader, Prof. Carolina Ruiz, for her valuable feedback.
Query processing over XML data sources has emerged as a popular topic. XML is an ordered data model, and XQuery expressions return results that have a well-defined order. However, little work has been done to date on how order is supported in XML query processing. In this paper we study the challenges related to handling order in the XML context, namely challenges imposed by the XML data model, by the variety of distinct XML operators and by incremental view maintenance. We propose an efficient solution that addresses these issues. We use a key encoding for XML nodes that supports both node identity and node order. We have designed order encoding rules, based on the XML algebraic query execution data model and on node encodings, that do not require any actual sorting of intermediate results during execution. Our approach supports more efficient incremental view maintenance, as it makes most XML operators distributive with respect to bag union. Our approach is implemented in the context of Rainbow [25], an XML data management system developed at WPI. We prove the correctness of our order encoding approach, namely that it ensures order handling for query processing and for view maintenance. We also show, through experiments, that the overhead of maintaining order in our approach is negligible.
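The idea that order keys let an operator work on unordered bags and distribute over bag union can be illustrated with a minimal sketch. It is not the Rainbow or VOX implementation: the tuple-shaped keys, the `select` and `materialize` helpers, and the sample data are all assumptions made for illustration, since the abstract does not specify the actual key encoding.

```python
from collections import Counter

# Assumption: each node carries a Dewey-style tuple key that gives both its
# identity and its position in document order. The real encoding may differ.

def select(bag, predicate):
    """Order-unaware selection over a bag of (order_key, value) items."""
    return Counter({item: n for item, n in bag.items() if predicate(item)})

def materialize(bag):
    """Recover the ordered result by sorting on the order key, once, at the end."""
    return [value for _key, value in sorted(bag.elements(), key=lambda it: it[0])]

def is_expensive(item):
    _key, price = item
    return price > 40

# Hypothetical source bag and an inserted node that falls between two
# existing nodes in document order.
source = Counter({((1, 1), 30): 1, ((1, 3), 55): 1})
delta = Counter({((1, 2), 70): 1})

view = select(source, is_expensive)

# Incremental maintenance: apply the operator to the delta only, then take
# the bag union with the existing view.
incremental = view + select(delta, is_expensive)

# Full recomputation, for comparison.
recomputed = select(source + delta, is_expensive)

ordered_view = materialize(incremental)
```

Because the selection never inspects positions, applying it to the delta alone and taking the bag union with the old view gives the same bag as recomputing over the updated source, and the order is restored from the keys only when the view is materialized.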