Data integration often requires a clean abstraction of the different formats in which data are stored, together with means for specifying the correspondences/relationships between data in different worlds and for translating data from one world to another. To this end, we introduce in this paper a middleware data model that serves as a basis for the integration task, and a declarative rule language for specifying the integration. We show that, using the language, correspondences between data elements can be computed in polynomial time in many cases, and may require exponential time only when insensitivity to order or duplicates is considered. Furthermore, we show that in most practical cases the correspondence rules can be automatically turned into translation rules that map data from one representation to another. Thus, a complete integration task (derivation of correspondences, transformation of data from one world to the other, incremental integration of a new bulk of data, etc.) can be specified using a single set of declarative rules.
Due to the development of the World Wide Web, the integration of heterogeneous data sources has become a major concern of the database community. Appropriate architectures and query languages have been proposed. Yet the problem of data conversion, which is essential for the development of mediator/wrapper architectures, has remained largely unexplored. In this paper, we present the YAT system for data conversion. This system provides tools for the specification and the implementation of data conversions among heterogeneous data sources. It relies on a middleware model, a declarative language, a customization mechanism and a graphical interface. The model is based on named trees with ordered and labeled nodes. Like semistructured data models, it is simple enough to facilitate the representation of any data. Its main originality is that it allows reasoning at various levels of representation. The YAT conversion language (called YATL) is declarative and rule-based, and features enhanced pattern matching facilities and powerful restructuring primitives. It makes it possible to preserve or reconstruct the order of collections. The customization mechanism relies on program instantiations: an existing program may be instantiated into a more specific one, and then easily modified. We also present the architecture, implementation and practical use of the YAT prototype, currently under evaluation within the OPAL project.
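To make the idea of rule-based conversion over ordered, labeled trees more concrete, the following is a minimal Python sketch. It is not YATL and does not reproduce the YAT implementation: the `Node` class, the `$`-prefixed variable convention, and the `match`, `build` and `apply_rule` helpers are all hypothetical names introduced here for illustration only. A rule pairs a pattern tree with a template tree; matching binds variables to subtrees, and building the template produces the restructured output with the child order the template prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A labeled node whose children are kept in order (illustrative only)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def match(pattern: Node, tree: Node,
          env: Optional[Dict[str, Node]] = None) -> Optional[Dict[str, Node]]:
    """Return variable bindings if `tree` matches `pattern`, else None.

    Assumption: a pattern label starting with '$' is a variable that
    binds the whole subtree found at that position.
    """
    env = dict(env or {})
    if pattern.label.startswith("$"):
        env[pattern.label] = tree
        return env
    if pattern.label != tree.label or len(pattern.children) != len(tree.children):
        return None
    # Children are compared position by position, so order matters.
    for p_child, t_child in zip(pattern.children, tree.children):
        env = match(p_child, t_child, env)
        if env is None:
            return None
    return env


def build(template: Node, env: Dict[str, Node]) -> Node:
    """Instantiate `template`, replacing variables with their bound subtrees."""
    if template.label.startswith("$"):
        return env[template.label]
    return Node(template.label, [build(c, env) for c in template.children])


def apply_rule(pattern: Node, template: Node, tree: Node) -> Optional[Node]:
    """Apply one conversion rule; return None when the pattern does not match."""
    env = match(pattern, tree)
    return None if env is None else build(template, env)


# Hypothetical rule: person(name($N), city($C)) -> resident(city($C), name($N))
pattern = Node("person", [Node("name", [Node("$N")]), Node("city", [Node("$C")])])
template = Node("resident", [Node("city", [Node("$C")]), Node("name", [Node("$N")])])

source = Node("person", [Node("name", [Node("Ada")]), Node("city", [Node("Paris")])])
converted = apply_rule(pattern, template, source)
```

Here the template fixes the order of the output children, which loosely mirrors the abstract's remark about preserving or reconstructing the order of collections. A real conversion language would additionally need collection patterns, typing, and rule composition, none of which this sketch attempts.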