This paper proposes a model for a massive heterogeneous data integration system based on Lucene and XQuery. The model hides the distribution and heterogeneity of the underlying resources and provides transparent access through materialized database views. Query efficiency improves because an efficient categorization algorithm segments the data into an index built with the open-source tool Lucene. The model also takes advantage of XQuery, which can process both structured and unstructured data, to address the significant differences among data sources as well as the efficiency of access to massive data.
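As a rough illustration of the indexing idea only, the sketch below flattens records from different kinds of sources into one common document form and builds a small in-memory inverted index over them. It is not the paper's categorization algorithm and does not use the Lucene API; the names `to_document` and `InvertedIndex`, the source labels, and the sample records are all hypothetical, chosen for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def to_document(source, record):
    """Flatten a record into a common form (hypothetical unified schema).

    Structured records (dicts, e.g. database rows) have their values joined;
    anything else is treated as unstructured text.
    """
    if isinstance(record, dict):
        text = " ".join(str(v) for v in record.values())
    else:
        text = str(record)
    return {"source": source, "text": text}


class InvertedIndex:
    """Toy stand-in for a full-text index: maps each term to document ids."""

    def __init__(self):
        self.docs = []
        self.postings = defaultdict(set)

    def add(self, doc):
        doc_id = len(self.docs)
        self.docs.append(doc)
        for term in tokenize(doc["text"]):
            self.postings[term].add(doc_id)
        return doc_id

    def search(self, query):
        """Return documents containing every term in the query (AND semantics)."""
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.docs[i] for i in sorted(ids)]


index = InvertedIndex()
index.add(to_document("sql", {"id": 1, "title": "Lucene index tuning"}))
index.add(to_document("file", "Notes on XQuery and Lucene integration"))

# One query reaches both the structured and the unstructured source.
for hit in index.search("lucene"):
    print(hit["source"], "-", hit["text"])
```

The point of the sketch is that a caller queries one index without knowing where each record came from, which is the kind of transparent access the model aims for. A real system would delegate analysis, storage, and ranking to Lucene rather than to a dictionary of sets.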