Data structure description, conceptual modeling, and logical reasoning for knowledge discovery are three critical factors in integrating heterogeneous information. In particular, NoSQL databases and the Internet of Things create an urgent need for a uniform representation of heterogeneous data, yet little attention has been paid to integrating NoSQL databases with traditional data models or to the semantic description of big data. To tackle these problems, this paper first proposes a concept-and-relation-oriented grid data model, called GODM, based on the definitions of Monad, Compounder, Relation, etc. The GODM model is then used to describe traditional and NoSQL data models uniformly, which eliminates the structural differences among heterogeneous data. Next, based on the GODM relation mechanism, an extensible semantic system is built, taking the SHOIQ(D) description logic as an example and establishing its correspondence with a subset of the GODM grammar; this provides fundamental support for semantic integration and knowledge discovery over heterogeneous data. GODM is then compared comprehensively with other models, with particular attention to the distinctions between GODM and OWL in relation mechanism, hybrid schema, description logic, grammatical constructors, etc. Finally, after a general evaluation model is proposed, the time and space efficiency of several common data models is evaluated experimentally; the results show that the GODM model has clear advantages in expressiveness, flexibility, and other properties, particularly time and space efficiency.
In summary, the GODM model describes heterogeneous data in terms of both data structure and semantic relationships and realizes a hybrid schema that reconciles schemaful and schemaless data models, making it especially suitable for dynamic data integration and knowledge discovery from big data models.

KEYWORDS
data integration, data model, GODM, hybrid schema, knowledge representation, NoSQL, time and space efficiency

INTRODUCTION
The development of cloud computing and big data technologies has caused data to expand dramatically and uncontrollably in size, structure, and format. To eliminate data heterogeneity, a unified mechanism is needed to provide consistent data manipulation approaches and extract the maximum value from massive data. According to implementation technique and application requirement, data integration approaches are generally classified as materialized or virtual, 1 where the materialized approaches mainly refer to data warehousing, 2 and the virtual approaches have two sub-branches: structural approaches and semantic approaches. 1 Warehousing methods preprocess data before integration and load all the integrated data into the warehouse to guarantee query efficiency. However, methods of this kind always lead

Concurrency Computat Pract Exper. 2018;30:e4422. wileyonlinelibrary.com/j...
It is now common for multiple jobs to coexist on the same machine, because tens or hundreds of cores can reside on a single chip. To run multiple jobs efficiently, schedulers should provide flexible scheduling logic. Moreover, corunning jobs may compete for shared resources, which can degrade performance. Although many scheduling algorithms have been proposed to support different scheduling logic schemes and alleviate this contention, coscheduling jobs on the same machine without performance degradation remains a challenging problem. In this paper, we propose a novel adaptive deadlock-free scheduler that provides flexible scheduling logic schemes and adopts an optimistic lock control mechanism to coordinate resource competition among corunning jobs. The scheduler exposes all underlying resource information to corunning jobs and gives them the utensils they need to use that information to compete for resources in a free-for-all manner. To further relieve the performance degradation of coscheduling, the scheduler automatically controls the number of active utensils when frequent conflicts become the performance bottleneck. We justify our adaptive deadlock-free scheduling and present simulation results for synthetic and real-world workloads, comparing our proposed scheduler with two prevalent schedulers. The results indicate that our approach outperforms the compared schedulers in scheduling efficiency and scalability. They also show that the adaptive deadlock-free control significantly improves the parallelism of node-level scheduling and workload performance.
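The abstract does not give the scheduler's algorithm, so the following is only a minimal sketch of the general idea it names: optimistic (validate-at-commit) resource claims, with the number of concurrently racing claimers reduced when conflicts dominate. It is a sequential simulation, not the authors' implementation. All names (`ResourceState`, `schedule_round`, `adapt_limit`, `run`), the single node-wide version counter, the 0.5 conflict threshold, and the halve-or-grow-by-one policy are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourceState:
    """Hypothetical shared view of one node's free cores with a version counter."""
    free_cores: int
    version: int = 0

    def snapshot(self):
        # Claimers read the state without holding a lock (the optimistic part).
        return self.free_cores, self.version

    def try_commit(self, cores_wanted, seen_version):
        # Validate at commit time: fail if someone else committed since the
        # snapshot was taken, or if too few cores remain.
        if seen_version != self.version or cores_wanted > self.free_cores:
            return False
        self.free_cores -= cores_wanted
        self.version += 1
        return True


def schedule_round(state, requests, active_limit):
    """Let up to `active_limit` requests race on one shared snapshot.

    Every failed commit is counted as a conflict and the request is deferred.
    Returns (placed_jobs, conflicts, deferred_requests).
    """
    racing = requests[:active_limit]
    deferred = list(requests[active_limit:])
    _, version = state.snapshot()
    placed, conflicts = [], 0
    for job, cores in racing:
        if state.try_commit(cores, version):
            placed.append(job)
        else:
            conflicts += 1
            deferred.append((job, cores))
    return placed, conflicts, deferred


def adapt_limit(active_limit, conflicts, attempts, threshold=0.5):
    """Assumed policy: halve the racers when conflicts dominate, else add one."""
    if attempts and conflicts / attempts > threshold:
        return max(1, active_limit // 2)
    return active_limit + 1


def run(state, requests, active_limit=4, max_rounds=100):
    """Repeat rounds until every request is placed or a round places nothing."""
    placed_all, limits_used = [], []
    for _ in range(max_rounds):
        if not requests:
            break
        placed, conflicts, requests = schedule_round(state, requests, active_limit)
        limits_used.append(active_limit)
        placed_all.extend(placed)
        if not placed:
            break  # nothing could be committed this round
        active_limit = adapt_limit(active_limit, conflicts, len(placed) + conflicts)
    return placed_all, limits_used


state = ResourceState(free_cores=8)
placed, limits = run(state, [("a", 2), ("b", 2), ("c", 2), ("d", 2)])
print(placed, limits, state.free_cores)
```

In this toy run, four requests race in the first round, one commits and three fail validation, so the limit is halved from 4 to 2 for the next round. A single version counter per node is deliberately coarse; finer-grained validation (for example, per core or per resource type) would reject fewer claims, at the cost of more bookkeeping.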