Abstract-Analytical business applications generate reports that offer trend-predicting insight into an organization's future, estimating financial trends and risk factors. These applications work on huge amounts of data, comprising decades of market and company records and an organization's decision logs. Today, big data volumes are approaching zettabytes, and structured data makes up only 20% of that data. 20% of a gigabyte may be negligible in comparison to big data, but 20% of big data itself cannot be neglected. Traditional data management tools perform poorly when running cross-table analytical queries on structured data in a distributed processing environment; their response times are high because of ill-aligned data sets and the complex hierarchy of the distributed computing environment. Data alignment requires a complete shift in the data deployment paradigm from a row-oriented storage layout to a column-oriented storage layout, and the complex hierarchy of the distributed computing environment can be handled by keeping metadata of the entire data set. This paper proposes an approach to ease the deployment of structured data into the distributed processing environment by arranging data into column-wise combinational entities. Response time to analytical queries can be lowered with the support of two concepts: Shared architecture and Multi-path query execution. Highly scalable systems are based on the Shared Nothing architecture, but degraded performance and fault tolerance are side effects that come with high scalability. The proposed method is an effort to balance scalability, performance, and fault tolerance. Due to the limited scope of this paper, we concentrate on issues and solutions for structured data only. Shared architecture and active backup help improve the system's performance by sharing the workload per node. BSPF's clustering methodology relieves data pressure points to minimize the data loss per node crash.
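The shift from a row-oriented to a column-oriented layout can be illustrated with a minimal sketch. This is not the paper's BSPF implementation; the function name `rows_to_columns` and the sample records are hypothetical, and the sketch assumes every record carries the same fields. It shows only why an analytical aggregate benefits from the column layout: the query reads one contiguous column instead of scanning every full row.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into a column-oriented layout.

    Assumption: all rows share the field names of the first row.
    """
    if not rows:
        return {}
    names = list(rows[0])
    # One list per field; position i in every list belongs to record i.
    return {name: [row[name] for row in rows] for name in names}


# Hypothetical sample data, stored row by row.
rows = [
    {"year": 2010, "region": "EU", "revenue": 120.0},
    {"year": 2011, "region": "EU", "revenue": 135.5},
    {"year": 2011, "region": "US", "revenue": 210.0},
]

columns = rows_to_columns(rows)

# An aggregate over one attribute touches a single column only.
total_revenue = sum(columns["revenue"])
```

In a distributed setting each column (or group of columns) could be placed on a different node, which is one plausible reading of the "column-wise combinational entities" mentioned above; the abstract does not specify the exact grouping.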
Abstract-Data integration provides a uniform view of a set of heterogeneous data sources and lets users query them without any knowledge of the underlying sources. In the current era, Service Oriented Architecture and Cloud computing together have enabled users to access services over the Internet at a low cost. The Cloud computing model includes a layer responsible for providing data to the other layers and services, i.e., the Data as a Service (DaaS) layer. The issue of providing an integrated view of data can be handled using semantic data: data stored in a way that is understandable by machines and can be integrated without human intervention. However, integrating data using semantic web technology without enforcing any access management raises privacy and confidentiality concerns. Different data owners store data in heterogeneous formats based on their requirements, which leads to the data interoperability problem. This research proposes a framework that allows data from different sources to be integrated using the concept of semantic data, thus resolving the issue of interoperability, and also devises an access control system for defining explicit privacy constraints.
Index Terms-Access control management, data as a service (DaaS), data integration, interoperability.