The performance of a distributed database system (DDBS) can be enhanced by speeding up the computation of data allocation, which leads to faster allocation decisions, less data redundancy, and shorter processing time. This paper presents an integrated method for grouping the distributed sites into clusters and customizing the allocation of database fragments to the clusters and their sites. We design a high-speed clustering and allocation method that determines which fragments are allocated to which cluster and site so as to maintain data availability and constant system reliability. We evaluate the performance of this method and demonstrate its efficiency by means of tabular and graphical representation. We tested the method on different network sites and found that it reduces the data transferred between sites during execution, minimizes the communication cost of processing applications, and handles database queries while meeting their future needs.
Clustering network sites is a vital issue in parallel and distributed database systems (DDBS). Grouping distributed database network sites into clusters is considered an efficient way to minimize the communication time required for query processing. However, clustering network sites is still an open research problem, since finding an optimal solution is NP-complete. The main contribution in this field is finding a near-optimal solution that groups distributed database network sites into disjoint clusters in order to minimize the communication time required for data allocation. Grouping a large number of network sites into a small number of clusters effectively reduces the transaction response time, results in better data distribution, and improves DDBS performance. We present a novel algorithm for clustering distributed database network sites based on communication time, since database query processing is time dependent. Extensive experimental tests and simulations are conducted on this clustering algorithm. The experimental and simulation results show that the algorithm achieves a better network distribution with significant gains in server load balance and network delay, a low communication time between network sites, and higher DDBS performance.
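The abstract above does not spell out the clustering algorithm, so the following is only a minimal sketch of the general idea of grouping sites into disjoint clusters by communication time. It assumes a symmetric matrix of pairwise communication times and a hypothetical `threshold` parameter; the function name `cluster_sites` and the greedy first-fit rule are illustrative choices, not the authors' method.

```python
def cluster_sites(comm_time, threshold):
    """Greedy first-fit grouping of sites into disjoint clusters.

    comm_time[i][j] is the (assumed symmetric) communication time between
    sites i and j. A site joins the first existing cluster in which every
    member is within `threshold` of it; otherwise it starts a new cluster.
    This is a heuristic sketch and makes no optimality claim.
    """
    clusters = []
    for site in range(len(comm_time)):
        for cluster in clusters:
            if all(comm_time[site][member] <= threshold for member in cluster):
                cluster.append(site)
                break
        else:
            # No existing cluster is close enough, so open a new one.
            clusters.append([site])
    return clusters


# Hypothetical communication times (arbitrary units) between four sites.
comm_time = [
    [0, 2, 9, 8],
    [2, 0, 7, 9],
    [9, 7, 0, 3],
    [8, 9, 3, 0],
]

clusters = cluster_sites(comm_time, threshold=5)
print(clusters)  # [[0, 1], [2, 3]]
```

A greedy pass like this depends on the order in which sites are visited, which is one reason a near-optimal rather than optimal grouping is the realistic target.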
The efficiency and performance of a Distributed Database Management System (DDBMS) are mainly determined by its design and by the network communication cost between sites. Fragmentation and distribution of data are the major design issues of a DDBMS. In this paper, we propose a new approach that integrates fragmentation and data allocation in one strategy based on a high-performance clustering technique and transaction processing cost functions. This approach efficiently and effectively achieves the objectives of data fragmentation, data allocation, and network site clustering. It splits the data relations into pairwise disjoint fragments and, using the clustering technique, determines for each fragment and each network site whether the fragment should be allocated there, which is the case when the allocation benefit outweighs the cost. To show the performance of the proposed approach, we performed experimental studies on a real database application at different levels of network connectivity. The results show that the approach minimizes the total data transaction cost between sites, reduces the amount of redundant data accessed between these sites, and improves overall DDBMS performance.
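The rule that a fragment is allocated to a site only where the benefit outweighs the cost can be illustrated with a small sketch. The abstract does not give the cost functions, so the ones below are assumptions: benefit is taken as the remote reads avoided by holding a local copy, and cost as the updates that must be propagated to that copy. The names `allocation_plan`, `read_cost`, and `update_cost` are hypothetical.

```python
def allocation_plan(access_stats, read_cost=1.0, update_cost=2.0):
    """Decide, per (fragment, site) pair, whether to place a copy at the site.

    access_stats maps (fragment, site) to (reads, updates), where `reads` is
    the number of reads the site issues on the fragment and `updates` is the
    number of updates that would have to be propagated to a copy held there.
    Both cost functions are assumed for illustration only.
    """
    plan = {}
    for (fragment, site), (reads, updates) in access_stats.items():
        benefit = reads * read_cost    # remote reads avoided by a local copy
        cost = updates * update_cost   # update propagation to that copy
        if benefit > cost:
            plan.setdefault(fragment, []).append(site)
    return plan


# Hypothetical access statistics for two fragments and two sites.
access_stats = {
    ("F1", "S1"): (40, 5),
    ("F1", "S2"): (3, 10),
    ("F2", "S2"): (25, 4),
}

plan = allocation_plan(access_stats)
print(plan)  # {'F1': ['S1'], 'F2': ['S2']}
```

With these numbers, F1 is not replicated at S2 because the assumed update cost (20) exceeds the read benefit (3), which is how such a rule can limit redundant copies.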
Effective approaches for handling big data, which is characterized by its large volume, variety of types, and high velocity, are vitally needed and have recently attracted the attention of several research groups. This is especially the case because traditional data processing techniques and capabilities have proved insufficient in that regard. Another equally important aspect of processing big data is its security, as emphasized in this paper. Accordingly, we propose to process big data in two tiers. The first tier classifies the data based on its structure and on whether security is required. The second tier analyzes and processes the data based on volume, variety, and velocity factors. Simulation results demonstrated that using classification feedback from an MPLS/GMPLS core network is key to reducing the data evaluation and processing time.
The expanding trend of cloud data mobility has led to malicious data threats that necessitate the use of data protection techniques. Most cloud system applications contain valuable and confidential data, such as personal, trade, or health information. Threats to such data may put the cloud systems that hold them at high risk. However, traditional security solutions are not capable of handling the security of big data mobility. Current security mechanisms are insufficient for big data, either because they cannot determine which data should be protected or because of their intractable time complexity. Therefore, the demand for securing mobile big data has been increasing rapidly in order to avoid potential risks. This paper proposes an integrated methodology to classify and secure big data before executing data mobility, duplication, and analysis. The necessity of securing big data mobility is determined by classifying the data, according to the risk impact level of their contents, into two categories: confidential and public. Based on the classification category, the impact of data security is studied and substantiated on the confidential data in the scope of the Hadoop Distributed File System (HDFS). The results reveal that the proposed approach can significantly improve the data mobility of cloud systems.
Index Terms: Big data classification, data security, metadata, risk impact level, HDFS.