Data security has become crucial to most enterprise and government applications because of the growing volume of data generated, collected, and analyzed. Many algorithms have been developed to secure data storage and transmission. However, most existing solutions rely on multi-round functions to resist differential and linear attacks. This results in longer execution times and greater memory consumption, which makes them poorly suited to large datasets or delay-sensitive systems. To address these issues, this work proposes a novel algorithm that uses the reflection property of a balanced binary search tree to minimize overhead and a dynamic offset to achieve a high security level. The performance and security of the proposed algorithm were compared with those of the Advanced Encryption Standard (AES) and Data Encryption Standard (DES) symmetric encryption algorithms. The proposed algorithm achieved the lowest running time with comparable memory usage and satisfied the avalanche effect criterion with a value of 50.1%. Furthermore, the dynamic offset passed a series of National Institute of Standards and Technology (NIST) statistical tests for randomness.
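The avalanche effect criterion mentioned above can be illustrated with a short measurement routine. This is only a sketch: the proposed algorithm is not specified in this abstract, so SHA-256 from Python's standard library is used as a stand-in function, and the names `avalanche_percent` and `sha256_stand_in` are hypothetical. The routine flips each input bit in turn and reports the average percentage of output bits that change; a value close to 50% is the usual target.

```python
import hashlib


def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_percent(func, message: bytes) -> float:
    """Average percentage of output bits changed by a single-bit input flip.

    `func` maps bytes to fixed-length bytes. Every input bit is flipped once.
    """
    base = func(message)
    output_bits = len(base) * 8
    trials = len(message) * 8
    changed = 0
    for i in range(trials):
        flipped = bytearray(message)
        flipped[i // 8] ^= 1 << (i % 8)  # flip exactly one input bit
        changed += bit_diff(base, func(bytes(flipped)))
    return 100.0 * changed / (trials * output_bits)


def sha256_stand_in(data: bytes) -> bytes:
    """Stand-in for the cipher under test (an assumption, not the paper's algorithm)."""
    return hashlib.sha256(data).digest()


score = avalanche_percent(sha256_stand_in, b"sixteen byte msg")
print(f"avalanche effect: {score:.1f}%")
```

To evaluate a real cipher, one would replace `sha256_stand_in` with a function that encrypts under a fixed key, and would typically repeat the measurement for single-bit key flips as well.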
The recent rapid rise in the availability of big data, driven by Internet-based technologies such as social media platforms and mobile devices, has left many market leaders unprepared to handle very large, random, and high-velocity data. Conventionally, technologies are first developed and tested in labs, then introduced to the public through media such as press releases and advertisements, and only afterwards adopted by the general public. In the case of big data technology, fast development and ready acceptance by the user community have left little time for scrutiny by the academic community. Although professionals and authors have published many books and electronic media articles on their work with big data, fundamental work in the academic literature is still lacking. Through survey methods, this paper discusses challenges in different aspects of big data, such as data sources, content format, data staging, data processing, and prevalent data stores. Issues and challenges related to big data are discussed, specifically privacy attacks and counter-techniques such as k-anonymity, t-closeness, l-diversity, and differential privacy. Tools and techniques adopted by various organizations to store different types of big data are also highlighted. This study identifies several open research areas, such as the lack of anonymization techniques for unstructured big data, data traffic pattern determination for developing scalable data storage solutions, and control mechanisms for high-velocity data.
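Of the counter-techniques listed above, k-anonymity is the simplest to state: every combination of quasi-identifier values must be shared by at least k records. The following is a minimal sketch of that check, not taken from the survey; the function name `k_anonymity`, the column names, and the sample records are all hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which the records are k-anonymous.

    This is the size of the smallest group of records that share the same
    values for all quasi-identifier columns. An empty dataset yields 0.
    """
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already-generalised records (age ranges, masked ZIP codes).
records = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
]

k = k_anonymity(records, ["age", "zip"])
print(f"dataset is {k}-anonymous on (age, zip)")
```

The check only covers structured records with named columns, which is one reason anonymizing unstructured data, as noted above, remains harder.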
Organisations that collect and maintain individual data face the challenge of preserving privacy and security when using, archiving, or sharing these data. De-identification tools are essential for minimising the privacy risk. However, the data de-identification and anonymisation methods in wide use today alter the original data in a way that cannot be reversed. This results in data distortion and, hence, a substantial loss of the knowledge within the data. To address this issue, this paper introduces the concept of reversible data de-identification to de-identify unstructured health data under the Health Insurance Portability and Accountability Act (HIPAA) guidelines. The model integrates Philter [9], the state-of-the-art tool for extracting personal identifiers from free text, to detect confidential information and encrypts it with E-ART, a lightweight encryption algorithm [10]. The performance of the proposed model, ARTPHIL, is evaluated on the i2b2 data corpus in terms of recall, precision, F-measure, and execution time. The experimental results are consistent with recent de-identification methods, with a recall of 96.93%. More importantly, the original data can be recovered, if needed, and authenticated.
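The evaluation metrics named above follow the standard definitions: precision is the share of flagged tokens that are truly identifiers, recall is the share of true identifiers that were flagged, and the F-measure is their harmonic mean. The sketch below computes them from confusion counts; the function name `precision_recall_f1` and the counts are hypothetical and are not results from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from confusion counts.

    tp: identifiers correctly flagged
    fp: non-identifiers wrongly flagged
    fn: identifiers that were missed
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

In de-identification work recall is usually the metric to watch, since a missed identifier (a false negative) is a privacy leak, whereas a false positive only costs some utility.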