The EU General Data Protection Regulation (GDPR) and the California Privacy Rights Act (CPRA) mandate the principle of data minimization, which requires that only data necessary to fulfill a certain purpose be collected. However, it can often be difficult to determine the minimal amount of data required, especially in complex machine learning models such as deep neural networks. We present a first-of-a-kind method to reduce the amount of personal data needed to perform predictions with a machine learning model, by removing or generalizing some of the input features of the runtime data. Our method makes use of the knowledge encoded within the model to produce a generalization that has little to no impact on its accuracy, based on knowledge distillation approaches. We show that, in some cases, less data may be collected while preserving the exact same level of model accuracy as before, and if a small deviation in accuracy is allowed, even more generalizations of the input features may be performed. We also demonstrate that when collecting the features dynamically, the generalizations may be even further improved. This method enables organizations to truly minimize the amount of data collected, thus fulfilling the data minimization requirement set out in the regulations.
Privacy-preserving solutions enable companies to offload confidential data to third-party services while fulfilling their government regulations. To accomplish this, they leverage various cryptographic techniques such as Homomorphic Encryption (HE), which allows performing computation on encrypted data. Most HE schemes work in a SIMD fashion, and the data packing method can dramatically affect the running time and memory costs. Finding a packing method that leads to an optimal performant implementation is a hard task. We present a simple and intuitive framework that abstracts the packing decision for the user. We explain its underlying data structures and optimizer, and propose a novel algorithm for performing 2D convolution operations. We used this framework to implement an inference operation over an encrypted HE-friendly AlexNet neural network with large inputs, which runs in around five minutes, several orders of magnitude faster than other state-of-the-art non-interactive HE solutions.
No abstract
Modern industrial systems now, more than ever, require secure and efficient ways of communication. The trend of making connected, smart architectures is beginning to show in various fields of the industry such as manufacturing and logistics. The number of IoT (Internet of Things) devices used in such systems is naturally increasing and industry leaders want to define business processes which are reliable, reproducible, and can be effortlessly monitored. With the rise in number of connected industrial systems, the number of used IoT devices also grows and with that some challenges arise. Cybersecurity in these types of systems is crucial for their wide adoption. Without safety in communication and threat detection and prevention techniques, it can be very difficult to use smart, connected systems in the industry setting. In this paper we describe two real-world examples of such systems while focusing on our architectural choices and lessons learned. We demonstrate our vision for implementing a connected industrial system with secure data flow and threat detection and mitigation strategies on real-world data and IoT devices. While our system is not an off-the-shelf product, our architecture design and results show advantages of using technologies such as Deep Learning for threat detection and Blockchain enhanced communication in industrial IoT systems and how these technologies can be implemented. We demonstrate empirical results of various components of our system and also the performance of our system as-a-whole. INDEX TERMSAnomaly Detection, Blockchain, Cybersecurity, Deep Learning, Internet of Things I. INTRODUCTIONDespite the fact that the IIoT (Industrial Internet of Things) has a profound impact on many industry domains, a major barrier towards IIoT adoption lies in cybersecurity issues that make it extremely difficult to harness its full potential: IIoT systems dramatically increase the attack surface
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.