Manish Kesarwani scite author profile

¹

,

Mukhoty

²

,

Arya

³

et al. 2018

Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud that are offered on a pay-per-query basis to application developers and end users. However recent work has shown that the hosted models are susceptible to extraction attacks. Adversaries may launch queries to steal the model and compromise future query payments or privacy of the training data. In this work, we present a cloudbased extraction monitor that can quantify the extraction status of models by observing the query and response streams of both individual and colluding adversarial users. We present a novel technique that uses information gain to measure the model learning rate by users with increasing number of queries. Additionally, we present an alternate technique that maintains intelligent query summaries to measure the learning rate relative to the coverage of the input feature space in the presence of collusion. Both these approaches have low computational overhead and can easily be offered as services to model owners to warn them of possible extraction attacks from adversaries. We present performance results for these approaches for decision tree models deployed on BigML MLaaS platform, using open source datasets and different adversarial attack strategies.

Data Readiness Report

Afzal

¹

,

Rajmohan

²

,

³

et al. 2021

Secure and efficient wildcard search over encrypted data

Chatterjee

¹

,

²

,

Modi

³

et al. 2020

Int. J. Inf. Secur.

Secure k-Anonymization over Encrypted Databases

Kesarwani¹,

Kaul²,

Braghin³

et al. 2021

Preprint

Data protection algorithms are becoming increasingly important to support modern business needs for facilitating data sharing and data monetization. Anonymization is an important step before data sharing. Several organizations leverage on third parties for storing and managing data. However, third parties are often not trusted to store plaintext personal and sensitive data; data encryption is widely adopted to protect against intentional and unintentional attempts to read personal/sensitive data.Traditional encryption schemes do not support operations over the ciphertexts and thus anonymizing encrypted datasets is not feasible with current approaches. This paper explores the feasibility and depth of implementing a privacy-preserving data publishing workflow over encrypted datasets leveraging on homomorphic encryption. We demonstrate how we can achieve uniqueness discovery, data masking, differential privacy and k-anonymity over encrypted data requiring zero knowledge about the original values. We prove that the security protocols followed by our approach provide strong guarantees against inference attacks. Finally, we experimentally demonstrate the performance of our data publishing workflow components. I. INTRODUCTIONNowadays, applications interact with a plethora of potentially sensitive information from multiple sources. As an example, modern applications regularly combine data from different domains such as healthcare and IoT. While such rich sources of data are extremely valuable for analysts, researchers, marketers and other professionals, data privacy technologies and practices face several key challenges to keep pace.

Collusion-Resistant Processing of SQL Range Predicates

¹

,

Kaul

²

,

Singh

³

et al. 2018

Prior solutions for securely handling SQL range predicates in outsourced Cloud-resident databases have primarily focused on passive attacks in the Honest-but-Curious adversarial model, where the server is only permitted to observe the encrypted query processing. We consider here a significantly more powerful adversary, wherein the server can launch an active attack by clandestinely issuing specific range queries via collusion with a few compromised clients. The security requirement in this environment is that data values from a plaintext domain of size N should not be leaked to within an interval of size H. Unfortunately, all prior encryption schemes for range predicate evaluation are easily breached with only O(log 2) range queries, where = N∕H. To address this lacuna, we present SPLIT, a new encryption scheme where the adversary requires exponentially more-()range queries to breach the interval constraint and can therefore be easily detected by standard auditing mechanisms. The novel aspect of SPLIT is that each value appearing in a range-sensitive column is first segmented into two parts. These segmented parts are then independently encrypted using a layered composition of a secure block cipher with the order-preserving encryption and prefix-preserving encryption schemes, and the resulting ciphertexts are stored in separate tables. At query processing time, range predicates are rewritten into an equivalent set of table-specific sub-range predicates, and the disjoint union of their results forms the query answer. A detailed evaluation of SPLIT on benchmark database queries indicates that its execution times are well within a factor of two of the corresponding plaintext times, testifying its efficiency in resisting active adversaries.

Knowledge & Learning-based Adaptable System for Sensitive Information Identification and Handling

Kaul¹,

Kesarwani²,

Min³

et al. 2021

Preprint

Collusion-Resistant Processing of SQL Range Predicates

¹

,

Kaul

²

,

Singh

³

et al. 2018

Data Sci. Eng.

1

Prior solutions for securely handling SQL range predicates in outsourced Cloud-resident databases have primarily focused on passive attacks in the Honest-but-Curious adversarial model, where the server is only permitted to observe the encrypted query processing. We consider here a significantly more powerful adversary, wherein the server can launch an active attack by clandestinely issuing specific range queries via collusion with a few compromised clients. The security requirement in this environment is that data values from a plaintext domain of size N should not be leaked to within an interval of size H. Unfortunately, all prior encryption schemes for range predicate evaluation are easily breached with only O(log 2) range queries, where = N∕H. To address this lacuna, we present SPLIT, a new encryption scheme where the adversary requires exponentially more-()range queries to breach the interval constraint and can therefore be easily detected by standard auditing mechanisms. The novel aspect of SPLIT is that each value appearing in a range-sensitive column is first segmented into two parts. These segmented parts are then independently encrypted using a layered composition of a secure block cipher with the order-preserving encryption and prefix-preserving encryption schemes, and the resulting ciphertexts are stored in separate tables. At query processing time, range predicates are rewritten into an equivalent set of table-specific sub-range predicates, and the disjoint union of their results forms the query answer. A detailed evaluation of SPLIT on benchmark database queries indicates that its execution times are well within a factor of two of the corresponding plaintext times, testifying its efficiency in resisting active adversaries.