2021
DOI: 10.48550/arxiv.2110.12770
Preprint
DP-XGBoost: Private Machine Learning at Scale

Abstract: The big-data revolution announced ten years ago [MCB+11] does not seem to have fully happened at the expected scale [Ana16]. One of the main obstacles has been the lack of data circulation, and one of the many reasons people and organizations did not share as much as expected is the privacy risk associated with data-sharing operations. There have been many works on practical systems to compute statistical queries with Differential Privacy (DP). There have also been practical implementations of systems…
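To make the abstract's notion of "statistical queries with Differential Privacy" concrete, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. The function name `dp_count` and the example data are illustrative, not from the paper; the mechanism itself (sensitivity-1 count plus Laplace(1/ε) noise) is the textbook construction.

```python
import math
import random

def dp_count(values, predicate, epsilon: float) -> float:
    """Answer a counting query with epsilon-DP via the Laplace mechanism.

    A count has sensitivity 1 (adding or removing one record changes the
    result by at most 1), so adding Laplace(1/epsilon) noise yields
    epsilon-differential privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    # Inverse-CDF sampling from Laplace(0, 1/epsilon).
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Example: private count of incomes above 50k (true count is 3).
incomes = [32_000, 58_000, 71_000, 45_000, 90_000]
print(dp_count(incomes, lambda x: x > 50_000, epsilon=1.0))
```

Smaller ε means a larger noise scale and stronger privacy; production systems would additionally track the privacy budget across queries.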

Cited by 1 publication (3 citation statements)
References 14 publications
“…For the rest of this paper, we present algorithms as if the data were held centrally, with the understanding that all the operations we use can be performed in the federated model (with rounding to fixed precision). This means that we avoid techniques designed for central evaluation such as the exponential mechanism [33,45,65]. Threat Model: In this work, in common with many other works in the federated setting, we assume an honest-but-curious model, where the clients do not trust others with their raw data.…”
Section: The Federated Model of Computation
Confidence: 99%
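The "rounding to fixed precision" the citing authors mention is what lets real-valued statistics be aggregated in a federated ring (e.g., under secure aggregation). A minimal sketch of such fixed-point encoding follows; the scale, modulus, and function names are assumptions for illustration, not details from either paper.

```python
SCALE = 1 << 16          # fixed-point scale: 16 fractional bits
MODULUS = 1 << 32        # ring size for modular aggregation

def encode(x: float) -> int:
    # Round to fixed precision and map into the ring [0, MODULUS).
    return int(round(x * SCALE)) % MODULUS

def decode(n: int) -> float:
    # Map back from the ring, interpreting the top half as negatives.
    if n >= MODULUS // 2:
        n -= MODULUS
    return n / SCALE

# Clients encode local values; the server sums in the ring and decodes once.
client_values = [0.25, -1.5, 3.125]
aggregate = sum(encode(v) for v in client_values) % MODULUS
print(decode(aggregate))  # → 1.875
```

Because all arithmetic happens modulo a fixed ring size, the same sums can be computed obliviously by a secure-aggregation protocol without revealing individual clients' values.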
“…While this is suitable in non-private settings, it is difficult to calculate such quantiles (or quantile sketches) accurately without incurring an appreciable privacy cost. Existing work on DP-GBDTs has computed split candidates either with LDP quantiles in the local setting [43], DP quantiles in the central setting [33] or with MPC methods (without DP guarantees) in distributed settings [61]. As we assume bounds on features are public knowledge, we do not need to query participants' data, and hence 𝜅 𝑐 = 0.…”
Section: Component 3: Generating Split Candidates
Confidence: 99%