Privacy protection for machine learning is an active topic in information security. This paper combines differential privacy with AdaBoost, an ensemble classification algorithm, and proposes a differentially private scheme named CART-DPsAdaBoost (CART-Differential privacy structure of AdaBoost). During boosting, the algorithm incorporates the idea of bagging and uses a classification and regression tree (CART) stump as the base learner. Feature perturbation is applied through a random subspace algorithm. For continuous attributes, the exponential mechanism selects the splitting point; for discrete attributes, the Gini index is used to find the optimal binary partitioning point, and noise is added according to the Laplace mechanism. A privacy budget is allocated throughout the process to meet the differential privacy requirements of the application at hand. Unlike similar algorithms, the method does not require the data to be discretized during preprocessing. Experimental results on the Census Income, Digit Recognizer, and Adult data sets show that the scheme protects private information with little impact on classification accuracy and can effectively handle large-scale, high-dimensional classification problems.
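The two noise mechanisms named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the toy data, the public candidate grid, the use of negative weighted Gini impurity as the utility, and the sensitivity values are all assumptions made for illustration. The abstract does not say how the scheme scores candidate splits, which quantities receive Laplace noise, or how the budget is divided.

```python
import math
import random


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_utility(values, labels, threshold):
    """Assumed utility: negative weighted Gini impurity of the two sides."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
    return -weighted


def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity))."""
    scores = [utility(c) for c in candidates]
    top = max(scores)  # shift by the maximum for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two
    independent exponential samples with rate 1 / scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism for a counting query (assumed sensitivity 1)."""
    return count + laplace_noise(sensitivity / epsilon, rng)


# Toy data (hypothetical): a continuous attribute and a binary label.
ages = [22, 25, 31, 38, 45, 52, 58, 63]
income = [0, 0, 0, 0, 1, 1, 1, 1]

# Candidate thresholds come from a public, data-independent grid.
grid = list(range(20, 70, 5))

rng = random.Random(0)
threshold = exponential_mechanism(
    grid,
    lambda t: split_utility(ages, income, t),
    epsilon=1.0,
    sensitivity=2.0,  # assumed bound on the utility's sensitivity
    rng=rng,
)
left_size = sum(1 for a in ages if a <= threshold)
print(threshold, noisy_count(left_size, epsilon=0.5, rng=rng))
```

In this sketch the split selection spends ε = 1.0 and the noisy count spends ε = 0.5, so by sequential composition one stump would consume 1.5 of the total budget. A smaller ε makes the threshold choice closer to uniform over the grid and the count noisier, which is the accuracy and privacy trade-off the abstract refers to.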