Chaitali Choudhary scite author profile

Chaitali Choudhary

5Publications

35Citation Statements Received

177Citation Statements Given

How they've been cited

How they cite others

114

177

Affiliations

University of Petroleum and Energy Studies, Indian Institute of Technology Bhilai, University of Washington Tacoma

Publications

Order By: Most citations

MobDBTest: A machine learning based system for predicting diabetes risk using mobile devices

Sowjanya

Singhal

Choudhary

2015

View full text Add to dashboard Cite

Diabetes mellitus (DM) is reaching possibly epidemic proportions in India. The degree of disease and destruction due to diabetes and its potential complications are enormous, and originated a significant health care burden on both households and society. The concerning factor is that diabetes is now being proven to be linked with a number of complications and to be occurring at a comparatively younger age in the country. In India, the migration of people from rural to urban areas and corresponding modification in lifestyle are all moving the degree of diabetes. Deficiency of knowledge about diabetes causes untimely death among the population at large. Therefore, acquiring a proficiency that should spread awareness about diabetes may affect the people in India. In this work, a mobile/android application based solution to overcome the deficiency of awareness about diabetes has been shown. The application uses novel machine learning techniques to predict diabetes levels for the users. At the same time, the system also provides knowledge about diabetes and some suggestions on the disease. A comparative analysis of four machine learning (ML) algorithms were performed. The Decision Tree (DT) classifier outperforms amongst the 4 ML algorithms. Hence, DT classifier is used to design the machinery for the mobile application for diabetes prediction using real world dataset collected from a reputed hospital in the Chhattisgarh state of India.

show abstract

Privacy-preserving training of tree ensembles over continuous data

Adams

Choudhary

Cock

et al. 2022

View full text Add to dashboard Cite

Most existing Secure Multi-Party Computation (MPC) protocols for privacy-preserving training of decision trees over distributed data assume that the features are categorical. In real-life applications, features are often numerical. The standard “in the clear” algorithm to grow decision trees on data with continuous values requires sorting of training examples for each feature in the quest for an optimal cut-point in the range of feature values in each node. Sorting is an expensive operation in MPC, hence finding secure protocols that avoid such an expensive step is a relevant problem in privacy-preserving machine learning. In this paper we propose three more efficient alternatives for secure training of decision tree based models on data with continuous features, namely: (1) secure discretization of the data, followed by secure training of a decision tree over the discretized data; (2) secure discretization of the data, followed by secure training of a random forest over the discretized data; and (3) secure training of extremely randomized trees (“extra-trees”) on the original data. Approaches (2) and (3) both involve randomizing feature choices. In addition, in approach (3) cut-points are chosen randomly as well, thereby alleviating the need to sort or to discretize the data up front. We implemented all proposed solutions in the semi-honest setting with additive secret sharing based MPC. In addition to mathematically proving that all proposed approaches are correct and secure, we experimentally evaluated and compared them in terms of classification accuracy and runtime. We privately train tree ensembles over data sets with thousands of instances or features in a few minutes, with accuracies that are at par with those obtained in the clear. This makes our solution more efficient than the existing approaches, which are based on oblivious sorting.

show abstract

Privacy-Preserving Training of Tree Ensembles over Continuous Data

Adams¹,

Choudhary²,

Cock³

et al. 2021

Preprint

View full text Add to dashboard Cite

Most existing Secure Multi-Party Computation (MPC) protocols for privacy-preserving training of decision trees over distributed data assume that the features are categorical. In real-life applications, features are often numerical. The standard "in the clear" algorithm to grow decision trees on data with continuous values requires sorting of training examples for each feature in the quest for an optimal cut-point in the range of feature values in each node. Sorting is an expensive operation in MPC, hence finding secure protocols that avoid such an expensive step is a relevant problem in privacy-preserving machine learning. In this paper we propose three more efficient alternatives for secure training of decision tree based models on data with continuous features, namely: (1) secure discretization of the data, followed by secure training of a decision tree over the discretized data; (2) secure discretization of the data, followed by secure training of a random forest over the discretized data; and (3) secure training of extremely randomized trees ("extratrees") on the original data. Approaches ( 2) and (3) both involve randomizing feature choices. In addition, in approach (3) cutpoints are chosen randomly as well, thereby alleviating the need to sort or to discretize the data up front. We implemented all proposed solutions in the semi-honest setting with additive secret sharing based MPC. In addition to mathematically proving that all proposed approaches are correct and secure, we experimentally evaluated and compared them in terms of classification accuracy and runtime. We privately train tree ensembles over data sets with 1000s of instances or features in a few minutes, with accuracies that are at par with those obtained in the clear. This makes our solution orders of magnitude more efficient than the existing approaches, which are based on oblivious sorting.

show abstract

Improving the Efficiency of Ensemble Classifier Adaptive Random Forest with Meta Level Learning for Real-Time Data Streams

Arya

Choudhary

2020

View full text Add to dashboard Cite

SARWAS: Deep ensemble learning techniques for sentiment based recommendation system

Choudhary

Singh

Kumar

2023

Expert Systems with Applications

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.