Jasper Tan scite author profile

A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (cf. deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models are more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving, for an overparameterized linear regression model with Gaussian data, that the membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we study different methods for mitigating such attacks in the overparameterized regime, such as noise addition and regularization, and conclude that simply reducing the number of parameters of an overparameterized model is an effective strategy to protect it from membership inference without greatly degrading its generalization performance.


Face Detection and Verification Using Lensless Cameras

Tan

Niu

Adams

et al. 2019

IEEE Trans. Comput. Imaging


CANOPIC: Pre-Digital Privacy-Enhancing Encodings for Computer Vision

Tan

Khan

Boominathan

et al. 2020


MINER: Multiscale Implicit Neural Representations

Saragadam

Tan

Balakrishnan

et al. 2022

Preprint


MINER: Multiscale Implicit Neural Representation

Saragadam

Tan

Balakrishnan

et al. 2022


Near-Linear-Phase IIR Filters Using Gauss-Newton Optimization

Tan

Burrus

2019


Wearing A Mask: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels

Alemohammad

Babaei

Balestriero

et al. 2021


High dimensionality poses many challenges to the use of data, from visualization and interpretation to prediction and storage for historical preservation. Techniques abound to reduce the dimensionality of fixed-length sequences, yet these methods rarely generalize to variable-length sequences. To address this gap, we extend existing kernel-based methods to variable-length sequences using the Recurrent Neural Tangent Kernel (RNTK). Since a deep neural network with ReLU activation is a Max-Affine Spline Operator (MASO), we dub our approach Max-Affine Spline Kernel (MASK). We demonstrate how MASK can be used to extend principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), and we apply these new algorithms to separate synthetic time series data sampled from second-order differential equations.




Jasper Tan

Towards Photorealistic Reconstruction of Highly Multiplexed Lensless Images

Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference
