2022
DOI: 10.48550/arxiv.2202.06438
Preprint

Learning from Randomly Initialized Neural Network Features

Abstract: We present the surprising result that randomly initialized neural networks are good feature extractors in expectation. These random features correspond to finite-sample realizations of what we call Neural Network Prior Kernel (NNPK), which is inherently infinite-dimensional. We conduct ablations across multiple architectures of varying sizes as well as initializations and activation functions. Our analysis suggests that certain structures that manifest in a trained model are already present at initialization. …
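The headline claim lends itself to a toy illustration: draw a network, never train it, and use its last hidden layer as a feature map for a linear probe. The sketch below is a minimal NumPy illustration of that idea, not the paper's experimental setup; the MLP architecture, the layer widths, and the He-style initialization are assumptions.

```python
import numpy as np

def random_mlp_features(X, widths=(256, 128), seed=0):
    """Push inputs through a randomly initialized ReLU MLP and return
    the last hidden layer's activations as features. The network is
    never trained."""
    rs = np.random.default_rng(seed)
    h = X
    for w in widths:
        # He-style initialization scale for ReLU layers (an assumption).
        W = rs.normal(0.0, np.sqrt(2.0 / h.shape[1]), size=(h.shape[1], w))
        h = np.maximum(h @ W, 0.0)
    return h

# Toy data: two Gaussian blobs in 20 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (200, 20)),
               rng.normal(1.0, 1.0, (200, 20))])
y = np.array([0] * 200 + [1] * 200)

Phi = random_mlp_features(X)
# Linear probe: least-squares fit on +/-1 targets over the random features.
w, *_ = np.linalg.lstsq(Phi, 2.0 * y - 1.0, rcond=None)
acc = np.mean((Phi @ w > 0) == y)
print(f"linear-probe accuracy on random features: {acc:.2f}")
```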

Cited by 4 publications (4 citation statements)
References 33 publications
“…Here we list two reasons based on the observations from previous work. First, randomly initialized networks are reported to produce powerful representations for multiple computer vision tasks (Saxe et al., 2011; Cao et al., 2018; Amid et al., 2022). Second, such random networks are shown to perform a distance-preserving embedding of the data, i.e.…”
Section: Discussion
confidence: 99%
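The distance-preservation point in this excerpt can be checked numerically. The snippet below uses a single random linear map, the simplest case where pairwise distances are preserved in expectation (Johnson-Lindenstrauss style); the output width and the 1/sqrt(d) scaling are assumptions, and the linear map is a stand-in for the full random networks the quote refers to.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 50))

# One random linear map, scaled so squared distances are preserved
# in expectation; the output width is an assumption.
d_out = 512
W = rng.normal(0.0, 1.0 / np.sqrt(d_out), size=(50, d_out))
Z = X @ W

def pdist(A):
    # Pairwise Euclidean distances via the Gram-matrix identity.
    G = A @ A.T
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G
    return np.sqrt(np.maximum(sq, 0.0))

i, j = np.triu_indices(100, 1)
ratio = pdist(Z)[i, j] / pdist(X)[i, j]
print(f"embedded/input distance ratio: "
      f"mean={ratio.mean():.3f}, std={ratio.std():.3f}")
```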
“…The NPCK bears a close similarity to the neural network prior kernel (NNPK), which is defined by Amid et al. (2022) as…”
Section: Neural Posterior Correlation Kernel
confidence: 99%
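The quoted definition is cut off. For orientation only, here is a hedged reconstruction consistent with the abstract above (the NNPK as an expectation over random initializations of feature inner products); the feature map and the initialization distribution are written in our own notation, not necessarily the citing paper's.

```latex
% Hedged reconstruction (notation ours, not the citing paper's):
% \phi_\theta(x) is the network's feature map at initialization \theta,
% and p(\theta) is the initialization distribution.
\[
  \kappa_{\mathrm{NNPK}}(x, x') \;=\;
  \mathbb{E}_{\theta \sim p(\theta)}
  \bigl[ \langle \phi_\theta(x), \phi_\theta(x') \rangle \bigr]
\]
```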
“…ii) Randomly initialized models are not valid embedding functions for the Maximum Mean Discrepancy (MMD) [18] estimation used in DM. Specifically, DM justifies the validity of randomly initialized models by their intrinsic classification power observed in tasks such as deep clustering [3,5,6,36]. However, we believe that this does not apply to DM, as randomly initialized models do not satisfy the requirements of embedding functions used in MMD, making it an invalid measure of distribution distance.…”
Section: Optimization-oriented Methods
confidence: 98%
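For context on this objection: MMD with a fixed embedding reduces to comparing statistics of the embedded samples, and in the common linear-kernel form it is the squared distance between mean embeddings. The sketch below illustrates that form with a randomly initialized embedding; it is an illustrative stand-in under assumed sizes and architecture, not DM's actual estimator.

```python
import numpy as np

def random_relu_embed(X, d_out=256, seed=0):
    """A randomly initialized one-layer ReLU embedding: an illustrative
    stand-in for the random models the quote discusses, not DM's setup."""
    rs = np.random.default_rng(seed)
    W = rs.normal(0.0, np.sqrt(2.0 / X.shape[1]), size=(X.shape[1], d_out))
    return np.maximum(X @ W, 0.0)

def mmd2_mean_embedding(X, Y, embed):
    """MMD^2 under the linear kernel in the embedding space, i.e. the
    squared distance between the two samples' mean embeddings."""
    return float(((embed(X).mean(0) - embed(Y).mean(0)) ** 2).sum())

rng = np.random.default_rng(2)
A = rng.normal(0.0, 1.0, (256, 32))
B = rng.normal(0.4, 1.0, (256, 32))  # shifted distribution
A2 = rng.normal(0.0, 1.0, (256, 32))  # fresh sample, same distribution as A
print(f"MMD^2(A, B)  = {mmd2_mean_embedding(A, B, random_relu_embed):.4f}")
print(f"MMD^2(A, A') = {mmd2_mean_embedding(A, A2, random_relu_embed):.4f}")
```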