Neural Networks as Model Selection with Incremental MDL Normalization

Lin, Baihan

doi:10.1007/978-981-15-1398-5_14

Cited by 2 publications

(4 citation statements)

References 11 publications


“…To put this direction in perspective, we mention that this line of work is a series of three. The entry point is [16], where we propose a perspective to understand neural network optimization as a partially observable model selection problem. In our subsequent work [18], we introduce the details of how to approximate the minimum description length (MDL) between neural network layers and demonstrate that using MDL as the regularity information is useful, from an engineering angle, for neural networks to learn from certain input data distributions.…”

Section: Results (mentioning)

confidence: 99%

“…We introduce higher-order simplicial structure as a new summary statistic, and discover that these networks contain an abundance of cliques of single-cell profiles bound into cavities that guide the emergence of more complicated habitation forms [15,19]. To further disentangle information flow, we develop a mathematical filtration technique to compute nerve balls in a dual metric of space and time [20] and an information-theoretical measure among network modules [16,18]. This work aims to solve the following two analytical problems.…”

Section: Introduction (mentioning)

confidence: 99%


Geometric and Topological Inference for Deep Representations of Complex Networks

Lin

2022

Preprint

Self Cite


Understanding the deep representations of complex networks is an important step in building interpretable and trustworthy machine learning applications in the age of the internet. Global surrogate models that approximate the predictions of a black box model (e.g. an artificial or biological neural net) are usually used to provide valuable theoretical insights for model interpretability. In order to evaluate how well a surrogate model can account for the representation in another model, we need to develop inference methods for model comparison. Previous studies have compared models and brains in terms of their representational geometries (characterized by the matrix of distances between representations of the input patterns in a model layer or cortical area). In this study, we propose to explore these summary statistical descriptions of representations in models and brains as part of a broader class of statistics that emphasize the topology as well as the geometry of representations. The topological summary statistics build on topological data analysis (TDA) and other graph-based methods. We evaluate these statistics in terms of the sensitivity and specificity that they afford when used for model selection, with the goal of relating different neural network models to each other and making inferences about the computational mechanism that might best account for a black box representation. These new methods enable brain and computer scientists to visualize the dynamic representational transformations learned by brains and models, and to perform model-comparative statistical inference.



“…In the neural network setting where the optimization process is performed in batches (as incremental data sample […] with i denoting the batch i), the model selection process is formulated as a partially observable problem (as in Figure 3 and [5]). The generative model […] is the function parameterized by […] that maps from […] (input layer […]) to […] (the activations of layer k) at training time i.…”

Section: Neural Network As Model Selection (mentioning)

confidence: 99%

“…In this paper, we adopt a similar definition of implicit space as in [4], but extend it beyond unsupervised learning, into a generic neural network optimization problem in both supervised and unsupervised settings [5]. In addition, we consider the formulation and computation of description length differently.…”

Section: Introduction (mentioning)

confidence: 99%

Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers

Lin

2021

Entropy

Self Cite


Inspired by the adaptation phenomenon of neuronal firing, we propose regularity normalization (RN) as an unsupervised attention mechanism (UAM) which computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, regularity normalization constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrate the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced and non-stationary input distributions in image classification, classic control, procedurally-generated reinforcement learning, generative modeling, handwriting generation and question answering tasks with various neural network architectures. Lastly, the unsupervised attention mechanism is a useful probing tool for neural networks by tracking the dependency and critical learning stages across layers and recurrent time steps of deep networks.
