2017 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn.2017.7966331

Truncated variational EM for semi-supervised neural simpletrons

Abstract: Inference and learning for probabilistic generative networks are often very challenging and typically prevent scalability to networks as large as those used for deep discriminative approaches. To obtain efficiently trainable, large-scale and well performing generative networks for semi-supervised learning, we here combine two recent developments: a neural network reformulation of hierarchical Poisson mixtures (Neural Simpletrons), and a novel truncated variational EM approach (TV-EM). TV-EM provides theoretical guarantees…
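To make the truncation idea named in the abstract concrete, here is a minimal Python sketch, assuming a flat (non-hierarchical) Poisson mixture with uniform priors rather than the paper's full hierarchical model; the function name `truncated_poisson_posteriors` and the truncation parameter C are illustrative only, not taken from the paper.

```python
# Hedged sketch of a truncated variational E-step for a flat Poisson mixture:
# for each data point, only the C components with the highest joint probability
# are kept; all other posterior entries are set to exact zeros.
import numpy as np
from scipy.special import gammaln

def truncated_poisson_posteriors(X, W, C=3):
    """X: (N, D) non-negative count data, W: (K, D) Poisson rates, C: truncation."""
    eps = 1e-12
    # log p(x_n | k) for a product of Poisson distributions over the D dimensions
    log_lik = X @ np.log(W.T + eps) - W.sum(axis=1)[None, :] \
              - gammaln(X + 1).sum(axis=1, keepdims=True)          # (N, K)
    N, K = log_lik.shape
    keep = np.argsort(-log_lik, axis=1)[:, :C]                      # top-C per point
    q = np.zeros((N, K))
    rows = np.arange(N)[:, None]
    kept = log_lik[rows, keep]
    kept -= kept.max(axis=1, keepdims=True)                         # numerical stability
    w = np.exp(kept)
    q[rows, keep] = w / w.sum(axis=1, keepdims=True)                # renormalise over kept set
    return q   # truncated posteriors; hard zeros outside the kept sets
```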

Cited by 8 publications (12 citation statements)
References 21 publications
“…Previous work based on a fully probabilistic description of the Hebbian-learning network model (Forster et al., 2016; Forster and Lücke, 2017) shows that local Hebbian learning converges to the weight matrix B without requiring the non-local summation over k. This is true also when using a small fraction (≈1%) of labeled training examples.…”
Section: Methods
confidence: 99%
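For orientation, a minimal Python sketch of a purely local Hebbian update follows; the weight matrix B and the index k of the cited papers are not defined on this page, so the rule shown is a generic stand-in under that assumption, not the authors' exact update.

```python
# Hedged illustration only: a generic local Hebbian rule.  Each weight W[k, d]
# changes using only the pre-synaptic input x[d], the post-synaptic activity
# y[k] and the weight itself -- no summation over the unit index k appears
# anywhere in the weight update.
import numpy as np

def local_hebbian_step(W, x, y, lr=0.01):
    """W: (K, D) weights, x: (D,) input, y: (K,) unit activations."""
    return W + lr * y[:, None] * (x[None, :] - W)
```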
“…For all the above applications, our theoretical results show that the free energy (31) is the underlying objective function which is maximized. For the algorithms (Hughes and Sudderth, 2016; Forster and Lücke, 2017b), the TV-EM application to mixture models furthermore warrants that the free energy is provably monotonically increased, which follows from Prop. 5 and has not been shown previously.…”
Section: TV-EM for Mixture Models
confidence: 96%
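For orientation, the generic truncated free energy used throughout the TV-EM literature can be written as below; the cited equation (31) is presumably an instance of this form, so the notation here is a hedged reconstruction rather than a copy of that equation.

```latex
\[
  \mathcal{F}(\mathcal{K},\Theta)
  \;=\; \sum_{n=1}^{N} \log \sum_{k \in \mathcal{K}_n} p(\vec{x}_n, k \mid \Theta),
  \qquad
  q_n(k) \;=\; \frac{p(k \mid \vec{x}_n, \Theta)\,\mathbb{1}[k \in \mathcal{K}_n]}
                    {\sum_{k' \in \mathcal{K}_n} p(k' \mid \vec{x}_n, \Theta)},
\]
```

Here $\mathcal{K}_n$ is the small set of components kept for data point $n$; monotonic increase means that updating either the sets $\mathcal{K}_n$ (E-step) or the parameters $\Theta$ (M-step) never decreases $\mathcal{F}$.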
“…The main motivation and focus of the previous truncated approximations for mixture models (Hughes and Sudderth, 2016; Forster and Lücke, 2017b) was the increase of efficiency. The source of the reduction of computational effort was hereby the hard zeros introduced by truncated posteriors, which significantly reduced the required number of numerical operations in the M-step.…”
Section: TV-EM for Mixture Models
confidence: 99%
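As a hedged illustration of how the hard zeros cut the M-step cost, the sketch below assumes the truncated posteriors from a preceding E-step are stored sparsely as per-point index sets and weights; the function name `sparse_m_step` and this storage layout are assumptions, not taken from the cited work.

```python
# Sketch: an M-step for mixture means/rates that only touches the N*C non-zero
# responsibilities instead of all N*K entries of a dense responsibility matrix.
import numpy as np

def sparse_m_step(X, keep, q_kept, K):
    """X: (N, D) data, keep: (N, C) kept component indices,
    q_kept: (N, C) corresponding responsibilities, K: number of components."""
    N, D = X.shape
    sums = np.zeros((K, D))
    counts = np.zeros(K)
    for n in range(N):                       # only C terms per data point
        for c, k in enumerate(keep[n]):
            sums[k] += q_kept[n, c] * X[n]
            counts[k] += q_kept[n, c]
    return sums / np.maximum(counts, 1e-12)[:, None]   # updated component means
```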
“…For GMM clustering with isotropic clusters, this means disregarding clusters distant from a given data point [48]. Such neglection ideas have, also more generally, been observed to reduce computational demands for probabilistic clustering approaches [25], [27], [49], [50], [51] as well as for deterministic approaches, such as k-means or agglomerative clustering, e.g., [6], [7]. For k-means, e.g.…”
Section: Related Work
confidence: 99%
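A minimal sketch of the cluster-neglection idea for a k-means-style assignment step follows; the neighborhood search shown is an assumed simplification for illustration, not the algorithm of reference [48], and the function name is hypothetical.

```python
# Sketch: instead of comparing each point with all K centers, only search the
# G centers closest to the point's previous cluster; distant clusters are
# simply disregarded.
import numpy as np

def neighborhood_assignment(X, centers, prev_assign, G=5):
    """X: (N, D) data, centers: (K, D), prev_assign: (N,) previous cluster ids."""
    K = centers.shape[0]
    G = min(G, K)
    # For every cluster, precompute its G nearest clusters (including itself).
    cc_dist = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    neighbors = np.argsort(cc_dist, axis=1)[:, :G]                  # (K, G)

    new_assign = np.empty_like(prev_assign)
    for n in range(X.shape[0]):
        cand = neighbors[prev_assign[n]]                            # candidate clusters only
        d = ((X[n] - centers[cand]) ** 2).sum(-1)
        new_assign[n] = cand[np.argmin(d)]
    return new_assign
```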