2011
DOI: 10.1109/tsp.2011.2144587

Variational Bayesian Learning of Probabilistic Discriminative Models With Latent Softmax Variables

Abstract: This paper presents new variational Bayes (VB) approximations for learning probabilistic discriminative models with latent softmax variables, such as subclass-based multimodal softmax and mixture of experts models. The VB approximations derived here lead to closed-form approximate parameter posteriors and suitable metrics for model selection. Unlike other Bayesian methods for this challenging class of models, the proposed VB methods require neither restrictive structural assumptions nor sampling approximations…
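As context for the model class named in the abstract, the following is a minimal sketch (my own illustrative code, not the authors' implementation) of a subclass-based multimodal softmax likelihood: the latent softmax variable is the subclass indicator, a softmax is taken over all subclasses, and subclass probabilities are summed within the class that owns them.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mms_class_probs(x, W, b, subclass_to_class):
    """Subclass-based multimodal softmax (MMS) class probabilities.

    x                 : (d,) input features
    W, b              : (S, d), (S,) subclass weights and biases
    subclass_to_class : (S,) integer class label owning each subclass
    Returns           : (C,) class probabilities
    """
    sub_probs = softmax(W @ x + b)                         # softmax over latent subclasses
    n_classes = subclass_to_class.max() + 1
    class_probs = np.zeros(n_classes)
    np.add.at(class_probs, subclass_to_class, sub_probs)   # marginalize out the subclass variable
    return class_probs

# Toy example: class 0 has two subclasses (multimodal), class 1 has one.
rng = np.random.default_rng(1)
W, b = rng.normal(size=(3, 2)), np.zeros(3)
print(mms_class_probs(np.array([0.5, -1.0]), W, b, np.array([0, 0, 1])))
```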

Cited by 19 publications (5 citation statements)
References: 30 publications
“…After the vector representation with lexical weights is calculated using the attention mechanism, it is concatenated and fed into a dimensionality-reducing fully connected layer, and then classified using the SoftMax function [12].…”
Section: A Topic Segmentation Model Based On Model Transfer
confidence: 99%
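The pipeline quoted above (attention-weighted representation, dimensionality-reducing fully connected layer, softmax classifier) can be sketched as follows. All dimensions, the dot-product attention scoring, and the tanh activation are illustrative assumptions, not details taken from the citing paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n, d, h, k = 8, 32, 16, 4            # tokens, embedding dim, reduced dim, classes
tokens = rng.normal(size=(n, d))     # token embeddings (stand-in values)
query = rng.normal(size=d)           # learned query vector (stand-in)

# 1) Attention: lexical weights from a dot-product score, then a weighted sum.
attn = softmax(tokens @ query)               # (n,) attention weights
sentence_vec = attn @ tokens                 # (d,) weighted representation

# 2) Dimensionality-reducing fully connected layer (placeholder weights).
W_fc, b_fc = rng.normal(size=(d, h)), np.zeros(h)
hidden = np.tanh(sentence_vec @ W_fc + b_fc)

# 3) Softmax classifier over k classes.
W_out, b_out = rng.normal(size=(h, k)), np.zeros(k)
class_probs = softmax(hidden @ W_out + b_out)
print(class_probs)
```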
“…It is assumed that the detection rate is high enough and ζ_k is highly accurate so that observation uncertainty and data association ambiguity for this type of observation do not need to be considered. For the likelihood function p(ζ_k | x), a 2D multi-modal softmax (MMS) model [35] shown in Fig. 7 is used, and weights and biases of the MMS model are calculated offline [26] and are modified online based on the rover's position and orientation.…”
Section: A Problem Setup
confidence: 99%
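For reference, the multimodal softmax observation likelihood cited here has the general form below; the subclass index set σ(j) is my own labeling, and the dependence of the weights on rover pose is handled by the online modification mentioned in the quote rather than shown explicitly.

$$
p(\zeta_k = j \mid x) \;=\; \frac{\sum_{s \in \sigma(j)} \exp\!\left(w_s^\top x + b_s\right)}{\sum_{r} \exp\!\left(w_r^\top x + b_r\right)},
$$

where x is the 2D position, each observation label j owns a set of subclasses σ(j), and (w_s, b_s) are the offline-computed weights and biases.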
“…The conditional distribution for the unknown noise variance R_{k−1} at time k−1 could be approximated as an Inverse-Gamma distribution, because the Inverse-Gamma distribution is the conjugate prior distribution for the variance of a Gaussian distribution [17]. The Inverse-Gamma distribution for the noise variance could be represented as…”
Section: Analytic Implementation of the Extended PHD Filter
confidence: 99%
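The conjugacy invoked in this quote is the standard Inverse-Gamma/Gaussian update, stated here in generic notation rather than the cited paper's:

$$
\sigma^2 \sim \mathrm{IG}(\alpha, \beta), \qquad
x_i \mid \sigma^2 \sim \mathcal{N}(\mu, \sigma^2),\ i = 1,\dots,n
\;\;\Longrightarrow\;\;
\sigma^2 \mid x_{1:n} \sim \mathrm{IG}\!\left(\alpha + \tfrac{n}{2},\; \beta + \tfrac{1}{2}\sum_{i=1}^{n}(x_i - \mu)^2\right).
$$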
“…To solve this problem, the true posterior distribution v_{k|k}(x_k, R_k | Z_{1:k}) can be approximated by a distribution that is computationally tractable, which is called the VB approximation [17][18][19]. The key idea of VB is to find a tractable approximation to the true posterior density that minimises the Kullback–Leibler (KL) divergence.…”
Section: Proposition
confidence: 99%
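The minimisation mentioned above is typically posed over a factorised (mean-field) family; in the notation of the quote, with the factorisation being the standard assumption rather than a detail taken from the citing paper:

$$
q^\star(x_k, R_k) \;=\; \arg\min_{q \in \mathcal{Q}}\; \mathrm{KL}\!\left(q(x_k, R_k)\,\big\|\, v_{k|k}(x_k, R_k \mid Z_{1:k})\right),
\qquad
\mathcal{Q} = \{\, q : q(x_k, R_k) = q_x(x_k)\, q_R(R_k) \,\}.
$$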