2014
DOI: 10.1063/1.4903703

Convex foundations for generalized MaxEnt models

Abstract: We present an approach to maximum entropy models that highlights the convex geometry and duality of generalized exponential families (GEFs) and their connection to Bregman divergences. Using our framework, we are able to resolve a puzzling aspect of the bijection of [1] between classical exponential families and what they call regular Bregman divergences. Their regularity condition rules out all but Bregman divergences generated from log-convex generators. We recover their bijection and show that a much broader class of divergences co…
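Not from the paper itself: the sketch below illustrates the standard Bregman-divergence construction the abstract refers to, D_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩ for a convex generator φ. The generators shown (negative entropy, which recovers the KL divergence behind classical MaxEnt, and the squared Euclidean norm) are textbook examples; all identifiers are our own.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# Negative-entropy generator phi(p) = sum_i p_i log p_i: for probability
# vectors, the resulting divergence is exactly KL(p || q).
neg_entropy = lambda p: np.sum(p * np.log(p))
neg_entropy_grad = lambda p: np.log(p) + 1.0

# Squared-norm generator phi(x) = 0.5 ||x||^2 gives 0.5 ||x - y||^2.
sq_norm = lambda x: 0.5 * np.dot(x, x)
sq_norm_grad = lambda x: x

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print(bregman(neg_entropy, neg_entropy_grad, p, q))  # KL(p || q)
print(bregman(sq_norm, sq_norm_grad, p, q))          # 0.5 * ||p - q||^2
```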

Cited by 6 publications (15 citation statements)
References 17 publications
Citing publications: 2016–2024
Citation types: 0 supporting, 15 mentioning, 0 contrasting

Citation statements (ordered by relevance):
“…The form of ρ_t is to be compared to the generalized exponential family in [24] and the generalized MaxEnt models of [21].…”
Section: Comparison of the Bound on an Example in the Continuous Case (mentioning)
confidence: 99%
“…where θ ∈ ℝ^d is a vector of (possibly negative) prediction scores produced by a model f_W(x) and p ∈ △^d is a discrete probability distribution. It is a generalized exponential family distribution (Grünwald and Dawid, 2004; Frongillo and Reid, 2014) with natural parameter θ ∈ ℝ^d and regularization Ω. Of particular interest is the case where ŷ_Ω(θ) is sparse, meaning that there are scores θ for which the resulting ŷ_Ω(θ) assigns zero probability to some classes.…”
Section: Probabilistic Prediction with Fenchel-Young Losses (mentioning)
confidence: 99%
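As an aside on the sparsity this excerpt highlights: the sketch below (identifiers ours, not code from the cited paper) contrasts the softmax, which arises from entropic regularization Ω and never produces exact zeros, with the sparsemax of Martins and Astudillo (2016), the Ω(p) = ½‖p‖² instance, which can assign exactly zero probability to low-scoring classes.

```python
import numpy as np

def softmax(theta):
    """y_Omega(theta) with Omega = negative Shannon entropy: never exactly sparse."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def sparsemax(theta):
    """y_Omega(theta) with Omega(p) = 0.5 ||p||^2: Euclidean projection of theta
    onto the probability simplex (Martins & Astudillo, 2016)."""
    z = np.sort(theta)[::-1]              # scores in decreasing order
    cssv = np.cumsum(z) - 1.0
    k = np.arange(1, len(theta) + 1)
    support = z - cssv / k > 0            # coordinates kept in the support
    tau = cssv[support][-1] / k[support][-1]
    return np.maximum(theta - tau, 0.0)

theta = np.array([2.0, 1.2, -0.5])
print(softmax(theta))    # all entries strictly positive
print(sparsemax(theta))  # [0.9, 0.1, 0.0]: the last class gets exactly zero
```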
“…In this section, we discuss the concept of probability distribution regularization. Our treatment follows closely the variational formulations of exponential families (Barndorff-Nielsen, 1978; Wainwright and Jordan, 2008) and generalized exponential families (Grünwald and Dawid, 2004; Frongillo and Reid, 2014) but adopts the novel viewpoint of regularized prediction functions. We discuss two instances of that framework: the structured counterpart of the softmax, marginal inference, and a new structured counterpart of the sparsemax, structured sparsemax.…”
Section: Distribution Regularization, Marginal Inference and Structure… (mentioning)
confidence: 99%
“…In fact, Theorem 2 is so important that we state and prove a generalization of it in the appendix (Section 9), showing that dropping the "same family" constraint does not change the f-divergence (information-theoretic) vs Bregman divergence (information-geometric) picture. We now define generalizations of exponential families, following [6, 23]. Let χ : ℝ₊ → ℝ₊ be non-decreasing [41, Chapter 10].…”
Section: Definitions (mentioning)
confidence: 99%
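For concreteness, one standard instance of such a deformed exponential, though not necessarily the exact construction of [6, 23], is the Tsallis q-exponential, which recovers the ordinary exp/log at q = 1. A minimal sketch with our own identifiers:

```python
import numpy as np

def log_q(x, q):
    """Tsallis q-logarithm: ln_q(x) = (x**(1-q) - 1) / (1 - q); ln_1 = log."""
    if q == 1.0:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    """Tsallis q-exponential, inverse of ln_q on its domain:
    exp_q(x) = max(1 + (1-q) x, 0) ** (1 / (1-q)); exp_1 = exp."""
    if q == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

vals = np.array([0.5, 1.0, 2.0])
for q in (0.5, 1.0, 1.5):
    print(q, exp_q(log_q(vals, q), q))  # round-trips back to vals
```

Note that for q < 1 the q-exponential hits exactly zero outside its domain, which gives q-exponential-family densities compact support, echoing the sparsity discussion in the Fenchel-Young excerpt above.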
“…To our knowledge, there is no previously known "GAN-amenable" generalization of this identity above exponential families. Related identities have recently been proven for two generalizations of exponential families [6, Theorem 9], [23, Theorem 3], but fall short of the f-divergence formulation and are not amenable to the variational GAN formulation.…”
Section: Introduction (mentioning)
confidence: 99%
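For context, the "variational GAN formulation" this excerpt alludes to is, in its standard form (Nguyen et al., 2010; Nowozin et al., 2016; the formula is not stated in the excerpt itself):

$$ D_f(P \,\|\, Q) \;=\; \sup_{T} \; \mathbb{E}_{x \sim P}\bigl[T(x)\bigr] \;-\; \mathbb{E}_{x \sim Q}\bigl[f^{*}(T(x))\bigr], $$

where f* is the convex conjugate of f. A GAN tightens the bound by maximizing over a discriminator T while minimizing over the generator's distribution Q.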