On Bregman Distances and Divergences of Probability Measures

Stummer, Wolfgang; Vajda, Igor

doi:10.1109/tit.2011.2178139

Cited by 46 publications

(53 citation statements)

References 34 publications


“…Another setting arises when higher dimensional aspects are compressed into real-valued functionals, such as in the case of ϕ-divergences of probability measures P and Q. In a nutshell, a ϕ-divergence is of the form

D_{\phi}(P, Q) = \int \phi\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right) \mathrm{d}Q,

where ϕ is a convex function on [0, ∞) such that ϕ(1) = 0; see, for example Stummer and Vajda () for a recent treatment and a delineation from Bregman distances. Any function ϕ of this form admits a mixture representation in terms of elementary functions or atoms

k_{\theta}(r) = |r - \theta| \, \mathbb{1}(r \land 1 \leq \theta < r \lor 1)

for θ > 0.…”

Section: Choquet Representations (mentioning)

confidence: 99%
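The definition quoted above can be made concrete for discrete distributions. The following is a minimal sketch, not taken from the cited papers: the function names, the example vectors `p` and `q`, and the choice of the Kullback-Leibler generator are illustrative assumptions. It also assumes, for a twice differentiable generator with ϕ'(1) = 0, that the mixture over the atoms k_θ uses the density ϕ''(θ); the quoted statement does not spell out the mixing measure.

```python
import math


def phi_divergence(p, q, phi):
    """D_phi(P, Q) = sum_i q_i * phi(p_i / q_i) for discrete P, Q with all q_i > 0."""
    return sum(qi * phi(pi / qi) for pi, qi in zip(p, q))


def phi_kl(r):
    """Kullback-Leibler generator: convex on [0, inf) with phi(1) = 0 and phi'(1) = 0."""
    return r * math.log(r) - r + 1.0 if r > 0 else 1.0


def k_theta(r, theta):
    """Elementary function |r - theta| * 1(min(r, 1) <= theta < max(r, 1))."""
    return abs(r - theta) if min(r, 1.0) <= theta < max(r, 1.0) else 0.0


def phi_from_atoms(r, weight, n=20000):
    """Midpoint-rule value of the integral of k_theta(r) * weight(theta) over theta.

    Only theta between min(r, 1) and max(r, 1) contributes, so we integrate there.
    """
    lo, hi = min(r, 1.0), max(r, 1.0)
    h = (hi - lo) / n
    thetas = (lo + (i + 0.5) * h for i in range(n))
    return h * sum(k_theta(r, t) * weight(t) for t in thetas)


# Hypothetical example distributions on three points.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

kl = phi_divergence(p, q, phi_kl)
kl_direct = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Assumed mixing density for the KL generator: phi''(theta) = 1 / theta.
mix_2 = phi_from_atoms(2.0, lambda t: 1.0 / t)
mix_half = phi_from_atoms(0.5, lambda t: 1.0 / t)
```

Under these assumptions, `kl` agrees with the textbook sum of p log(p/q), and the numerically integrated mixture reproduces `phi_kl` at r = 2 and r = 0.5.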

Of Quantiles and Expectiles: Consistent Scoring Functions, Choquet Representations and Forecast Rankings

Ehm, Gneiting, Jordan et al. 2016

Journal of the Royal Statistical Society Series B: Statistical Methodology


Summary. In the practice of point prediction, it is desirable that forecasters receive a directive in the form of a statistical functional. For example, forecasters might be asked to report the mean or a quantile of their predictive distributions. When evaluating and comparing competing forecasts, it is then critical that the scoring function used for these purposes be consistent for the functional at hand, in the sense that the expected score is minimized when following the directive. We show that any scoring function that is consistent for a quantile or an expectile functional can be represented as a mixture of elementary or extremal scoring functions that form a linearly parameterized family. Scoring functions for the mean value and probability forecasts of binary events constitute important examples. The extremal scoring functions admit appealing economic interpretations of quantiles and expectiles in the context of betting and investment problems. The Choquet-type mixture representations give rise to simple checks of whether a forecast dominates another in the sense that it is preferable under any consistent scoring function. In empirical settings it suffices to compare the average scores for only a finite number of extremal elements. Plots of the average scores with respect to the extremal scoring functions, which we call Murphy diagrams, permit detailed comparisons of the relative merits of competing forecasts.


“…To embed the abovementioned two major divergence classes as a special case of Definition , let us first mention that Stummer and Stummer and Vajda have shown that in case of ϕ(1) = 0 for the choice m(y) := q(y) (y ∈ 𝒴) the scaled Bregman distance becomes

B_{\phi}(P, Q \,\|\, Q) = \int_{\mathcal{Y}} q(y) \cdot \phi\left(\frac{p(y)}{q(y)}\right) \mathrm{d}\lambda(y) =: D_{\phi}^{CAS}(P, Q),

which is nothing but the well-known ϕ-divergence between P and Q (and on the right-hand side, one has to additionally subtract ϕ(1) in case that it is not zero). The latter has been first studied by Csiszár and Ali and Silvey.…”

Section: The Divergence Framework (mentioning)

confidence: 99%
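The reduction described in this statement can be checked numerically in the discrete case. The sketch below assumes the usual form of the scaled Bregman distance, B_ϕ(P, Q ‖ M) = Σ m [ϕ(p/m) − ϕ(q/m) − ϕ'(q/m)(p/m − q/m)], which is not written out in the quoted text; the function names and the example vectors are hypothetical. It illustrates both special cases: scaling m = q yields the ϕ-divergence minus ϕ(1), and scaling m ≡ 1 yields the ordinary separable Bregman distance.

```python
import math


def scaled_bregman(p, q, m, phi, dphi):
    """Discrete scaled Bregman distance with scaling vector m (all entries > 0).

    Assumed form: sum_i m_i * [phi(s_i) - phi(t_i) - dphi(t_i) * (s_i - t_i)]
    with s_i = p_i / m_i and t_i = q_i / m_i.
    """
    total = 0.0
    for pi, qi, mi in zip(p, q, m):
        s, t = pi / mi, qi / mi
        total += mi * (phi(s) - phi(t) - dphi(t) * (s - t))
    return total


def phi_divergence(p, q, phi):
    """sum_i q_i * phi(p_i / q_i), the integral on the right-hand side above."""
    return sum(qi * phi(pi / qi) for pi, qi in zip(p, q))


def phi_kl(r):
    """Kullback-Leibler generator with phi(1) = 0 (requires r > 0)."""
    return r * math.log(r) - r + 1.0


def phi_offset(r):
    """Same generator shifted so that phi(1) = 2, to exercise the 'subtract phi(1)' remark."""
    return phi_kl(r) + 2.0


def dphi_kl(r):
    """Derivative of both generators above."""
    return math.log(r)


# Hypothetical strictly positive probability vectors.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

# Case 1: scaling m = q with phi(1) = 0 gives the phi-divergence itself.
b_q = scaled_bregman(p, q, q, phi_kl, dphi_kl)
d_phi = phi_divergence(p, q, phi_kl)

# Case 2: same scaling, but phi(1) = 2 must be subtracted on the right-hand side.
b_q_offset = scaled_bregman(p, q, q, phi_offset, dphi_kl)
d_phi_offset = phi_divergence(p, q, phi_offset) - phi_offset(1.0)

# Case 3: scaling m = 1 gives the ordinary (unscaled, separable) Bregman distance.
ones = [1.0] * len(p)
b_one = scaled_bregman(p, q, ones, phi_kl, dphi_kl)
bregman = sum(phi_kl(pi) - phi_kl(qi) - dphi_kl(qi) * (pi - qi) for pi, qi in zip(p, q))
```

For the Kullback-Leibler generator all three quantities coincide with the relative entropy of `p` with respect to `q`, which is a property of this particular generator rather than of scaled Bregman distances in general.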

A new toolkit for robust distributional change detection

Kißlinger, Stummer 2018

Appl Stoch Models Bus & Ind

Self Cite


Divergences (distances), which measure the dissimilarity (or, respectively, the proximity) between two probability distributions, have turned out to be very useful for several different tasks in statistics (e.g., parameter estimation and goodness-of-fit testing), econometrics, machine learning, information theory, etc. Some prominent examples are the Kullback-Leibler information (relative entropy), the Csiszár-Ali-Silvey ϕ-divergences, the “ordinary” (i.e., unscaled) Bregman divergences, and the recently developed more general scaled Bregman divergences. Out of the latter and a novel extension to nonconvex generators, we form a new toolkit for detecting distributional changes in random data (streams and clouds). Some sample-size asymptotics are investigated as well.


“…Scaled Bregman divergences, formally introduced by Stummer [24] and Stummer and Vajda [25], unify separable Bregman divergences [10] (defined below in Section 2.3) and f-divergences [8,9]. This paper uses scaled Bregman divergences as its basis, and accomplishes the following objectives:…”

Section: Goal Of This Paper (mentioning)

confidence: 99%

“…Definition 2 (Notations) [25]: M denotes the space of all finite measures on a measurable space (X, A) and P ⊂ M the subspace of all probability measures. Unless otherwise explicitly stated, P, R, M are mutually measure-theoretically equivalent measures on (X, A) dominated by a σ-finite measure λ on (X, A).…”

Section: Bregman Divergences and Scaled Bregman Divergences (mentioning)

confidence: 99%


Scaled Bregman divergences in a Tsallis scenario

Venkatesan, Plastino 2011

Physica A: Statistical Mechanics and its Applications


There exist two different versions of the Kullback-Leibler divergence (K-Ld) in Tsallis statistics, namely the usual generalized K-Ld and the generalized Bregman K-Ld. Problems have been encountered in trying to reconcile them. A condition for consistency between these two generalized K-Ld-forms by recourse to the additive duality of Tsallis statistics is derived. It is also shown that the usual generalized K-Ld subjected to this additive duality, known as the dual generalized K-Ld, is a scaled Bregman divergence. This leads to an interesting conclusion: the dual generalized mutual information is a scaled Bregman information. The utility and implications of these results are discussed.
