Rate distortion theory is concerned with optimally encoding signals from a given signal class $$\mathcal {S}$$ using a budget of R bits, as $$R \rightarrow \infty $$. We say that $$\mathcal {S}$$ can be compressed at rate $$s$$ if we can achieve an error of at most $$\mathcal {O}(R^{-s})$$ for encoding the given signal class; the supremal compression rate is denoted by $$s^*(\mathcal {S})$$.
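For orientation, one standard way to make these notions precise reads as follows (a sketch; the ambient space $$\mathcal {H}$$ with norm $$\Vert \cdot \Vert $$ and the exact technical conditions are assumptions for illustration, not quoted from the paper): a coding scheme of length R is a pair of maps
$$E_R :\mathcal {S} \rightarrow \{0,1\}^R, \qquad D_R :\{0,1\}^R \rightarrow \mathcal {H},$$
and $$\mathcal {S} \subseteq \mathcal {H}$$ is compressed at rate $$s$$ if, for a suitable sequence of such pairs,
$$\sup _{f \in \mathcal {S}} \Vert f - D_R(E_R(f)) \Vert = \mathcal {O}(R^{-s}) \quad \text {as } R \rightarrow \infty ;$$
the supremal compression rate $$s^*(\mathcal {S})$$ is then the supremum of all such $$s$$.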
Given a fixed coding scheme, there usually are some elements of $$\mathcal {S}$$ that are compressed at a higher rate than $$s^*(\mathcal {S})$$ by the given coding scheme; in this paper, we study the size of this set of signals. We show that for certain “nice” signal classes $$\mathcal {S}$$, a phase transition occurs: We construct a probability measure $$\mathbb {P}$$ on $$\mathcal {S}$$ such that for every coding scheme $$\mathcal {C}$$ and any $$s > s^*(\mathcal {S})$$, the set of signals encoded with error $$\mathcal {O}(R^{-s})$$ by $$\mathcal {C}$$ forms a $$\mathbb {P}$$-null-set. In particular, our results apply to all unit balls in Besov and Sobolev spaces that embed compactly into $$L^2 (\varOmega )$$ for a bounded Lipschitz domain $$\varOmega $$.
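To give one classical data point (added here for illustration, not a quotation from the paper): for the unit ball of the Sobolev space $$H^k(\varOmega )$$ with $$\varOmega \subseteq \mathbb {R}^d$$ a bounded Lipschitz domain, entropy-number estimates yield
$$s^*\big (\{f \in H^k(\varOmega ) : \Vert f\Vert _{H^k} \le 1\}\big ) = k/d,$$
so the phase transition described above occurs exactly at the smoothness-to-dimension ratio k/d.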
As an application, we show that several existing sharpness results concerning function approximation using deep neural networks are in fact generically sharp. In addition, we provide quantitative and non-asymptotic bounds on the probability that a random $$f\in \mathcal {S}$$ can be encoded to within accuracy $$\varepsilon $$ using R bits. This result is subsequently applied to the problem of approximately representing $$f\in \mathcal {S}$$ to within accuracy $$\varepsilon $$ by a (quantized) neural network with at most W nonzero weights. We show that for any $$s > s^*(\mathcal {S})$$ there are constants c, C such that, no matter what kind of “learning” procedure is used to produce such a network, the probability of success is bounded from above by $$\min \big \{1, 2^{C\cdot W \lceil \log _2 (1+W) \rceil ^2 - c\cdot \varepsilon ^{-1/s}} \big \}$$.
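Read contrapositively, this bound quantifies how large such a network must be (a direct rearrangement of the stated bound): for the success probability to reach, say, 1/2, the exponent must satisfy $$C\cdot W \lceil \log _2 (1+W) \rceil ^2 \ge c\cdot \varepsilon ^{-1/s} - 1$$, i.e.
$$W \gtrsim \frac {\varepsilon ^{-1/s}}{\lceil \log _2 (1+W) \rceil ^2},$$
so that, up to logarithmic factors, at least $$\varepsilon ^{-1/s}$$ nonzero weights are needed before any “learning” procedure has an appreciable chance of success.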