A mutual-information-based upper bound on the generalization error of a supervised learning algorithm is derived in this paper. The bound is constructed in terms of the mutual information between each individual training sample and the output of the learning algorithm; it requires weaker conditions on the loss function than existing bounds, while providing a tighter characterization of the generalization error. Examples further demonstrate that the bound derived in this paper is tighter and has a broader range of applicability. Application to noisy and iterative algorithms, e.g., stochastic gradient Langevin dynamics (SGLD), is also studied, where the constructed bound again yields a tighter characterization of the generalization error than existing results.
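A bound of the advertised individual-sample form can be sketched as follows (the notation is assumed, not taken from the abstract: W is the algorithm output, S = (Z_1, ..., Z_n) the training set, and the loss is taken to be σ-sub-Gaussian under the data distribution):

```latex
% Sketch of an individual-sample mutual information bound (assumed notation):
% gen(W, S) denotes the generalization error of output W on training set S.
\left|\,\mathbb{E}\!\left[\mathrm{gen}(W,S)\right]\right|
  \;\le\; \frac{1}{n}\sum_{i=1}^{n}\sqrt{2\sigma^{2}\, I(W;Z_i)}
```

Under this form, the right-hand side is never larger than the classical full-sample bound \(\sqrt{2\sigma^{2} I(W;S)/n}\): for independent samples \(\sum_i I(W;Z_i)\le I(W;S)\), and concavity of the square root then gives \(\frac{1}{n}\sum_i\sqrt{I(W;Z_i)}\le\sqrt{I(W;S)/n}\), which is one way the tighter characterization claimed above can arise.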
The problem of estimating the Kullback-Leibler divergence D(P‖Q) between two unknown distributions P and Q is studied, under the assumption that the alphabet size k of the distributions can scale to infinity. The estimation is based on m independent samples drawn from P and n independent samples drawn from Q. It is first shown that no consistent estimator can guarantee asymptotically small worst-case quadratic risk over the set of all pairs of distributions. A restricted set containing pairs of distributions with density ratio bounded by a function f(k) is therefore considered. An augmented plug-in estimator is proposed, and its worst-case quadratic risk is shown to be within a constant factor of (k/m + kf(k)/n)^2 + (log^2 f(k))/m + f(k)/n, if m and n exceed a constant factor of k and kf(k), respectively. Moreover, the minimax quadratic risk is characterized to be within a constant factor of (k/(m log k) + kf(k)/(n log k))^2 + (log^2 f(k))/m + f(k)/n, if m and n exceed a constant factor of k/log k and kf(k)/log k, respectively. The lower bound on the minimax quadratic risk is established by employing a generalized Le Cam's method. A minimax optimal estimator is then constructed by combining the polynomial approximation and plug-in approaches.
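A minimal sketch of a smoothed plug-in estimator illustrates the approach: estimate P by its empirical distribution and smooth the empirical distribution of Q so the log-ratio stays finite. The add-c smoothing below is an illustrative stand-in; the augmented plug-in estimator described above may use a different augmentation.

```python
import numpy as np

def plugin_kl(samples_p, samples_q, k, c=1.0):
    """Smoothed plug-in estimate of D(P||Q) over the alphabet {0, ..., k-1}.

    Add-c smoothing on Q's empirical distribution keeps every q_hat[i] > 0,
    so the log-ratio is finite even for symbols unseen among the Q samples.
    """
    m, n = len(samples_p), len(samples_q)
    p_hat = np.bincount(samples_p, minlength=k) / m
    q_hat = (np.bincount(samples_q, minlength=k) + c) / (n + c * k)
    mask = p_hat > 0  # 0 * log(0 / q) = 0 by convention
    return float(np.sum(p_hat[mask] * np.log(p_hat[mask] / q_hat[mask])))
```

With identical sample sets the estimate is zero, and it grows as the empirical distributions separate; its bias for rarely observed symbols is what the polynomial-approximation step of the minimax optimal estimator is designed to correct.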
The shocks that underlie China's comparatively rapid growth include gains in productivity, factor accumulation and policy reforms that increase allocative efficiency. The well-known Balassa-Samuelson hypothesis links productivity growth in tradable industries with real appreciations. Yet it relies heavily on the law of one price applying for tradable goods, against which there is now considerable evidence. In its absence, other growth shocks also affect the real exchange rate by influencing relative supply or demand for home product varieties. This paper investigates the preconditions for the Balassa-Samuelson hypothesis to predict a real appreciation in the Chinese case. It then quantifies the links between all growth shocks and the Chinese real exchange rate using a dynamic model of the global economy with open capital accounts and full demographic underpinnings to labour supply. The results suggest that financial capital inflows most affect the real exchange rate in the short term, while differential productivity growth dominates in the medium term. Contrary to expectation, in the long term demographic forces prove to be weak relative to changes in the skill composition of the labour force, which enhance services sector performance and depreciate the real exchange rate.
Aspiculuris tianjinensis sp. nov., recovered from the intestine of Clethrionomys rufocanus from Tianjin, China, is described and illustrated using light microscopy and scanning electron microscopy. The new species differs from congeners in the shape of the cervical alae, and in the number and arrangement of caudal papillae.
The problem of universal outlying sequence detection is studied, where the goal is to detect outlying sequences among M sequences of samples. A sequence is considered outlying if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences. In the universal setting, we are interested in identifying all the outlying sequences without knowing the underlying generating distributions. In this paper, a class of tests based on distribution clustering is proposed. These tests are shown to be exponentially consistent with linear time complexity in M. Numerical results demonstrate that our clustering-based tests achieve performance similar to existing tests, while being considerably more computationally efficient.
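The distribution-clustering idea can be illustrated with a deliberately simplified sketch (hypothetical, not the paper's test): form each sequence's empirical distribution, take a robust center as a proxy for the majority's generating distribution, and flag sequences whose distributions sit far from it.

```python
import numpy as np

def flag_outliers(sequences, k, threshold=0.3):
    """Flag outlying sequences by clustering empirical distributions.

    Hypothetical sketch: each sequence's empirical pmf over {0, ..., k-1}
    is compared, in total-variation distance, to the coordinate-wise
    median of all pmfs, which tracks the majority's distribution as long
    as outliers are a minority. Sequences farther than `threshold` are
    declared outlying. Runs in time linear in the number of sequences.
    """
    pmfs = np.array([np.bincount(s, minlength=k) / len(s) for s in sequences])
    center = np.median(pmfs, axis=0)               # robust proxy for majority pmf
    tv = 0.5 * np.abs(pmfs - center).sum(axis=1)   # total-variation distances
    return [i for i, d in enumerate(tv) if d > threshold]
```

The tests proposed above are more refined (and come with exponential-consistency guarantees), but they share this structure: cluster the empirical distributions and declare the minority cluster outlying.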
A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The stochastic optimization problems arising in these machine learning problems are solved using algorithms such as stochastic gradient descent (SGD). A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples at each time step, to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and that of the exact minimizer, does not exceed a target level. A bound is developed to show that the estimate of the change in the minimizers is nontrivial provided that the excess risk is small enough. Extensions relevant to the machine learning setting are considered, including a cost-based approach to selecting the number of samples under a cost budget over a fixed horizon, and an approach to applying cross-validation for model selection. Finally, experiments with synthetic and real data are used to validate the algorithms.
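The sample-size selection step can be sketched under a simple assumption (hypothetical; the paper's rule is based on its own estimates and bounds): if the optimization error of SGD after processing n samples is bounded by C/sqrt(n), and the minimizer may have drifted by a known amount since the last step, then the smallest n meeting an excess-risk target follows by inverting the bound.

```python
import math

def samples_needed(target_risk, risk_constant=1.0, drift=0.0):
    """Hypothetical per-step sample-size rule.

    Assumes the excess risk after n samples is bounded by
    risk_constant / sqrt(n) plus an estimated minimizer drift `drift`
    carried over from the previous time step. Returns the smallest n
    for which this (assumed) bound stays below target_risk.
    """
    budget = target_risk - drift
    if budget <= 0:
        raise ValueError("target risk must exceed the estimated drift")
    return math.ceil((risk_constant / budget) ** 2)
```

For example, halving the available budget (because of drift) quadruples the number of samples required under this 1/sqrt(n) assumption, which is the qualitative trade-off the adaptive scheme described above manages over time.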