We show that several machine learning estimators, including square-root LASSO (Least Absolute Shrinkage and Selection Operator) and regularized logistic regression, can be represented as solutions to distributionally robust optimization (DRO) problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as the result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (Robust Wasserstein Profile Inference), a novel inference methodology that extends methods inspired by Empirical Likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions and, as a consequence, we are able to choose regularization parameters for these machine learning estimators without the use of cross-validation. Numerical experiments are also given to validate our theoretical findings.
(A1) Management Science and Engineering, Stanford University
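To make the link between regularization and uncertainty size concrete, here is a minimal sketch of the square-root LASSO objective. The abstract does not state the closed form, so the mapping used in `worst_case_squared_loss` (regularization weight equal to the square root of the uncertainty radius `delta`, with the worst-case squared loss equal to the squared objective) is an assumption about the kind of representation the paper describes, not a quotation of its theorem. All function and variable names are hypothetical.

```python
import math


def sqrt_lasso_objective(beta, X, y, lam):
    """Root-mean-square residual plus lam times the l1 norm of beta."""
    n = len(y)
    # Residual of the linear prediction for each observation.
    residuals = [
        yi - sum(b * xij for b, xij in zip(beta, xi))
        for xi, yi in zip(X, y)
    ]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    l1_norm = sum(abs(b) for b in beta)
    return rmse + lam * l1_norm


def worst_case_squared_loss(beta, X, y, delta):
    """ASSUMPTION: worst-case expected squared loss over an uncertainty
    region of radius delta, taken to equal the squared square-root LASSO
    objective with lam = sqrt(delta). This form is not given in the abstract."""
    return sqrt_lasso_objective(beta, X, y, math.sqrt(delta)) ** 2


if __name__ == "__main__":
    # Tiny illustrative data set (made up for this sketch).
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [1.0, 1.0]
    beta = [1.0, 0.0]
    print(sqrt_lasso_objective(beta, X, y, lam=0.5))
    print(worst_case_squared_loss(beta, X, y, delta=0.25))
```

Under this reading, choosing the radius `delta` (the task the abstract assigns to RWPI) is the same as choosing the regularization weight, which is why no cross-validation step appears in the sketch.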
We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. Our proposed method reduces generalization error by using the unlabeled data to restrict the support of the worst-case distribution in our DRO formulation. We make the DRO formulation practical by proposing a stochastic gradient descent algorithm that makes the training procedure easy to implement. We demonstrate that our semi-supervised DRO method improves generalization error over natural supervised procedures and state-of-the-art SSL estimators. Finally, we discuss the large-sample behavior of the optimal uncertainty region in the DRO formulation. Our discussion exposes important aspects such as the role of dimension reduction in SSL.
China is home to three subspecies of tiger Panthera tigris, but there are no estimates of the size of any of the populations. We detected a population of the Endangered Amur tiger Panthera tigris altaica in Hunchun Nature Reserve in Jilin Province using both mitochondrial DNA and nuclear microsatellite loci. Four male tigers and one female tiger were detected, indicating the potential for a small breeding group. However, genetic diversity was low overall, with six loci showing a heterozygote deficiency and a mean of . alleles per locus. This study is the first estimate of the wild Amur tiger population in China to use non-invasive techniques, and the presence of a female tiger indicates that this is a potentially viable population. We provide baseline genetic diversity estimates to support monitoring of the population. The small number of tiger scats located indicates the importance of continuing the current conservation efforts for this tiger subspecies in Hunchun Nature Reserve. Such efforts include reducing poaching of tigers and their prey, and implementing management plans to encourage the persistence and recovery of tigers in this area.
Post-fire succession is an ideal case for studying the effects of disturbance on community assembly. The key is to disentangle the contributions of assembly processes to the variation in community composition, namely beta diversity, and the scales at which they act. The central Yunnan Plateau of Southwest China is characterized by monsoon-related seasonal drought and frequent forest fires. We sampled five fire sites burned in different years and a middle-aged forest, and measured species composition dissimilarity and its species turnover and nestedness components, within each fire site and across all sites. Results indicated that species turnover was the primary component of beta diversity within all communities. Beta diversity showed no trend with year-since-fire (YSF) among early post-fire communities, but it was significantly higher in the middle-aged community. Species turnover patterns across fire sites revealed a weak dispersal limitation effect, which was stronger at the lower than at the upper slope position for woody plants, and the reverse for herbs. At the site scale, species dissimilarity and turnover both increased with increasing slope position difference, especially in the middle-aged community, but species nestedness had no consistent trend among sites, except for a decreasing trend in the middle-aged forest. (Partial) Mantel tests indicated that habitat filtering [primarily total nitrogen (TN) and slope position] played a much stronger role than dispersal limitation and YSF (indicating competition intensity) in post-fire forest assembly at the landscape scale, for both woody and herbaceous layers. However, at the site scale, Mantel tests indicated a diminishing effect of soil nutrient filtering with increasing YSF, while the effects of topography and spatial distance were stronger in the middle-aged community. This divergence suggests that the primary assembly mechanism gradually shifts away from the soil constraint.
While the seasonal drought and the mountain topography dominate the environmental legacy, our results imply that fires may reinforce a priority effect in forest assembly in this region, by creating a habitat filtering effect (e.g., moisture and nitrogen limitation) on species composition in post-fire communities.
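The split of compositional dissimilarity into turnover and nestedness components can be sketched as follows. The abstract does not say which index family was used, so this example assumes the widely used Sørensen-based pairwise partition (total dissimilarity = Simpson turnover + nestedness-resultant component) computed from presence/absence data. The function name and the species labels are hypothetical.

```python
def beta_partition(site_a, site_b):
    """Pairwise beta diversity from two species lists (presence/absence).

    ASSUMPTION: Sorensen-based partition, where
      total      = (b + c) / (2a + b + c)
      turnover   = min(b, c) / (a + min(b, c))
      nestedness = total - turnover
    with a = shared species, b and c = species unique to each site.
    """
    species_a, species_b = set(site_a), set(site_b)
    if not species_a or not species_b:
        raise ValueError("both sites must contain at least one species")

    shared = len(species_a & species_b)
    only_a = len(species_a - species_b)
    only_b = len(species_b - species_a)
    fewer_unique = min(only_a, only_b)

    total = (only_a + only_b) / (2 * shared + only_a + only_b)
    turnover = fewer_unique / (shared + fewer_unique)
    nestedness = total - turnover
    return {"total": total, "turnover": turnover, "nestedness": nestedness}


if __name__ == "__main__":
    # Made-up plots: the second plot replaces some species of the first.
    plot_1 = ["sp1", "sp2", "sp3", "sp4"]
    plot_2 = ["sp1", "sp2", "sp5"]
    print(beta_partition(plot_1, plot_2))

    # Made-up plots: the second plot is a strict subset of the first,
    # so all dissimilarity comes from nestedness and turnover is zero.
    print(beta_partition(plot_1, ["sp1", "sp2"]))
```

A result in which turnover accounts for most of the total, as the abstract reports within all communities, means that sites differ mainly by species replacement rather than by one site holding a subset of another's species.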