The package High-dimensional Metrics (hdm) is an evolving collection of statistical methods for estimation and quantification of uncertainty in high-dimensional approximately sparse models. It focuses on providing confidence intervals and significance testing for (possibly many) low-dimensional subcomponents of the high-dimensional parameter vector. Efficient estimators and uniformly valid confidence intervals are provided for regression coefficients on target variables (e.g., treatment or policy variable) in a high-dimensional approximately sparse regression model, for the average treatment effect (ATE) and the average treatment effect for the treated (ATET), as well as for extensions of these parameters to the endogenous setting. Theory-grounded, data-driven methods for selecting the penalization parameter in Lasso regressions under heteroscedastic and non-Gaussian errors are implemented. Moreover, joint/simultaneous confidence intervals for regression coefficients of a high-dimensional sparse regression are implemented, including a joint significance test for Lasso regression. Data sets which have been used in the literature and might be useful for classroom demonstration and for testing new estimators are included. R and the package hdm are open-source software projects and can be freely downloaded from CRAN: http://cran.r-project.org.
Introduction

Analysis of high-dimensional models, models in which the number of parameters to be estimated is large relative to the sample size, is becoming increasingly important. Such models arise naturally in modern data sets which have many measured characteristics available per individual observation as in, for example, population census data, scanner data, and text data. Such models also arise naturally even in data with a small number of measured characteristics in situations where the exact functional form with which the observed variables enter the model is unknown and we create many technical variables, a dictionary, from the raw characteristics. Examples covered by this scenario include semiparametric models with nonparametric nuisance functions. More generally, models with many parameters relative to the sample size often arise when attempting to model complex phenomena. With increasing availability of such data sets in economics and other data science fields, new methods for analyzing those data have been developed. The R package hdm contains implementations of recently developed methods for high-dimensional approximately sparse models, mainly relying on forms of lasso and post-lasso as well as related estimation and inference methods. The methods are illustrated with econometric applications, but are also useful in other disciplines such as medicine, biology, sociology or psychology.

The methods which are implemented in this package are distinct from already available methods in other packages in the following four major ways: 1) First, we provide a version of Lasso regression that expressly handles and allows for non-Gaussian and heteroscedastic er...
Eighty subjects estimated the correlation coefficient, r, for each of 13 computer-printed scatterplots. The judges were 46 students in a graduate-level statistics course and 34 faculty and graduate students in a department of psychology. The actual correlation values ranged from .010 to .995, with 200 observations in each scatterplot and with the order of scatterplot presentation randomized. As predicted, subjects underestimated the degree of actual correlation. Also as predicted, but with substantial moderation by a method-of-presentation factor, this underestimation was most pronounced in the middle of the correlational range, between the 0 and 1 extremes. Though perception of correlation was shown not to be veridical (i.e., in terms of r), little support was given to one alternative view, namely that it is in terms of r².
Accurate and efficient plasma models are essential to understand and control experimental devices. Existing magnetohydrodynamic or kinetic models are nonlinear, computationally intensive, and can be difficult to interpret, while often only approximating the true dynamics. In this work, data-driven techniques recently developed in the field of fluid dynamics are leveraged to develop interpretable reduced-order models of plasmas that strike a balance between accuracy and efficiency. In particular, dynamic mode decomposition (DMD) is used to extract spatio-temporal magnetic coherent structures from the experimental and simulation datasets of the HIT-SI experiment. Three-dimensional magnetic surface probes from the HIT-SI experiment are analyzed, along with companion simulations with synthetic internal magnetic probes. A number of leading variants of the DMD algorithm are compared, including the sparsity-promoting and optimized DMD. Optimized DMD results in the highest overall prediction accuracy, while sparsity-promoting DMD yields physically interpretable models that avoid overfitting. These DMD algorithms uncover several coherent magnetic modes that provide new physical insights into the inner plasma structure. These modes were subsequently used to discover a previously unobserved three-dimensional structure in the simulation, rotating at the second injector harmonic. Finally, using data from probes at experimentally accessible locations, DMD identifies a resistive kink mode, a ubiquitous instability seen in magnetized plasmas.
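To make the idea of dynamic mode decomposition concrete, the following is a minimal sketch of the standard "exact" DMD algorithm in NumPy. It is not the optimized or sparsity-promoting variant compared in the abstract above, and it runs on a synthetic signal rather than HIT-SI probe data; the function name `exact_dmd`, the test signal, and the chosen rank are all illustrative assumptions.

```python
import numpy as np


def exact_dmd(snapshots, rank):
    """Standard exact DMD (illustrative sketch, hypothetical helper).

    snapshots: array of shape (n_features, n_times), uniformly sampled in time.
    rank: truncation rank for the SVD of the first snapshot matrix.
    Returns the discrete-time eigenvalues and the corresponding spatial modes.
    """
    # Pair each snapshot with its successor one time step later.
    X1, X2 = snapshots[:, :-1], snapshots[:, 1:]

    # Truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U_r, s_r, V_r = U[:, :rank], s[:rank], Vh[:rank].conj().T

    # B = X2 V Sigma^{-1}; dividing by s_r rescales each column.
    B = (X2 @ V_r) / s_r

    # Best-fit linear one-step map, projected onto the leading subspace.
    A_tilde = U_r.conj().T @ B

    # Eigenvalues give growth/decay and oscillation; modes give spatial structure.
    eigvals, W = np.linalg.eig(A_tilde)
    modes = B @ W
    return eigvals, modes


# Synthetic data (an assumption for illustration): one structure oscillating
# at angular frequency 2 and one structure decaying at rate 0.3.
dt = 0.1
t = np.arange(200) * dt
x = np.linspace(0.0, 1.0, 20)
data = (
    np.outer(np.sin(np.pi * x), np.cos(2.0 * t))
    + np.outer(np.cos(np.pi * x), np.sin(2.0 * t))
    + np.outer(x**2, np.exp(-0.3 * t))
)

eigvals, modes = exact_dmd(data, rank=3)

# Convert discrete-time eigenvalues to continuous-time growth rates (real part)
# and angular frequencies (imaginary part).
omega = np.log(eigvals.astype(complex)) / dt
```

Because this synthetic signal is generated by exactly linear, rank-3 dynamics, the real parts of `omega` should come out close to 0, 0 and -0.3, and the imaginary parts close to ±2 and 0. Real probe data is noisy, which is one reason variants such as optimized DMD are preferred in practice.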
A mechanism for steady inductive helicity injection (SIHI) current drive has been discovered where the current-driving fluctuations are not generated by the plasma but rather are imposed by the injectors. Sheared flow of the electron fluid distorts the imposed fluctuations to drive current. The model accurately predicts the time-dependent toroidal current, the injector impedance scaling, and the profile produced in the HIT-SI experiment. These results show that a stable equilibrium can be efficiently sustained with imposed fluctuations and the current profile can, in principle, be controlled. Both are large steps for controlled fusion. Some of the effects of the fluctuations on the confinement of tokamak and spheromak reactors are assessed, and the degradation may be tolerable. The mechanism is also of interest to plasma self-organization, fast reconnection, and plasma physics in general.