A class of variable selection procedures for parametric models via nonconcave
penalized likelihood was proposed by Fan and Li to simultaneously estimate
parameters and select important variables. They demonstrated that this class of
procedures has an oracle property when the number of parameters is finite.
However, in most model selection problems the number of parameters should be
large and grow with the sample size. In this paper some asymptotic properties
of the nonconcave penalized likelihood are established for situations in which
the number of parameters tends to infinity as the sample size increases.
Under regularity conditions we have established an oracle property and the
asymptotic normality of the penalized likelihood estimators. Furthermore, the
consistency of the sandwich formula of the covariance matrix is demonstrated.
Nonconcave penalized likelihood ratio statistics are discussed, and their
asymptotic distributions under the null hypothesis are obtained by imposing
some mild conditions on the penalty functions.
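As a concrete illustration of a nonconcave penalty of the kind this abstract refers to, the sketch below evaluates the SCAD penalty introduced by Fan and Li and its derivative. The function names `scad_penalty` and `scad_derivative` are illustrative only, and the default `a = 3.7` is the value Fan and Li suggest; the abstract itself does not single out a particular penalty, so this is an example and not the paper's procedure.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|); requires lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        # Near zero: behaves like the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region between L1 and constant.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Large coefficients: constant penalty, so no extra shrinkage.
    return (a + 1.0) * lam * lam / 2.0


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


if __name__ == "__main__":
    for value in (0.5, 1.0, 2.0, 3.7, 10.0):
        print(value, scad_penalty(value, 1.0), scad_derivative(value, 1.0))
```

The derivative vanishes once |theta| exceeds a * lam, which is why large coefficients are left nearly unbiased while small ones are shrunk toward zero.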
Independence screening is a variable selection method that uses a ranking
criterion to select significant variables, particularly for statistical models
with nonpolynomial dimensionality or "large p, small n" paradigms when p can be
as large as an exponential of the sample size n. In this paper we propose a
robust rank correlation screening (RRCS) method to deal with ultra-high
dimensional data. The new procedure is based on the Kendall τ correlation
coefficient between response and predictor variables rather than the Pearson
correlation of existing methods. The new method has four desirable features
compared with existing independence screening methods. First, the sure
independence screening property holds under only the existence of a
second-order moment of the predictor variables, rather than exponential tails
or similar conditions, even when the number of predictor variables grows
exponentially with the sample size. Second, it can be used to deal with
semiparametric models such as transformation regression models and single-index
models under a monotonicity constraint on the link function, without involving
nonparametric estimation even when there are nonparametric functions in the
models. Third, the procedure is robust against outliers and influential
points in the observations. Last, the use of indicator functions in rank
correlation screening greatly simplifies the theoretical derivation due to the
boundedness of the resulting statistics, compared with previous studies on
variable screening. Simulations are carried out for comparisons with existing
methods and a real data example is analyzed.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org) at
http://dx.doi.org/10.1214/12-AOS1024. arXiv admin note: text overlap with
arXiv:0903.525
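The rank-based screening idea described in the abstract above can be sketched as follows: compute the Kendall τ between the response and each predictor, then keep the predictors with the largest absolute values. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `kendall_tau` and `rank_screen`, the tie-free tau-a form of the statistic, and the rule of keeping a fixed number `d` of predictors are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            score += 1      # concordant pair
        elif prod < 0:
            score -= 1      # discordant pair
    return 2.0 * score / (n * (n - 1))


def rank_screen(columns, y, d):
    """Return indices of the d predictors with the largest |tau| (assumed rule)."""
    scores = [abs(kendall_tau(col, y)) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: -scores[j])
    return sorted(order[:d])


if __name__ == "__main__":
    y = [1, 2, 3, 4, 5, 100]            # last response is an outlier
    x0 = [1, 2, 3, 4, 5, 6]             # monotone increasing in y
    x1 = [2, 1, 2, 1, 2, 1]             # weakly related to y
    x2 = [6, 5, 4, 3, 2, 1]             # monotone decreasing in y
    print(rank_screen([x0, x1, x2], y, d=2))
```

Because the statistic depends only on the ordering of the observations, the outlying response value and any strictly increasing transformation of the response leave the scores unchanged, which is the intuition behind the robustness and transformation-model claims in the abstract.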
Ultra-high dimensional longitudinal data are increasingly common and the analysis is challenging both theoretically and methodologically. We offer a new automatic procedure for finding a sparse semivarying coefficient model, which is widely accepted for longitudinal data analysis. Our proposed method first reduces the number of covariates to a moderate order by employing a screening procedure, and then identifies both the varying and constant coefficients using a group SCAD estimator, which is subsequently refined by accounting for the within-subject correlation. The screening procedure is based on working independence and B-spline marginal models. Under weaker conditions than those in the literature, we show that with high probability only irrelevant variables will be screened out, and the number of selected variables can be bounded by a moderate order. This allows the desirable sparsity and oracle properties of the subsequent structure identification step. Note that existing methods require some kind of iterative screening in order to achieve this, thus they demand heavy computational effort and consistency is not guaranteed. The refined semivarying coefficient model employs profile
This experiment was conducted to determine the effect of Bacillus subtilis natto fermentation product supplementation on blood metabolites, rumen fermentation and milk production and composition in early lactation dairy cows. Thirty-six multiparous Holstein cows (DIM = 29 ± 6 days, parity = 2.8 ± 1.1) were blocked by DIM and parity and then randomly assigned to three treatments (12 per treatment) in a 9-week trial. Cows in control, DFM1 and DFM2 were fed TMR diets supplemented with 0, 6 and 12 g of B. subtilis natto solid-state fermentation product per day per cow respectively. Plasma non-esterified fatty acids were lower (p = 0.03) in DFM1 and DFM2 compared with control cows (633 and 639 vs. 685 μM). Ruminal propionate increased (23.9 vs. 26.3 and 26.9/100 mol, control vs. DFM1 and DFM2 respectively) and acetate decreased (64.2 vs. 62.7 and 62.1/100 mol, control vs. DFM1 and DFM2 respectively) with increasing B. subtilis natto fermentation product supplementation. DMI of the cows in three treatments was not affected by B. subtilis natto fermentation product supplementation, but milk yield was 3.1 and 3.2 kg/day higher for DFM1 and DFM2 than that for control cows on average across the 9-week trial, and significant differences were observed during weeks 5-9 of the trial, which resulted in 9.5% and 11.7% increase in feed efficiency. B. subtilis natto fermentation product supplementation did not affect milk fat percentage and protein yield but increased (p < 0.05) milk fat yield and lactose percentage (p < 0.01) and tended to decrease protein percentage (p = 0.06). The findings show that B. subtilis natto fermentation product was effective in increasing lactation performance of early lactation dairy cows possibly by altering the rumen fermentation pattern without any negative effects on blood metabolites.
We develop a specification test for the transition density of a discretely sampled continuous-time jump-diffusion process, based on a comparison of a nonparametric estimate of the transition density or distribution function with their corresponding parametric counterparts assumed by the null hypothesis. As a special case, our method applies to pure diffusions. We provide a direct comparison of the two densities for an arbitrary specification of the null parametric model using three different discrepancy measures between the null and alternative transition density and distribution functions. We establish the asymptotic null distributions of proposed test statistics and compute their power functions. We investigate the finite-sample properties through simulations and compare them with those of other tests. This article has supplementary material online.