Pötscher (1991, Econometric Theory 7, 163–181) has recently considered the question of how the use of a model selection procedure affects the asymptotic distribution of parameter estimators and related statistics. An important potential application of such results is to the generation of confidence regions for the parameters of interest. It is demonstrated that a great deal of care must be exercised in any attempt at such an application. We also consider the effect of model selection on prediction regions. It is demonstrated that the use of asymptotic results for the construction of prediction regions requires the same sort of care as the use of such results for the construction of confidence regions.
We give a large-sample analysis of the minimal coverage probability of the usual confidence intervals for regression parameters when the underlying model is chosen by a "conservative" (or "overconsistent") model selection procedure. We derive an upper bound for the large-sample limit minimal coverage probability of such intervals that applies to a large class of model selection procedures including the Akaike information criterion as well as various pretesting procedures. This upper bound can be used as a safeguard to identify situations where the actual coverage probability can be far below the nominal level. We illustrate that the (asymptotic) upper bound can be statistically meaningful even in rather small samples.
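The coverage shortfall described above can be illustrated with a small Monte Carlo sketch. This is not the analysis of the cited paper; it is a toy setup under stated assumptions: the error variance is known, the full-model estimators of a parameter of interest (called `theta` here) and a nuisance parameter (`tau`) are bivariate normal with unit variances and correlation `rho`, and the model is chosen by a pretest of `tau = 0`. The function name `naive_coverage` and all parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def naive_coverage(tau, rho, alpha=0.05, reps=20000, seed=0):
    """Estimate the coverage of the naive post-pretest interval for theta.

    Assumed toy setup (not taken from the paper): known variance, and
    (theta_hat, tau_hat) bivariate normal with unit variances and
    correlation rho under the full model.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal quantile
    s = math.sqrt(1 - rho ** 2)  # sd of the restricted-model estimator
    theta = 0.0  # true value; coverage does not depend on it in this setup
    hits = 0
    for _ in range(reps):
        z2 = rng.gauss(0, 1)
        z1 = rho * z2 + s * rng.gauss(0, 1)  # corr(z1, z2) = rho
        theta_hat = theta + z1
        tau_hat = tau + z2
        if abs(tau_hat) > z:
            # Pretest rejects tau = 0: keep the full model.
            centre, half_width = theta_hat, z
        else:
            # Pretest accepts tau = 0: use the restricted model and
            # report its shorter interval as if no selection had occurred.
            centre, half_width = theta_hat - rho * tau_hat, z * s
        hits += abs(centre - theta) <= half_width
    return hits / reps


if __name__ == "__main__":
    for tau in (0.0, 1.0, 2.0, 3.0, 5.0):
        print(tau, naive_coverage(tau, rho=0.9))
```

In this sketch the estimated coverage depends strongly on `tau` when `rho` is large, which is why a bound on the minimal coverage over the parameter space is the relevant quantity rather than the coverage at any single parameter value.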
We develop an approach to evaluating frequentist model averaging procedures by considering them in a simple situation in which there are two nested linear regression models over which we average. We introduce a general class of model-averaged confidence intervals, obtain exact expressions for the coverage and the scaled expected length of the intervals, and use these to compute these quantities for the model-averaged profile likelihood (MPI) and model-averaged tail area confidence intervals proposed by D. Fletcher and D. Turek. We show that the MPI confidence intervals can perform more poorly than the standard confidence interval used after model selection but ignoring the model selection process. The model-averaged tail area confidence intervals perform better than the MPI and post-model-selection confidence intervals but, for the examples that we consider, offer little over simply using the standard confidence interval for θ under the full model, with the same nominal coverage.
We consider a linear regression model with regression parameter β = (β₁, ..., βₚ) and independent and identically N(0, σ²) distributed errors. Suppose that the parameter of interest is θ = aᵀβ where a is a specified vector. Define the parameter τ = cᵀβ − t where the vector c and the number t are specified and a and c are linearly independent. Also suppose that we have uncertain prior information that τ = 0. We present a new frequentist 1 − α confidence interval for θ that utilizes this prior information. We require this confidence interval to (a) have endpoints that are continuous functions of the data and (b) coincide with the standard 1 − α confidence interval when the data strongly contradict this prior information. This interval is optimal in the sense that it has minimum weighted average expected length where the largest weight is given to this expected length when τ = 0. This minimization leads to an interval that has the following desirable properties. This interval has expected length that (a) is relatively small when the prior information about τ is correct and (b) has a maximum value that is not too large. The following problem will be used to illustrate the application of this new confidence interval. Consider a 2 × 2 factorial experiment with 20 replicates. Suppose that the parameter of interest θ is a specified simple effect and that we have uncertain prior information that the two-factor interaction is zero. Our aim is to find a frequentist 0.95 confidence interval for θ that utilizes this prior information.
Consider X₁, X₂, ..., Xₙ that are independent and identically N(µ, σ²) distributed. Suppose that we have uncertain prior information that µ = 0. We answer the question: to what extent can a frequentist 1 − α confidence interval for µ utilize this prior information?