This paper takes a broad, pragmatic view of statistical inference to include all aspects of model formulation. The estimation of model parameters traditionally assumes that a model has a prespecified, known form and takes no account of possible uncertainty regarding the model structure. This implicitly assumes the existence of a 'true' model, which many would regard as a fiction. In practice, model uncertainty is a fact of life and is likely to be more serious than other sources of uncertainty which have received far more attention from statisticians. This is true whether the model is specified on subject-matter grounds or, as is increasingly the case, when a model is formulated, fitted and checked on the same data set in an iterative, interactive way. Modern computing power allows a large number of models to be considered, and data-dependent specification searches have become the norm in many areas of statistics. The term data mining may be used in this context when the analyst goes to great lengths to obtain a good fit. This paper reviews the effects of model uncertainty, such as prediction intervals that are too narrow and the non-trivial biases in parameter estimates which can follow data-based modelling. Ways of assessing and overcoming the effects of model uncertainty are discussed, including the use of simulation and resampling methods, a Bayesian model averaging approach and the collection of additional data wherever possible. Perhaps the main aim of the paper is to ensure that statisticians are aware of the problems and start addressing the issues, even if there is no simple, general theoretical fix.
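The distorting effect of data-based model selection is easy to reproduce by simulation. The sketch below is mine, not the paper's, and every parameter value in it is illustrative: the response is pure noise, yet choosing the most correlated of ten candidate regressors on the same data and then applying a naive t-test that ignores the selection step makes the nominal 5% test reject far more often than 5% of the time.

```python
import numpy as np

# Monte Carlo sketch of selection bias (illustrative parameters throughout):
# y is pure noise, so no regressor is truly related to it, but picking the
# best of ten candidates on the same data inflates the apparent significance.
rng = np.random.default_rng(42)
n, n_candidates, n_sims = 50, 10, 2000
crit = 2.01  # approximate two-sided 5% critical value of t with 48 df

reject = 0
for _ in range(n_sims):
    y = rng.standard_normal(n)
    X = rng.standard_normal((n, n_candidates))
    # "Data mining" step: keep the regressor most correlated with y.
    r = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_candidates)])
    x = X[:, np.argmax(r)]
    # Naive t statistic for the selected slope, ignoring the selection step.
    b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    resid = y - y.mean() - b * (x - x.mean())
    se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum((x - x.mean()) ** 2))
    reject += abs(b / se) > crit

print(reject / n_sims)  # well above the nominal 0.05
```

With ten independent candidates the chance that at least one naive t statistic exceeds the 5% critical value is roughly 1 - 0.95^10, around 40%, which is the kind of over-rejection the simulation displays.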
This case-study fits a variety of neural network (NN) models to the well-known airline data and compares the resulting forecasts with those obtained from the Box–Jenkins and Holt–Winters methods. Many potential problems in fitting NN models were revealed, such as the possibility that the fitting routine may not converge or may converge to a local minimum. Moreover, it was found that an NN model which fits well may give poor out-of-sample forecasts. Thus we think it is unwise to apply NN models blindly in 'black box' mode, as has sometimes been suggested. Rather, the wise analyst needs to use traditional modelling skills to select a good NN model, e.g. to select appropriate lagged variables as the 'inputs'. The Bayesian information criterion is preferred to Akaike's information criterion for comparing different models. Methods of examining the response surface implied by an NN model are considered and compared with the results of alternative nonparametric procedures using generalized additive models and projection pursuit regression. The latter imposes less structure on the model and is arguably easier to understand.
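The preference for BIC over AIC can be made concrete with a small sketch (mine, not the paper's; the AR-by-least-squares setup and all parameter values are illustrative). Both criteria combine the same goodness-of-fit term with a complexity penalty, but BIC's penalty k·log(n) grows with the sample size while AIC's stays at 2k, so BIC leans towards the more parsimonious model.

```python
import numpy as np

def ar_criteria(y, max_lag):
    """Fit AR(p) models by ordinary least squares for p = 1..max_lag and
    return (aic, bic) lists computed as n*log(RSS/n) + penalty, using the
    same effective sample (the observations after the first max_lag) for
    every order so the criteria are directly comparable."""
    y = np.asarray(y, dtype=float)
    n_eff = len(y) - max_lag
    target = y[max_lag:]
    aic, bic = [], []
    for p in range(1, max_lag + 1):
        # Design matrix: intercept plus lags 1..p, aligned with the target.
        X = np.column_stack([y[max_lag - j:len(y) - j] for j in range(1, p + 1)])
        X = np.column_stack([np.ones(n_eff), X])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        rss = np.sum((target - X @ coef) ** 2)
        k = p + 1  # AR coefficients plus the intercept
        aic.append(n_eff * np.log(rss / n_eff) + 2 * k)
        bic.append(n_eff * np.log(rss / n_eff) + k * np.log(n_eff))
    return aic, bic
```

Because BIC's extra penalty k(log n - 2) increases with the number of parameters whenever n exceeds e² ≈ 7.4, the order minimising BIC can never exceed the order minimising AIC on the same fits.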
Summary
The Holt–Winters forecasting procedure is a simple, widely used projection method which can cope with trend and seasonal variation. However, empirical studies have tended to show that the method is not as accurate on average as the more complicated Box–Jenkins procedure. This paper points out that these empirical studies have used the automatic version of the method, whereas a non-automatic version is also possible in which subjective judgement is employed, for example, to choose the correct model for seasonality. The paper re-analyses seven series from the Newbold–Granger study for which Box–Jenkins forecasts were reported to be much superior to the (automatic) Holt–Winters forecasts. The series do not appear to have any common properties, but it is shown that the automatic Holt–Winters forecasts can often be improved by subjective modifications. It is argued that a fairer comparison would be between Box–Jenkins and a non-automatic version of Holt–Winters. Some general recommendations are made concerning the choice of a univariate forecasting procedure. The paper also makes suggestions regarding the implementation of the Holt–Winters procedure, including the choice of starting values.
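For reference, the additive-seasonal Holt–Winters recursions can be sketched as follows. This is a minimal illustration only: the smoothing constants and the starting-value convention below are my assumptions, not the recommendations made in the paper.

```python
def holt_winters_additive(y, m, alpha=0.2, beta=0.1, gamma=0.3, h=1):
    """Additive Holt-Winters: one pass of the level/trend/seasonal
    recursions over y, then h-step-ahead forecasts.

    y: observations (needs at least two full seasons), m: season length.
    Starting values use a common convention (level = mean of the first
    season, trend = averaged difference between the first two seasons,
    seasonals = first-season deviations from the level); the paper's own
    suggestions for starting values may differ.
    """
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        last_level = level
        # Update level, trend and the current seasonal index in turn.
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]

    n = len(y)
    # k-step-ahead forecast: current level plus k trend steps plus the
    # most recent seasonal estimate for that calendar position.
    return [level + k * trend + season[(n + k - 1) % m] for k in range(1, h + 1)]
```

On a series that is exactly linear trend plus a fixed seasonal pattern, the recursions converge towards the true components and the forecasts track the series closely.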