We propose a new technique, called wild binary segmentation (WBS), for consistent estimation of the number and locations of multiple change-points in data. We assume that the number of change-points can increase to infinity with the sample size. Due to a certain random localisation mechanism, WBS works even for very short spacings between the change-points and/or very small jump magnitudes, unlike standard binary segmentation. On the other hand, despite its use of localisation, WBS does not require the choice of a window or span parameter, and does not lead to a significant increase in computational complexity. WBS is also easy to code. We propose two stopping criteria for WBS: one based on thresholding and the other based on what we term the 'strengthened Schwarz information criterion'. We provide default recommended values of the parameters of the procedure and show that it offers very good practical performance in comparison with the state of the art. The WBS methodology is implemented in the R package wbs, available on CRAN. In addition, we provide a new proof of consistency of binary segmentation with improved rates of convergence, as well as a corresponding result for WBS.

Introduction. A posteriori change-point detection problems have been of interest to statisticians for many decades. Although, naturally, details vary, a theme common to many of them is as follows: a time-evolving quantity follows a certain stochastic model whose parameters are, exactly or approximately, piecewise constant. In such a model, it is of interest to detect the number of changes in the parameter values and the locations of the changes in time. Such piecewise-stationary modelling can be appealing for a number of reasons: the resulting model is usually much more flexible than the corresponding stationary model but still parametric if the number of change-points is fixed; the estimated change-points are often 'interpretable' in the sense that their locations can be linked to the behaviour of some exogenous quantities of interest; the last estimated segment can be viewed as the 'current' regime of stationarity, which can be useful in, for example, forecasting future values of the observed process. Finally, a posteriori segmentation can be a useful exploratory step in the construction of more complex models in which
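As a rough illustration of the random localisation idea, here is a minimal Python sketch of WBS with the thresholding stopping rule, for the canonical "piecewise-constant mean plus unit-variance noise" model. The CUSUM contrast is the standard one; the number of random intervals, the threshold constant 1.3 and all function names are illustrative assumptions and do not reflect the interface of the R package wbs, which should be used in practice.

```python
# A minimal sketch of wild binary segmentation (WBS) with the thresholding stopping
# rule, for the piecewise-constant "mean plus unit-variance noise" model.  All names
# (wbs, n_intervals, threshold) and the constant 1.3 are illustrative assumptions;
# they are not the interface of the R package 'wbs'.
import numpy as np

def cusum(x, s, e):
    """Argmax and maximum of the |CUSUM| contrast of x[s..e] over splits s <= b < e."""
    n = e - s + 1
    total = x[s:e + 1].sum()
    left = 0.0
    best_b, best_val = s, -np.inf
    for b in range(s, e):
        left += x[b]
        m = b - s + 1
        val = np.sqrt((n - m) / (n * m)) * left - np.sqrt(m / (n * (n - m))) * (total - left)
        if abs(val) > best_val:
            best_b, best_val = b, abs(val)
    return best_b, best_val

def wbs(x, n_intervals=1000, threshold=None, seed=None):
    """Estimated change-point locations (index of the last observation of each left segment)."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    if threshold is None:
        threshold = 1.3 * np.sqrt(2.0 * np.log(T))   # zeta_T = C * sqrt(2 log T), illustrative C
    rng = np.random.default_rng(seed)
    # Random localisation: draw many intervals once, uniformly over the whole sample.
    a = rng.integers(0, T, n_intervals)
    b = rng.integers(0, T, n_intervals)
    intervals = [(min(i, j), max(i, j)) for i, j in zip(a, b) if abs(int(i) - int(j)) >= 1]
    changepoints = []

    def segment(s, e):
        if e - s < 1:
            return
        # Candidate intervals fully contained in [s, e], plus [s, e] itself.
        cands = [(u, v) for (u, v) in intervals if s <= u and v <= e] + [(s, e)]
        stats = [cusum(x, u, v) for (u, v) in cands]
        k = max(range(len(stats)), key=lambda i: stats[i][1])
        b0, val = stats[k]
        if val > threshold:                          # stop once no contrast clears the threshold
            changepoints.append(b0)
            segment(s, b0)
            segment(b0 + 1, e)

    segment(0, T - 1)
    return sorted(changepoints)
```

The threshold C*sqrt(2 log T) in the sketch implicitly assumes unit-variance noise; in practice the data would first be scaled by a robust estimate of the noise level, or the number of change-points would be chosen via the 'strengthened Schwarz information criterion' instead of a threshold.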
Time series segmentation, also known as multiple change-point detection, is a well-established problem. However, few solutions are designed specifically for high-dimensional situations. In this paper, our interest is in segmenting the second-order structure of a high-dimensional time series. In a generic step of a binary segmentation algorithm for multivariate time series, one natural solution is to combine CUSUM statistics obtained from local periodograms and cross-periodograms of the components of the input time series. However, the standard "maximum" and "average" methods for doing so often fail in high dimensions when, for example, the change-points are sparse across the panel or the CUSUM statistics are spuriously large. In this paper, we propose the Sparsified Binary Segmentation (SBS) algorithm, which aggregates the CUSUM statistics by adding only those that pass a certain threshold. This "sparsifying" step reduces the impact of irrelevant, noisy contributions, which is particularly beneficial in high dimensions. In order to show the consistency of SBS, we introduce the multivariate Locally Stationary Wavelet model for time series, which is a separate contribution of this work.
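The "sparsifying" aggregation step can be illustrated in a few lines. The Python sketch below works on generic component series rather than the wavelet periodograms and cross-periodograms used by SBS proper, and the names (cusum_curve, sparsified_cusum, pi_T) are illustrative: per-component CUSUM curves are summed only where they exceed the threshold, so components carrying no change-point contribute nothing.

```python
# A minimal sketch of the "sparsified" aggregation step: per-component CUSUM curves
# are summed only where they exceed a threshold pi_T, rather than being combined by a
# pointwise maximum or a plain average.  Illustrative names and generic input series.
import numpy as np

def cusum_curve(x):
    """|CUSUM| contrast of a single series x at every candidate split b = 1, ..., n-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    csum = np.cumsum(x)
    b = np.arange(1, n)
    stat = (np.sqrt((n - b) / (n * b)) * csum[:-1]
            - np.sqrt(b / (n * (n - b))) * (csum[-1] - csum[:-1]))
    return np.abs(stat)

def sparsified_cusum(panel, pi_T):
    """panel: (d, n) array of d component series; returns the aggregated curve and best split."""
    panel = np.asarray(panel, dtype=float)
    agg = np.zeros(panel.shape[1] - 1)
    for comp in panel:
        c = cusum_curve(comp)
        agg += np.where(c > pi_T, c, 0.0)   # components below the threshold contribute nothing
    b_hat = int(np.argmax(agg)) + 1         # split after the first b_hat observations
    return agg, b_hat
```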
Summary. We propose a new, generic and flexible methodology for non-parametric function estimation, in which we first estimate the number and locations of any features that may be present in the function and then estimate the function parametrically between each pair of neighbouring detected features. Examples of features handled by our methodology include change points in the piecewise constant signal model, kinks in the piecewise linear signal model and other similar irregularities, which we also refer to as generalized change points. Our methodology works with only minor modifications across a range of generalized change point scenarios, and we achieve such a high degree of generality by proposing and using a new multiple generalized change point detection device, termed narrowest-over-threshold (NOT) detection. The key ingredient of the NOT method is its focus on the smallest local sections of the data on which the existence of a feature is suspected. For selected scenarios, we show the consistency and near optimality of the NOT algorithm in detecting the number and locations of generalized change points. The NOT estimators are easy to implement and rapid to compute. Importantly, the NOT approach is easy to extend by the user to tailor to their own needs. Our methodology is implemented in the R package not.
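A minimal sketch of the NOT selection rule in the piecewise-constant (change-point) scenario follows: among randomly drawn intervals whose CUSUM contrast clears a threshold zeta, the narrowest one is chosen and a change-point is placed at the maximiser of its contrast. The function names and the choice of contrast are illustrative assumptions; other generalized-change-point scenarios (e.g. kinks) use different contrast functions, and the R package not implements the full method.

```python
# A minimal sketch of the narrowest-over-threshold (NOT) selection rule in the
# piecewise-constant scenario: among randomly drawn intervals whose |CUSUM| contrast
# exceeds a threshold zeta, pick the *narrowest* one and place a change-point at the
# maximiser of its contrast.  Names (not_step, zeta) are illustrative.
import numpy as np

def max_cusum(x, s, e):
    """(argmax, max) of the |CUSUM| contrast of x[s..e] over splits within the interval."""
    seg = np.asarray(x[s:e + 1], dtype=float)
    n = len(seg)
    csum = np.cumsum(seg)
    b = np.arange(1, n)
    stat = np.abs(np.sqrt((n - b) / (n * b)) * csum[:-1]
                  - np.sqrt(b / (n * (n - b))) * (csum[-1] - csum[:-1]))
    k = int(np.argmax(stat))
    return s + k, stat[k]               # split after index s + k

def not_step(x, intervals, zeta):
    """One NOT step: change-point from the narrowest interval exceeding zeta, or None."""
    above = []
    for (s, e) in intervals:
        if e - s < 1:
            continue
        b0, val = max_cusum(x, s, e)
        if val > zeta:
            above.append((e - s, b0))   # (interval width, candidate change-point)
    return min(above)[1] if above else None
```

In the full algorithm this step is applied recursively to the data on either side of each detected point, until no interval exceeds the threshold.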
This article introduces a new method for the estimation of the intensity of an inhomogeneous one-dimensional Poisson process. The Haar-Fisz transformation converts a vector of binned Poisson counts to approximate normality with variance one, so any suitable Gaussian wavelet shrinkage method can then be used to estimate the Poisson intensity. Since the Haar-Fisz operator does not commute with the shift operator, we can dramatically improve accuracy by always cycle spinning before the Haar-Fisz transform, as well as optionally after. Extensive simulations show that our approach usually significantly outperformed state-of-the-art competitors, but was occasionally only comparable to them. Our method is fast, simple, automatic, and easy to code. Our technique is applied to the estimation of the intensity of earthquakes in northern California. We show that our technique gives visually similar results to the current state of the art.
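For concreteness, here is a minimal sketch of the forward Haar-Fisz transform for a count vector whose length is a power of two: the Haar detail coefficients are divided by the square roots of the corresponding smooth coefficients, and the vector is rebuilt from the normalised details. Cycle spinning (averaging over circular shifts) and the intermediate Gaussian wavelet shrinkage step are omitted, and the function name is illustrative.

```python
# A minimal sketch of the forward Haar-Fisz transform of a vector of binned Poisson
# counts (length assumed to be a power of two in this sketch).  Haar detail coefficients
# are divided by the square roots of the corresponding smooth coefficients, bringing the
# counts close to normality with variance one.  Cycle spinning and shrinkage are omitted.
import numpy as np

def haar_fisz(v):
    v = np.asarray(v, dtype=float)
    n = len(v)
    assert n > 0 and n & (n - 1) == 0, "this sketch assumes a power-of-two length"
    smooth, details = v.copy(), []
    while len(smooth) > 1:                     # Haar decomposition with Fisz normalisation
        even, odd = smooth[0::2], smooth[1::2]
        s = (even + odd) / 2.0                 # smooth (local mean) coefficients
        d = (even - odd) / 2.0                 # detail (local difference) coefficients
        f = np.where(s > 0, d / np.sqrt(np.where(s > 0, s, 1.0)), 0.0)
        details.append(f)
        smooth = s
    out = smooth                               # rebuild from the coarsest level upwards
    for f in reversed(details):
        up = np.empty(2 * len(out))
        up[0::2] = out + f
        up[1::2] = out - f
        out = up
    return out
```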
We investigate the time-varying ARCH (tvARCH) process. It is shown that it can be used to describe the slow decay of the sample autocorrelations of the squared returns often observed in financial time series, which warrants the further study of parameter estimation methods for the model. Since the parameters change over time, a successful estimator needs to perform well for small samples. We propose a kernel normalized-least-squares (kernel-NLS) estimator which has a closed form, and thus outperforms the previously proposed kernel quasi-maximum likelihood (kernel-QML) estimator for small samples. The kernel-NLS estimator is simple, works under mild moment assumptions and avoids some of the parameter space restrictions imposed by the kernel-QML estimator. Theoretical evidence shows that the kernel-NLS estimator has the same rate of convergence as the kernel-QML estimator. Due to the kernel-NLS estimator's ease of computation, computationally intensive procedures can be used: we propose a prediction-based cross-validation method for selecting the bandwidth of the kernel-NLS estimator, and we use a residual-based bootstrap scheme to bootstrap the tvARCH process. The bootstrap sample is used to obtain pointwise confidence intervals for the kernel-NLS estimator. It is shown that the distributions of the bootstrap estimator and of the 'true' tvARCH estimator asymptotically coincide. We illustrate our estimation method on a variety of currency exchange and stock index data, for which we obtain both good fits to the data and accurate forecasts. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/07-AOS510.
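To illustrate why a closed-form, least-squares-type estimator is computationally attractive, the sketch below fits tvARCH(p) coefficients at a rescaled time u0 by kernel-weighted least squares of squared returns on their lags. This is a plain weighted-least-squares illustration under assumed names (tvarch_kernel_ls, Gaussian kernel, bandwidth); the normalisation that gives the kernel-NLS estimator its mild moment assumptions, and the cross-validated bandwidth choice, are omitted.

```python
# A minimal sketch of a kernel-weighted least-squares fit of tvARCH(p) coefficients
# a_0(u0), ..., a_p(u0) at rescaled time u0, regressing squared returns on their lags
# with Gaussian kernel weights in rescaled time.  This omits the normalisation used by
# the kernel-NLS estimator; the names (tvarch_kernel_ls, bandwidth) are illustrative.
import numpy as np

def tvarch_kernel_ls(x, u0, p=1, bandwidth=0.1):
    """Estimate (a_0(u0), ..., a_p(u0)) from a return series x."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    y = x[p:] ** 2                                           # response: squared returns
    Z = np.column_stack([np.ones(T - p)] +
                        [x[p - j:T - j] ** 2 for j in range(1, p + 1)])  # intercept, lagged squares
    u = np.arange(p, T) / T                                  # rescaled time of each response
    w = np.exp(-0.5 * ((u - u0) / bandwidth) ** 2)           # Gaussian kernel weights
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * Z, sw * y, rcond=None)
    return coef                                              # closed form: one weighted LS solve
```

For instance, tvarch_kernel_ls(returns, u0=0.5, p=1) would return the intercept and lag-1 coefficient estimates at the middle of the sample, and evaluating it on a grid of u0 values traces out the time-varying parameter curves.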