Davidson
Publications
The bootstrap is a technique for performing statistical inference. The underlying idea is that most properties of an unknown distribution can be estimated as the same properties of an estimate of that distribution. In most cases, these properties must be estimated by a simulation experiment. The parametric bootstrap can be used when a statistical model is estimated by maximum likelihood, since the parameter estimates thus obtained serve to characterise a distribution that can subsequently be used to generate simulated data sets. Simulated test statistics or estimators can then be computed for each of these data sets, and the distribution of these simulated quantities serves as an estimate of their distribution under the unknown true distribution. The most popular sort of bootstrap is based on resampling the observations of the original data set with replacement in order to constitute simulated data sets, which typically contain some of the original observations more than once, and others not at all. A special case of the bootstrap is a Monte Carlo test, whereby the test statistic has the same distribution for all data distributions allowed by the null hypothesis under test. A Monte Carlo test permits exact inference with the probability of Type I error equal to the significance level. More generally, there are two Golden Rules which, when followed, lead to inference that, although not exact, is often a striking improvement on inference based on asymptotic theory. The bootstrap also permits construction of confidence intervals of improved quality. Some techniques are discussed for data that are heteroskedastic, autocorrelated, or clustered.
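As a rough illustration of the resampling bootstrap described above (not taken from the paper), the following Python sketch computes a bootstrap P value for a test of a population mean; the t-type statistic, the recentring used to impose the null, and the number of replications are illustrative choices.

```python
import numpy as np

def bootstrap_pvalue(x, mu0=0.0, B=999, rng=None):
    """Bootstrap P value for H0: E[x] = mu0, using a t-type statistic.

    Bootstrap samples are drawn by resampling the observations with
    replacement, after recentring them so that the null holds in the
    bootstrap world."""
    rng = np.random.default_rng(rng)
    n = len(x)
    t_hat = np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)

    # Recentre so that the bootstrap DGP satisfies the null hypothesis.
    x_null = x - x.mean() + mu0

    t_star = np.empty(B)
    for b in range(B):
        # Some observations appear more than once, others not at all.
        xb = rng.choice(x_null, size=n, replace=True)
        t_star[b] = np.sqrt(n) * (xb.mean() - mu0) / xb.std(ddof=1)

    # Bootstrap P value: proportion of simulated statistics at least as extreme.
    return np.mean(np.abs(t_star) >= np.abs(t_hat))

if __name__ == "__main__":
    x = np.random.default_rng(42).normal(loc=0.3, scale=1.0, size=50)
    print(bootstrap_pvalue(x, mu0=0.0, B=999, rng=1))
```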
The standard forms of bootstrap iteration are very computationally demanding. As a result, there have been several attempts to alleviate the computational burden by use of approximations. In this paper, we extend the fast double bootstrap of Davidson and MacKinnon (2007) to higher orders of iteration, and provide algorithms for their implementation. The new methods make computational demands that increase only linearly with the level of iteration, unlike standard procedures, whose demands increase exponentially. In a series of simulation experiments, we show that the fast triple bootstrap improves on both the standard and fast double bootstraps, in the sense that it suffers from less size distortion under the null with no accompanying loss of power.
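A minimal sketch of the fast double bootstrap P value, as it is usually described following Davidson and MacKinnon (2007), is given below; the interface (user-supplied `statistic`, `fit_dgp`, and `simulate` callables) and the upper-tail rejection rule are assumptions for illustration, not the paper's implementation of the higher-order extensions.

```python
import numpy as np

def fast_double_bootstrap_pvalue(data, statistic, fit_dgp, simulate, B=999, rng=None):
    """Fast double bootstrap (FDB) P value for a statistic that rejects in
    the upper tail.  Only one second-level sample is drawn per first-level
    sample, so the cost grows linearly rather than quadratically in B."""
    rng = np.random.default_rng(rng)
    tau_hat = statistic(data)
    dgp_hat = fit_dgp(data)

    tau1 = np.empty(B)   # first-level bootstrap statistics
    tau2 = np.empty(B)   # one second-level statistic per first-level sample
    for j in range(B):
        d1 = simulate(dgp_hat, rng)
        tau1[j] = statistic(d1)
        d2 = simulate(fit_dgp(d1), rng)   # a single second-level sample
        tau2[j] = statistic(d2)

    # Ordinary (single) bootstrap P value.
    p1 = np.mean(tau1 > tau_hat)

    # FDB correction: compare the first-level statistics with the
    # (1 - p1) quantile of the second-level statistics.
    q = np.quantile(tau2, 1.0 - p1)
    return np.mean(tau1 > q)

if __name__ == "__main__":
    # Illustrative use: parametric bootstrap test of H0: mean = 0 vs mean > 0.
    stat = lambda d: np.sqrt(len(d)) * d.mean() / d.std(ddof=1)
    fit = lambda d: d.std(ddof=1)                 # DGP under H0: N(0, sigma^2)
    sim = lambda sigma, rng: rng.normal(0.0, sigma, size=100)
    data = np.random.default_rng(0).normal(0.2, 1.0, size=100)
    print(fast_double_bootstrap_pvalue(data, stat, fit, sim, B=499, rng=1))
```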
In this study, we model realized volatility constructed from intra-day high-frequency data. We explore the possibility of confusing long memory and structural breaks in the realized volatility of the following spot exchange rates: EUR/USD, EUR/JPY, EUR/CHF, EUR/GBP, and EUR/AUD. The results show evidence for the presence of long memory in the exchange rates' realized volatility. From the Bai-Perron test, we found structural break points that match significant events in financial markets. Furthermore, the findings provide strong evidence in favour of the presence of long memory.
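For readers unfamiliar with the construction, the sketch below computes a daily realized volatility as the square root of the sum of squared intra-day log returns; this is a standard textbook construction, and the sampling frequency and any microstructure corrections used in the paper may differ.

```python
import numpy as np

def realized_volatility(intraday_prices):
    """Daily realized volatility: square root of the sum of squared
    intra-day log returns for one trading day."""
    log_p = np.log(np.asarray(intraday_prices, dtype=float))
    returns = np.diff(log_p)
    return np.sqrt(np.sum(returns ** 2))

# Example with a handful of five-minute EUR/USD-style quotes (made-up numbers).
prices = [1.1000, 1.1012, 1.1005, 1.1020, 1.1018, 1.1031]
print(realized_volatility(prices))
```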
Conventional wisdom says that the middle classes in many developed countries have recently suffered losses, in terms of both the share of the total population belonging to the middle class, and also their share in total income. Here, distribution-free methods are developed for inference on these shares, by deriving expressions for the asymptotic variances of the sample estimates of the shares, and for the covariance of these estimates. Asymptotic inference can be undertaken based on asymptotic normality. Bootstrap inference can be expected to be more reliable, and appropriate bootstrap procedures are proposed. As an illustration, samples of individual earnings drawn from Canadian census data are used to test various hypotheses about the middle-class shares, and confidence intervals for them are computed. It is found that, for the earlier censuses, sample sizes are large enough for asymptotic and bootstrap inference to be almost identical, but that, in the twenty-first century, the bootstrap fails on account of a strange phenomenon whereby many presumably different incomes in the data are rounded to one and the same value. Another difference between the centuries is the appearance of heavy right-hand tails in the income distributions of both men and women.
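A simple sketch of the two shares and a percentile-bootstrap confidence interval follows; the definition of the middle class as incomes between 0.75 and 1.5 times the median is a common illustrative convention, not necessarily the paper's, and the bootstrap procedure shown is the basic percentile method rather than the procedures proposed there.

```python
import numpy as np

def middle_class_shares(income, lo=0.75, hi=1.5):
    """Population and income shares of the 'middle class', defined here
    (illustratively) as incomes between lo and hi times the median."""
    income = np.asarray(income, dtype=float)
    med = np.median(income)
    in_mid = (income >= lo * med) & (income <= hi * med)
    pop_share = in_mid.mean()
    inc_share = income[in_mid].sum() / income.sum()
    return pop_share, inc_share

def bootstrap_ci(income, B=999, alpha=0.05, rng=None):
    """Percentile bootstrap confidence intervals for the two shares."""
    rng = np.random.default_rng(rng)
    n = len(income)
    draws = np.array([middle_class_shares(rng.choice(income, n, replace=True))
                      for _ in range(B)])
    lo_q, hi_q = alpha / 2, 1 - alpha / 2
    return (np.quantile(draws[:, 0], [lo_q, hi_q]),
            np.quantile(draws[:, 1], [lo_q, hi_q]))

income = np.random.default_rng(0).lognormal(mean=10, sigma=0.6, size=2000)
print(middle_class_shares(income))
print(bootstrap_ci(income, B=499, rng=1))
```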
The bootstrap is typically less reliable in the context of time-series models with serial correlation of unknown form than when regularity conditions for the conventional IID bootstrap apply. It is, therefore, useful to have diagnostic techniques capable of evaluating bootstrap performance in specific cases. Those suggested in this paper are closely related to the fast double bootstrap (FDB) and are not computationally intensive. They can also be used to gauge the performance of the FDB itself. Examples of bootstrapping time series are presented, which illustrate the diagnostic procedures, and show how the results can cast light on bootstrap performance.
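The diagnostics themselves are not reproduced here, but the sketch below shows one common way of bootstrapping a time series with serial correlation of unknown form, the moving-block bootstrap, to which such diagnostics could be applied; the block length, the AR(1) example series, and the choice of the sample mean as the statistic are all illustrative assumptions.

```python
import numpy as np

def moving_block_bootstrap(x, block_len=10, rng=None):
    """One moving-block bootstrap resample of the series x: overlapping
    blocks of length block_len are drawn with replacement and concatenated,
    preserving short-range serial dependence within blocks."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    resample = np.concatenate([x[s:s + block_len] for s in starts])
    return resample[:n]

# Bootstrap distribution of the sample mean of an AR(1) series (illustrative).
rng = np.random.default_rng(3)
e = rng.normal(size=500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.6 * x[t - 1] + e[t]
means = [moving_block_bootstrap(x, block_len=25, rng=b).mean() for b in range(499)]
print(np.std(means))
```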
The bootstrap can be validated by considering the sequence of P values obtained by bootstrap iteration, rather than asymptotically. If this sequence converges to a random variable with the uniform U(0,1) distribution, the bootstrap is valid. Here, the model is made discrete and finite, characterised by a three-dimensional array of probabilities. This renders bootstrap iteration to any desired order feasible. A unit-root test for a process driven by a stationary MA(1) process is known to be unreliable when the MA(1) parameter is near −1. Iteration of the bootstrap P value to convergence achieves reliable inference unless the parameter value is very close to −1.
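To make the idea of iterating a bootstrap P value concrete, the sketch below implements one level of iteration, the standard double-bootstrap adjusted P value; the callable interface and upper-tail rejection rule are illustrative assumptions, and this brute-force scheme (with cost of order B1 times B2) is not the discretised, finite model that the paper uses to make iteration to any order feasible.

```python
import numpy as np

def double_bootstrap_pvalue(data, statistic, fit_dgp, simulate,
                            B1=199, B2=199, rng=None):
    """Adjusted P value after one level of bootstrap iteration: the
    proportion of first-level samples whose own (second-level) bootstrap
    P value is no larger than the first-level P value of the data."""
    rng = np.random.default_rng(rng)
    tau_hat = statistic(data)
    dgp_hat = fit_dgp(data)

    tau1 = np.empty(B1)
    p_star = np.empty(B1)
    for j in range(B1):
        d1 = simulate(dgp_hat, rng)
        tau1[j] = statistic(d1)
        dgp1 = fit_dgp(d1)
        tau2 = np.array([statistic(simulate(dgp1, rng)) for _ in range(B2)])
        p_star[j] = np.mean(tau2 > tau1[j])   # second-level P value

    p_hat = np.mean(tau1 > tau_hat)           # first-level P value
    return np.mean(p_star <= p_hat)           # iterated (adjusted) P value

if __name__ == "__main__":
    # Illustrative use: parametric bootstrap test of H0: mean = 0 vs mean > 0.
    stat = lambda d: np.sqrt(len(d)) * d.mean() / d.std(ddof=1)
    fit = lambda d: d.std(ddof=1)
    sim = lambda sigma, rng: rng.normal(0.0, sigma, size=50)
    data = np.random.default_rng(0).normal(0.2, 1.0, size=50)
    print(double_bootstrap_pvalue(data, stat, fit, sim, B1=99, B2=99, rng=1))
```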
Testing the specification of econometric models has come a long way from the t tests and F tests of the classical normal linear model. In this paper, we trace the broad outlines of the development of specification testing, along the way discussing the role of structural versus purely statistical models. Inferential procedures have had to advance in tandem with techniques of estimation, and so we discuss the generalized method of moments, nonparametric inference, empirical likelihood, and estimating functions. Mention is made of some recent literature, in particular on weak instruments, nonparametric identification, and the bootstrap.
A major contention in this paper is that scientific models can be viewed as virtual realities, implemented, or rendered, by mathematical equations or by computer simulations. Their purpose is to help us understand the external reality that they model. In economics, particularly in econometrics, models make use of random elements, so as to provide quantitatively for phenomena that we cannot or do not wish to model explicitly. By varying the realizations of the random elements in a simulation, it is possible to study counterfactual outcomes, which are necessary for any discussion of causality. The bootstrap is virtual reality within an outer reality. The principle of the bootstrap is that, if its virtual reality mimics as closely as possible the reality that contains it, it can be used to study aspects of that outer reality. The idea of bootstrap iteration is explored, and a discrete model discussed that allows investigators to perform iteration to any desired level.
An axiomatic approach is used to develop a one-parameter family of measures of divergence between distributions. These measures can be used to perform goodness-of-fit tests with good statistical properties. Asymptotic theory shows that the test statistics have well-defined limiting distributions which are, however, analytically intractable. A parametric bootstrap procedure is proposed for implementation of the tests. The procedure is shown to work very well in a set of simulation experiments, and to compare favorably with other commonly used goodness-of-fit tests. By varying the parameter of the statistic, one can obtain information on how the distribution that generated a sample diverges from the target family of distributions when the true distribution does not belong to that family. An empirical application analyzes a U.K. income dataset.
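The paper's divergence family is not reproduced here, but the general shape of a parametric-bootstrap goodness-of-fit test is sketched below with a simple KS-type statistic standing in for the divergence measure; the target family (normal), the statistic, and the re-estimation of parameters on each simulated sample are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def gof_statistic(sample):
    """Maximum gap between the empirical CDF and the fitted normal CDF
    (a KS-type statistic, standing in for the paper's divergence measure)."""
    mu, sigma = sample.mean(), sample.std(ddof=1)
    z = np.sort(sample)
    ecdf = np.arange(1, len(z) + 1) / len(z)
    fitted = norm.cdf(z, loc=mu, scale=sigma)
    return np.max(np.abs(ecdf - fitted))

def parametric_bootstrap_pvalue(x, B=999, rng=None):
    """Parametric bootstrap: simulate from the fitted member of the target
    family, re-estimate the parameters on each simulated sample, and compare
    the resulting statistics with the one computed from the data."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    t_hat = gof_statistic(x)
    mu, sigma = x.mean(), x.std(ddof=1)
    t_star = np.array([gof_statistic(rng.normal(mu, sigma, size=len(x)))
                       for _ in range(B)])
    return np.mean(t_star >= t_hat)

x = np.random.default_rng(7).standard_t(df=4, size=200)  # heavy-tailed, not normal
print(parametric_bootstrap_pvalue(x, B=499, rng=2))
```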
The most widely used measure of segregation is the so‐called dissimilarity index. It is now well understood that this measure also reflects randomness in the allocation of individuals to units (i.e. it measures deviations from evenness, not deviations from randomness). This leads to potentially large values of the segregation index when unit sizes and/or minority proportions are small, even if there is no underlying systematic segregation. Our response to this is to produce adjustments to the index, based on an underlying statistical model. We specify the assignment problem in a very general way, with differences in conditional assignment probabilities underlying the resulting segregation. From this, we derive a likelihood ratio test for the presence of any systematic segregation, and bias adjustments to the dissimilarity index. We further develop the asymptotic distribution theory for testing hypotheses concerning the magnitude of the segregation index and show that the use of bootstrap methods can improve the size and power properties of test procedures considerably. We illustrate these methods by comparing dissimilarity indices across school districts in England to measure social segregation.
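The following sketch computes the dissimilarity index and illustrates, by simulation, how purely random allocation of individuals to small units inflates it; the binomial allocation baseline shown here is only an illustration of that point, not the paper's likelihood-ratio test or its bias adjustments.

```python
import numpy as np

def dissimilarity_index(minority, majority):
    """Dissimilarity index D = 0.5 * sum_i |m_i/M - n_i/N|, where m_i and
    n_i are the minority and majority counts in unit i."""
    m = np.asarray(minority, dtype=float)
    n = np.asarray(majority, dtype=float)
    return 0.5 * np.sum(np.abs(m / m.sum() - n / n.sum()))

def randomness_baseline(unit_sizes, minority_prop, S=1000, rng=None):
    """Average D when individuals are allocated to units purely at random
    (binomially), holding unit sizes and the overall minority proportion
    fixed -- positive even with no systematic segregation."""
    rng = np.random.default_rng(rng)
    sizes = np.asarray(unit_sizes)
    draws = []
    for _ in range(S):
        m = rng.binomial(sizes, minority_prop)
        draws.append(dissimilarity_index(m, sizes - m))
    return float(np.mean(draws))

# Illustrative data: 20 small units of 30 pupils, 20% minority overall.
sizes = np.full(20, 30)
print(randomness_baseline(sizes, 0.2, S=2000, rng=5))
```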