# Davidson

## Publications

The most widely used measure of segregation is the so‐called dissimilarity index. It is now well understood that this measure also reflects randomness in the allocation of individuals to units (i.e. it measures deviations from evenness, not deviations from randomness). This leads to potentially large values of the segregation index when unit sizes and/or minority proportions are small, even if there is no underlying systematic segregation. Our response to this is to produce adjustments to the index, based on an underlying statistical model. We specify the assignment problem in a very general way, with differences in conditional assignment probabilities underlying the resulting segregation. From this, we derive a likelihood ratio test for the presence of any systematic segregation, and bias adjustments to the dissimilarity index. We further develop the asymptotic distribution theory for testing hypotheses concerning the magnitude of the segregation index and show that the use of bootstrap methods can improve the size and power properties of test procedures considerably. We illustrate these methods by comparing dissimilarity indices across school districts in England to measure social segregation.
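The small-unit effect described above can be illustrated with a minimal sketch. It assumes the standard two-group form of the index, D = ½ Σᵢ |aᵢ/A − bᵢ/B|, and a purely random (binomial) allocation with no systematic segregation. The function names and parameter values are hypothetical choices for illustration, not the adjustments or the test proposed in the paper.

```python
import numpy as np

def dissimilarity_index(minority, majority):
    """Two-group dissimilarity index: 0.5 * sum_i |a_i/A - b_i/B|."""
    minority = np.asarray(minority, dtype=float)
    majority = np.asarray(majority, dtype=float)
    if minority.sum() == 0 or majority.sum() == 0:
        raise ValueError("both groups must be non-empty")
    shares_min = minority / minority.sum()
    shares_maj = majority / majority.sum()
    return 0.5 * np.abs(shares_min - shares_maj).sum()

def simulate_random_allocation(n_units, unit_size, p_minority, rng):
    """Index value when individuals are allocated purely at random.

    Each unit draws its minority count from Binomial(unit_size, p_minority),
    so any value above zero reflects randomness, not systematic segregation.
    """
    minority = rng.binomial(unit_size, p_minority, size=n_units)
    majority = unit_size - minority
    return dissimilarity_index(minority, majority)

rng = np.random.default_rng(0)
for size in (10, 100, 1000):  # illustrative unit sizes
    draws = [simulate_random_allocation(200, size, 0.1, rng) for _ in range(100)]
    print(size, round(float(np.mean(draws)), 3))
```

Under these assumptions the average index shrinks as unit size grows, which is the upward bias in small units that motivates an adjustment.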

An axiomatic approach is used to develop a one-parameter family of measures of divergence between distributions. These measures can be used to perform goodness-of-fit tests with good statistical properties. Asymptotic theory shows that the test statistics have well-defined limiting distributions which are, however, analytically intractable. A parametric bootstrap procedure is proposed for implementation of the tests. The procedure is shown to work very well in a set of simulation experiments, and to compare favorably with other commonly used goodness-of-fit tests. By varying the parameter of the statistic, one can obtain information on how the distribution that generated a sample diverges from the target family of distributions when the true distribution does not belong to that family. An empirical application analyzes a U.K. income dataset.
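A parametric bootstrap goodness-of-fit test of the kind mentioned above might be sketched as follows. This is not the paper's divergence statistic: as a stand-in, the sketch uses a Kolmogorov-Smirnov distance, and it assumes an exponential target family with the scale estimated by the sample mean. All function names are hypothetical.

```python
import numpy as np

def ks_stat_exponential(x):
    """KS distance between the sample and an exponential fitted by its mean."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    scale = x.mean()                       # ML estimate of the scale
    cdf = 1.0 - np.exp(-x / scale)         # fitted CDF at the order statistics
    upper = np.arange(1, n + 1) / n - cdf  # EDF just after each point
    lower = cdf - np.arange(0, n) / n      # EDF just before each point
    return max(upper.max(), lower.max())

def parametric_bootstrap_pvalue(x, n_boot=999, seed=0):
    """Bootstrap P value: simulate from the fitted model, re-estimate, compare."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = ks_stat_exponential(x)
    scale_hat = x.mean()
    boot = np.array([
        ks_stat_exponential(rng.exponential(scale_hat, size=x.size))
        for _ in range(n_boot)
    ])
    # Proportion of bootstrap statistics at least as extreme as the observed one
    return (1 + np.sum(boot >= observed)) / (n_boot + 1)
```

Because the parameter is re-estimated in every bootstrap sample, the bootstrap distribution accounts for estimation error, which tabulated asymptotic critical values for a fully specified null do not.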

Economists are often interested in the coefficient of a single endogenous explanatory variable in a linear simultaneous-equations model. One way to obtain a confidence set for this coefficient is to invert the Anderson-Rubin (AR) test. The AR confidence sets that result have correct coverage under classical assumptions. However, AR confidence sets also have many undesirable properties. It is well known that they can be unbounded when the instruments are weak, as is true of any test with correct coverage. However, even when they are bounded, their length may be very misleading, and their coverage conditional on quantities that the investigator can observe (notably, the Sargan statistic for overidentifying restrictions) can be far from correct. A similar property manifests itself, for similar reasons, when a confidence set for a single parameter is based on inverting an F-test for two or more parameters.
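Inverting the AR test can be sketched on a grid of candidate coefficient values. The sketch assumes a model y = βx + u with no exogenous regressors, instruments Z, and classical assumptions under which the AR statistic follows an F(k, n − k) distribution. The critical value is approximated by simulation so that only NumPy is needed. Function names, the grid, and the simulated data are hypothetical.

```python
import numpy as np

def ar_statistic(y, x, Z, beta0):
    """AR statistic for H0: beta = beta0, with instrument matrix Z (n x k)."""
    n, k = Z.shape
    u0 = y - beta0 * x                        # residual implied by the null
    coef, *_ = np.linalg.lstsq(Z, u0, rcond=None)
    explained = Z @ coef                      # projection of u0 on Z
    resid = u0 - explained
    return (explained @ explained / k) / (resid @ resid / (n - k))

def f_critical_value_mc(k, dof, level=0.95, draws=200_000, seed=0):
    """Monte Carlo approximation to the F(k, dof) quantile (an assumption
    made here to avoid a SciPy dependency)."""
    rng = np.random.default_rng(seed)
    return np.quantile(rng.f(k, dof, size=draws), level)

def ar_confidence_set(y, x, Z, grid, level=0.95):
    """Grid points beta0 at which the AR test does not reject."""
    n, k = Z.shape
    crit = f_critical_value_mc(k, n - k, level)
    stats = np.array([ar_statistic(y, x, Z, b) for b in grid])
    return grid[stats <= crit]

# Illustrative data: two instruments, endogenous x, true beta = 1
rng = np.random.default_rng(1)
n, k, beta = 500, 2, 1.0
Z = rng.standard_normal((n, k))
v = rng.standard_normal(n)
u = 0.5 * v + rng.standard_normal(n)          # correlated with v: x is endogenous
x = Z @ np.full(k, 0.5) + v
y = beta * x + u
grid = np.linspace(-1.0, 3.0, 401)
accepted = ar_confidence_set(y, x, Z, grid)
if accepted.size:
    print("accepted range on grid:", accepted.min(), accepted.max())
else:
    print("empty confidence set on this grid")
```

With weak instruments the accepted points can extend to the edges of any finite grid, which is how unboundedness shows up in a grid-based inversion. The set can also be empty when the overidentifying restrictions are strongly rejected.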

We study several methods of constructing confidence sets for the coefficient of the single right-hand-side endogenous variable in a linear equation with weak instruments. Two of these are based on conditional likelihood ratio (CLR) tests, and the others are based on inverting t statistics or the bootstrap P values associated with them. We propose a new method for constructing bootstrap confidence sets based on t statistics. In large samples, the procedures that generally work best are CLR confidence sets using asymptotic critical values and bootstrap confidence sets based on limited-information maximum likelihood (LIML) estimates.

The understanding of causal chains and mechanisms is an essential part of any scientific activity that aims at better explanation of its subject matter, and better understanding of it. While any account of causality requires that a cause should precede its effect, accounts of causality in physics are complicated by the fact that the role of time in current theoretical physics has evolved very substantially throughout the twentieth century. In this article, I review the status of time and causality in physics, both the classical physics of the nineteenth century, and modern physics based on relativity and quantum mechanics. I then move on to econometrics, with some mention of statistics more generally, and emphasise the role of models in making sense of causal notions, and their place in scientific explanation.

Asymptotic and bootstrap tests are studied for testing whether there is a relation of stochastic dominance between two distributions. These tests have a null hypothesis of nondominance, with the advantage that, if this null is rejected, then all that is left is dominance. This also leads us to define and focus on restricted stochastic dominance, the only empirically useful form of dominance relation that we can seek to infer in many settings. One testing procedure that we consider is based on an empirical likelihood ratio. The computations necessary for obtaining a test statistic also provide estimates of the distributions under study that satisfy the null hypothesis, on the frontier between dominance and nondominance. These estimates can be used to perform dominance tests that can turn out to provide much improved reliability of inference compared with the asymptotic tests so far proposed in the literature.

Income distributions are usually characterized by a heavy right-hand tail. Apart from any ethical considerations raised by the presence among us of the very rich, statistical inference is complicated by the need to consider distributions of which the moments may not exist. In extreme cases, no valid inference about expectations is possible until restrictions are imposed on the class of distributions admitted by econometric models. It is therefore important to determine the limits of conventional inference in the presence of heavy tails, and, in particular, of bootstrap inference. In this paper, recent progress in the field is reviewed, and examples given of how inference may fail, and of the sorts of conditions that can be imposed to ensure valid inference.

This paper attempts to provide a synthetic view of varied techniques available for performing inference on income distributions. Two main approaches can be distinguished: one in which the object of interest is some index of income inequality or poverty, the other based on notions of stochastic dominance. From the statistical point of view, many techniques are common to both approaches, although of course some are specific to one of them. I assume throughout that inference about population quantities is to be based on a sample or samples, and, formally, all randomness is due to that of the sampling process. Inference can be either asymptotic or bootstrap based. In principle, the bootstrap is an ideal tool, since in this paper I ignore issues of complex sampling schemes and suppose that observations are IID. However, both bootstrap inference and, to a considerably greater extent, asymptotic inference can fall foul of difficulties associated with the heavy right-hand tails observed with many income distributions. I mention some recent attempts to circumvent these difficulties.

Testing for a unit root in a series obtained by summing a stationary MA(1) process with a parameter close to -1 leads to serious size distortions under the null, on account of the near cancellation of the unit root by the MA component in the driving stationary series. The situation is analysed from the point of view of bootstrap testing, and an exact quantitative account is given of the error in rejection probability of a bootstrap test. A particular method of estimating the MA parameter is recommended, as it leads to very little distortion even when the MA parameter is close to -1. A new bootstrap procedure with still better properties is proposed. While more computationally demanding than the usual bootstrap, it is much less so than the double bootstrap.

Bayesians and non-Bayesians, often called frequentists, seem to be perpetually at loggerheads on fundamental questions of statistical inference. This paper takes as agnostic a stand as is possible for a practising frequentist, and tries to elicit a Bayesian answer to questions of interest to frequentists. The argument is based on my presentation at a debate organised by the Rimini Centre for Economic Analysis, between me as the frequentist "advocate", and Christian Robert on the Bayesian side.