On (in)consistency of M-estimators under contamination

J. Klooster¹ and B. Nielsen²

  • 1 Department of Economics, Econometrics and Finance, University of Groningen, Groningen, The Netherlands [j.klooster@rug.nl]
  • 2 Nuffield College & Department of Economics, University of Oxford, Oxford, United Kingdom [bent.nielsen@nuffield.ox.ac.uk]

Keywords: Robust statistics – Location-scale – Boundedness

In the early 1960s, there was a growing awareness that standard estimators for normal models may not fare well under deviations from normality. Huber [1964] proposed maximum likelihood-type (M) estimators for location and scale and found that they are more robust to such deviations than traditional estimators. Since then, robustness has been understood to mean that an estimate is distorted only in a bounded way when arbitrary observations are added to the sample [Hampel, 1971]. While statistical inference theory has been developed for data with infinitesimal contamination [Heritier and Ronchetti, 1994], little theory is available for cases with more contamination.

We analyze popular M-estimators for location and scale when a fixed proportion of observations is contaminated. We consider two classes of M-estimators. The first class has an unbounded objective function and includes the median and the Huber estimator. We find that these estimators are bounded in probability when more than half of the data is uncontaminated, which attests to their robustness. Yet these estimators are typically inconsistent under contamination.
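The contrast between boundedness and consistency can be illustrated with a small simulation. The sketch below is not the authors' setup: it assumes a hypothetical contamination model in which each observation is N(0, 1) with probability 1 - eps and otherwise equals a fixed large value `outlier`. Under that assumption the population median solves (1 - eps) Φ(m) = 1/2, so it does not depend on how large the outlier is (bounded), but it differs from the true location 0 whenever eps > 0 (inconsistent). The function names are illustrative.

```python
import random
from statistics import NormalDist, median


def contaminated_sample(n, eps, outlier, rng):
    """Draw n points: N(0, 1) with probability 1 - eps, else the fixed value `outlier`.

    Assumed contamination model, chosen only for illustration.
    """
    return [outlier if rng.random() < eps else rng.gauss(0.0, 1.0) for _ in range(n)]


def limit_of_median(eps):
    """Population median of the assumed mixture when the outlier lies far to the right.

    Solves (1 - eps) * Phi(m) = 1/2, which requires eps < 1/2.
    """
    return NormalDist().inv_cdf(0.5 / (1.0 - eps))


rng = random.Random(0)
eps = 0.2
for outlier in (10.0, 1000.0):
    x = contaminated_sample(20000, eps, outlier, rng)
    # The sample median stays near the same nonzero limit for both outlier sizes.
    print(outlier, round(median(x), 3), round(limit_of_median(eps), 3))
```

Making the outlier a hundred times larger leaves the sample median essentially unchanged, while increasing the sample size moves it toward `limit_of_median(eps)` rather than toward 0.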

The second class has a bounded objective function and includes the Tukey estimator. We derive a lower bound on the proportion of contamination ensuring that these estimators are bounded in probability and consistent under contamination. However, nuisance-parameter-free inference requires a consistent scale estimator. We show that robust scale estimators such as the interquartile range and the median absolute deviation are inconsistent under contamination.
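The inconsistency of a robust scale estimator can likewise be seen numerically. The following sketch again assumes a hypothetical point-mass contamination model (N(0, 1) with probability 1 - eps, a fixed far-away value otherwise), which is an illustration and not the setting analyzed by the authors. It computes the median absolute deviation with the usual normal-consistency factor 1.4826 and compares clean and contaminated samples of the same size.

```python
import random
from statistics import median


def mad_scaled(xs):
    """Median absolute deviation, scaled by 1.4826 so it estimates sigma under normality."""
    m = median(xs)
    return 1.4826 * median([abs(x - m) for x in xs])


def draw(n, eps, outlier, rng):
    """Assumed model: N(0, 1) with probability 1 - eps, else the fixed value `outlier`."""
    return [outlier if rng.random() < eps else rng.gauss(0.0, 1.0) for _ in range(n)]


rng = random.Random(0)
clean = draw(20000, 0.0, 50.0, rng)   # no contamination: scaled MAD should be close to 1
dirty = draw(20000, 0.2, 50.0, rng)   # 20% contamination: scaled MAD settles above 1
print("clean:       ", round(mad_scaled(clean), 3))
print("contaminated:", round(mad_scaled(dirty), 3))
```

In this sketch the contaminated estimate remains finite but stays noticeably above the true scale of 1 even with a large sample, which is the kind of behavior that complicates inference based on a plug-in scale.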

References

  • Hampel [1971] F. R. Hampel. A general qualitative definition of robustness. Annals of Mathematical Statistics, 42(6):1887–1896, 1971.
  • Heritier and Ronchetti [1994] S. Heritier and E. Ronchetti. Robust bounded-influence tests in general parametric models. Journal of the American Statistical Association, 89(427):897–904, 1994.
  • Huber [1964] P. J. Huber. Robust estimation of a location parameter. Annals of Mathematical Statistics, 35(1):73–101, 1964.