Improved subsample-and-aggregate via the private modified winsorized mean

K. Ramsay¹ and D. Spicker²

¹ Department of Mathematics and Statistics, York University, Toronto, Canada [kramsay2@yorku.ca]

² Department of Mathematics and Statistics, University of New Brunswick, Saint John, Canada [dylan.spicker@unb.ca]

Keywords: Differential privacy – mean estimation – robustness – subsample-and-aggregate – winsorized mean

1 Abstract

We present a univariate, differentially private mean estimator, called the private modified winsorized mean, designed to be used as the aggregator in subsample-and-aggregate. We demonstrate, via real data analysis, that common differentially private multivariate mean estimators may not perform well as the aggregator, even on a dataset of 8000 observations, which motivates our developments. We show that the modified winsorized mean is minimax optimal for several large classes of distributions, even under adversarial contamination. We then consider the modified winsorized mean as the aggregator in subsample-and-aggregate and derive a finite-sample deviation bound for the resulting subsample-and-aggregate estimate. This result yields two important insights: (i) the optimal choice of subsamples depends on the bias of the estimator computed on the subsamples, and (ii) the rate of convergence of the subsample-and-aggregate estimator depends on the robustness of the estimator computed on the subsamples.
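For readers unfamiliar with the framework, the following is a minimal sketch of generic subsample-and-aggregate, not the private modified winsorized mean itself. It assumes a simple stand-in aggregator: each subsample estimate is clipped to fixed, data-independent bounds, the clipped values are averaged, and Laplace noise is added. The function name `subsample_and_aggregate` and the parameters `k`, `lower`, `upper`, and `epsilon` are hypothetical and chosen only for illustration.

```python
import random
import statistics


def subsample_and_aggregate(data, estimator, k, lower, upper, epsilon, rng):
    """Illustrative subsample-and-aggregate with a clipped-mean aggregator.

    Assumption: `lower` and `upper` are fixed in advance, independently of
    the data. This is a stand-in aggregator, not the estimator of the paper.
    """
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Partition into k disjoint subsamples: block j takes every k-th point.
    blocks = [shuffled[j::k] for j in range(k)]
    # Apply the (non-private) estimator to each subsample.
    estimates = [estimator(block) for block in blocks]
    # Clip each subsample estimate to [lower, upper].
    clipped = [min(max(e, lower), upper) for e in estimates]
    # Changing one observation changes at most one subsample estimate, so
    # the mean of the clipped estimates moves by at most (upper - lower) / k.
    sensitivity = (upper - lower) / k
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is
    # Laplace-distributed with that scale (the Laplace mechanism).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return statistics.fmean(clipped) + noise


rng = random.Random(0)
data = [rng.gauss(5.0, 1.0) for _ in range(8000)]
private_mean = subsample_and_aggregate(
    data, statistics.fmean, k=80, lower=0.0, upper=10.0, epsilon=1.0, rng=rng
)
```

In this sketch the noise scale shrinks as the number of subsamples `k` grows, while each subsample estimate is computed from fewer observations; this is the kind of trade-off that the choice of subsamples has to balance.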