Location and Scatter Halfspace Depth for α-Symmetric Distributions

F. Bočinec¹ and S. Nagy²

Abstract

In a seminal work, Chen et al. [2018] demonstrated that location and scatter medians based on halfspace depth achieve the optimal convergence rate under contaminated elliptical distributions. We extend these results by deriving concentration inequalities for halfspace medians under contaminated α-symmetric distributions, which encompass both elliptically symmetric and heavy-tailed distributions. Additionally, we identify key properties of scatter halfspace depth under α-symmetry.

  • ¹ Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic [bocinec@karlin.mff.cuni.cz]
  • ² Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic [nagy@karlin.mff.cuni.cz]

Keywords: Halfspace depth – Scatter halfspace depth – α-symmetric distributions – Contamination model

1 Halfspace Depth and α-Symmetric Distributions

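In this contribution, we focus on the location and scatter estimators induced by the location and scatter halfspace depths, respectively. The halfspace depth is a well-studied tool in nonparametric statistics aimed at establishing concepts such as ordering, ranks, or quantiles for multivariate datasets. Denote by $\mathcal{P}(\mathbb{R}^d)$ the set of Borel probability measures on $\mathbb{R}^d$. The halfspace depth [Tukey, 1975] of a point $\boldsymbol{x} \in \mathbb{R}^d$ with respect to $P \in \mathcal{P}(\mathbb{R}^d)$ is defined as

$$\mathcal{D}(\boldsymbol{x}; P) = \inf_{\boldsymbol{u} \in \mathbb{S}^{d-1}} \mathbb{P}\left(\langle \boldsymbol{X}, \boldsymbol{u} \rangle \le \langle \boldsymbol{x}, \boldsymbol{u} \rangle\right),$$

where $\boldsymbol{X} \sim P$ and $\mathbb{S}^{d-1} = \{\boldsymbol{u} \in \mathbb{R}^d : \|\boldsymbol{u}\|_2^2 = \langle \boldsymbol{u}, \boldsymbol{u} \rangle = 1\}$ is the unit sphere in $\mathbb{R}^d$. It quantifies the centrality of $\boldsymbol{x}$ within the geometry of the mass of $P$, and the deepest point $\boldsymbol{\mu}_{hs}$ is called the halfspace median of $P$.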
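As an illustration of the definition above, the following minimal sketch approximates the empirical halfspace depth of a point with respect to a sample. It is not an exact algorithm: the infimum over the unit sphere is replaced by a minimum over finitely many random directions, so the returned value is an upper bound on the exact empirical depth. The function name `halfspace_depth` and the parameters `n_dirs` and `seed` are hypothetical choices made for this example.

```python
import numpy as np

def halfspace_depth(x, sample, n_dirs=2000, seed=0):
    """Approximate the empirical halfspace depth of x (random-direction sketch).

    The infimum over the unit sphere is replaced by a minimum over n_dirs
    random unit vectors, so the result bounds the exact depth from above.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    x = np.asarray(x, dtype=float)
    # Directions drawn uniformly on the unit sphere S^{d-1}.
    u = rng.standard_normal((n_dirs, sample.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    # Entry (i, k) holds <X_i, u_k> - <x, u_k>; one column per direction.
    proj = (sample - x) @ u.T
    # Fraction of sample points in the closed halfspace {y : <y, u> <= <x, u>}.
    frac = np.mean(proj <= 0.0, axis=0)
    return float(frac.min())

# Usage: a central point is deeper than a point far outside the data cloud.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
depth_center = halfspace_depth(np.zeros(2), X)
depth_far = halfspace_depth(np.array([10.0, 0.0]), X)
```

A halfspace median would be obtained by maximizing this quantity over $\boldsymbol{x}$; that optimization is not attempted here.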

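Similarly, a robust estimator of scatter is obtained using the scatter halfspace depth [Chen et al., 2018, Paindaveine and Van Bever, 2018]. The scatter halfspace depth of a $d \times d$ positive definite matrix $\boldsymbol{\Sigma}$ with respect to $P \in \mathcal{P}(\mathbb{R}^d)$ is defined as

$$\mathcal{SD}(\boldsymbol{\Sigma}; P) = \inf_{\boldsymbol{u} \in \mathbb{S}^{d-1}} \min\left\{ \mathbb{P}\left(|\langle \boldsymbol{X} - T(P), \boldsymbol{u} \rangle| \le \sqrt{\boldsymbol{u}^{\mathsf{T}} \boldsymbol{\Sigma} \boldsymbol{u}}\right),\ \mathbb{P}\left(|\langle \boldsymbol{X} - T(P), \boldsymbol{u} \rangle| \ge \sqrt{\boldsymbol{u}^{\mathsf{T}} \boldsymbol{\Sigma} \boldsymbol{u}}\right) \right\},$$

where $T \colon \mathcal{P}(\mathbb{R}^d) \to \mathbb{R}^d$ is a properly chosen location functional. A matrix $\boldsymbol{\Sigma}_{hs}$ maximizing the scatter halfspace depth, called the scatter halfspace median matrix, offers a nonparametric alternative to the usual scatter estimators.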
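The scatter halfspace depth can be approximated on a sample in the same spirit. The sketch below is only illustrative and rests on two assumptions that are not taken from the text: the infimum over directions is replaced by a minimum over random unit vectors, and the location functional $T(P)$ defaults to the coordinatewise sample median unless a center is supplied. The name `scatter_halfspace_depth` and its parameters are hypothetical.

```python
import numpy as np

def scatter_halfspace_depth(sigma, sample, center=None, n_dirs=2000, seed=0):
    """Approximate the empirical scatter halfspace depth of sigma."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Assumption: T(P) is the coordinatewise median unless given explicitly.
    if center is None:
        center = np.median(sample, axis=0)
    # Random directions on the unit sphere S^{d-1}.
    u = rng.standard_normal((n_dirs, sample.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    # |<X_i - T(P), u_k>| for every sample point i and direction k.
    abs_proj = np.abs((sample - center) @ u.T)
    # sqrt(u_k^T Sigma u_k) for every direction k.
    scale = np.sqrt(np.einsum("kd,de,ke->k", u, sigma, u))
    inner = np.mean(abs_proj <= scale, axis=0)  # mass inside the slab
    outer = np.mean(abs_proj >= scale, axis=0)  # mass outside the slab
    return float(np.minimum(inner, outer).min())

# Usage on a standard normal sample in dimension 2.
rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 2))
d_id = scatter_halfspace_depth(np.eye(2), X)
d_med = scatter_halfspace_depth(0.455 * np.eye(2), X)
d_big = scatter_halfspace_depth(100.0 * np.eye(2), X)
```

For a standard normal distribution, the multiple $c\,\boldsymbol{I}$ with $\sqrt{c} \approx 0.674$ (the 0.75 quantile of the univariate standard normal) balances the two probabilities at 1/2. This is why `0.455 * np.eye(2)` is expected to be deeper than the identity matrix in this example.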

We explore the properties of halfspace depths under α-symmetric distributions. The distribution of a d-variate random vector 𝑿 is said to be α-symmetric [Fang et al., 1990] if the characteristic function of 𝑿 takes the form

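$$\psi_{\boldsymbol{X}}(\boldsymbol{t}) = \mathbb{E} \exp\left(i \langle \boldsymbol{t}, \boldsymbol{X} \rangle\right) = \phi\left(\|\boldsymbol{t}\|_\alpha\right) \quad \text{for all } \boldsymbol{t} \in \mathbb{R}^d,$$

where $\phi$ is a continuous function on $\mathbb{R}$ and $\|\boldsymbol{t}\|_\alpha = \left(\sum_{i=1}^d |t_i|^\alpha\right)^{1/\alpha}$. For $\alpha = 2$, the class of α-symmetric distributions reduces to spherically symmetric distributions. For $\alpha \in (0, 2)$, the α-symmetric distributions form a rich family of multivariate models with numerous important applications. In particular, they include multivariate stable distributions, one of the most significant classes of distributions in probability theory.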
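A concrete instance may help fix ideas. A random vector with independent standard Cauchy components is 1-symmetric, since its characteristic function equals $\exp(-\sum_i |t_i|)$, that is, $\phi(r) = \exp(-r)$ evaluated at the 1-norm of $\boldsymbol{t}$. The sketch below checks this numerically by Monte Carlo; the helper `empirical_cf` and the chosen sample size are assumptions made for this illustration only.

```python
import numpy as np

def empirical_cf(t, sample):
    """Empirical characteristic function: average of exp(i <t, X_j>)."""
    return np.mean(np.exp(1j * (sample @ t)))

rng = np.random.default_rng(0)
# Independent standard Cauchy components: an alpha-symmetric vector, alpha = 1.
X = rng.standard_cauchy((200_000, 3))

t = np.array([0.3, -0.5, 0.2])        # ||t||_1 = 1
t_axis = np.array([1.0, 0.0, 0.0])    # a different vector with ||t||_1 = 1

cf_hat = empirical_cf(t, X)
cf_hat_axis = empirical_cf(t_axis, X)
# phi(||t||_1) with phi(r) = exp(-r)
cf_theory = np.exp(-np.sum(np.abs(t)))
```

Both empirical values should be close to `cf_theory`, reflecting that the characteristic function depends on $\boldsymbol{t}$ only through its α-norm.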

2 Key Results

Our contribution extends the recent remarkable result of Chen et al. [2018], which demonstrates that under the classical Huber ε-contamination model, both the location halfspace median $\boldsymbol{\mu}_{hs} \in \mathbb{R}^d$ and the scatter halfspace median matrix $\boldsymbol{\Sigma}_{hs}$ are minimax optimal for elliptical distributions. We aim to generalize some of these findings to the broader class of α-symmetric distributions.
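For readers unfamiliar with the Huber ε-contamination model: observations are drawn from a mixture $(1 - \varepsilon) P + \varepsilon Q$, where $P$ is the distribution of interest and $Q$ is an arbitrary contaminating distribution. The following sketch shows one way to simulate such data. The function `sample_huber`, the Cauchy choice for $P$, and the point mass used for $Q$ are all assumptions made for illustration and are not taken from the cited works.

```python
import numpy as np

def sample_huber(n, eps, draw_clean, draw_outlier, seed=0):
    """Draw n points from the mixture (1 - eps) * P + eps * Q.

    draw_clean(rng, n) and draw_outlier(rng, n) each return an (n, d) array
    of draws from P and Q, respectively.
    """
    rng = np.random.default_rng(seed)
    # Each observation is independently contaminated with probability eps.
    is_outlier = rng.random(n) < eps
    clean = draw_clean(rng, n)
    bad = draw_outlier(rng, n)
    data = np.where(is_outlier[:, None], bad, clean)
    return data, is_outlier

# Usage: P has independent Cauchy components, Q is a point mass at (50, 50).
data, flags = sample_huber(
    n=10_000,
    eps=0.1,
    draw_clean=lambda rng, n: rng.standard_cauchy((n, 2)),
    draw_outlier=lambda rng, n: np.full((n, 2), 50.0),
)
```

In practice the flags are unobserved; they are returned here only so that the simulated contamination can be inspected.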

In the context of location estimation, we establish an upper bound for the estimation deviation of the location halfspace median under the Huber contamination model for α-symmetric distributions. However, a similar result for the standard scatter halfspace median matrix is achievable only under the assumption of elliptical symmetry (α=2). This limitation arises from the fact that ellipticity is inherently tied to the definition of scatter halfspace depth. To address this, we modify the scatter halfspace depth to better accommodate α-symmetric distributions and derive an upper bound for the corresponding α-scatter median matrix. Furthermore, we identify several key properties of scatter halfspace depth for α-symmetric distributions, including continuity and uniqueness of the scatter median.

References

  • Bočinec and Nagy [to appear] Filip Bočinec and Stanislav Nagy. Concentration inequalities for location and scatter halfspace median under contaminated α-symmetric distribution. Manuscript submitted for publication, to appear.
  • Chen et al. [2018] Mengjie Chen, Chao Gao, and Zhao Ren. Robust covariance and scatter matrix estimation under Huber’s contamination model. Ann. Statist., 46(5):1932–1960, 2018.
  • Fang et al. [1990] Kai Tai Fang, Samuel Kotz, and Kai Wang Ng. Symmetric multivariate and related distributions, volume 36 of Monographs on Statistics and Applied Probability. Chapman & Hall, London, 1990.
  • Paindaveine and Van Bever [2018] Davy Paindaveine and Germain Van Bever. Halfspace depths for scatter, concentration and shape matrices. Ann. Statist., 46(6B):3276–3307, 2018.
  • Tukey [1975] John W. Tukey. Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians, Vol. 2, pages 523–531, 1975.