Regularized estimation of Monge-Kantorovich quantiles for spherical data

B. Bercu (1), J. Bigot (2), G. Thurin (3)

  • (1) Institut de Mathématiques de Bordeaux et CNRS [bernard.bercu@math.u-bordeaux.fr]
  • (2) Institut de Mathématiques de Bordeaux et CNRS [jeremie.bigot@math.u-bordeaux.fr]
  • (3) Institut de Mathématiques de Bordeaux et CNRS [gauthier-louis.thurin@math.u-bordeaux.fr]

Keywords: Statistical depth – Spherical data – Optimal transport

Submission for the invited session “Recent advances in statistical depth”.

1 Spherical quantiles

In various situations, data naturally correspond to directions that are modeled as observations belonging to the circle or the unit sphere $\mathbb{S}^{d-1}$ for $d \geq 2$. In the present work, we focus on the concepts of quantiles and depth for spherical data, for which a recent definition leverages ideas from measure transportation [Hallin et al., 2022]. The building block is the optimal transport problem between a target distribution and the uniform probability distribution on the sphere. This spherical extension follows the Euclidean definitions from Chernozhukov et al. [2017], which have given rise to a fast-growing field [Hallin, 2023].

Depending on the context, different estimators offer different benefits. In the directional setting, Hallin et al. [2022] advocate solving an optimal matching problem between two discrete distributions, which ensures finite-sample distribution-freeness of the Monge-Kantorovich (MK) ranks, a property that is crucial for statistical testing. However, regularization is mandatory for applications that require out-of-sample estimates, i.e. when the estimator must interpolate between observations, as for depth-based contours.
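As a toy illustration of the optimal matching mentioned above, the following Python sketch pairs a handful of observations with as many reference points on $\mathbb{S}^2$. Several choices are assumptions made for illustration and are not taken from Hallin et al. [2022]: the cost is half the squared geodesic distance, the reference points are random uniform draws rather than a structured grid, and the assignment is found by exhaustive search, which stands in for a proper assignment solver and is only feasible for very small samples.

```python
import itertools
import math
import random


def sample_uniform_sphere(m, rng):
    """Draw m points uniformly on S^2 by normalizing Gaussian vectors."""
    points = []
    for _ in range(m):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(t * t for t in g))
        points.append(tuple(t / norm for t in g))
    return points


def cost(x, y):
    """Assumed cost: half the squared geodesic distance on S^2."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return 0.5 * math.acos(dot) ** 2


def optimal_matching(data, grid):
    """Exact assignment by exhaustive search (illustration only, tiny n)."""
    n = len(data)
    costs = [[cost(grid[i], data[j]) for j in range(n)] for i in range(n)]
    # best[i] is the index of the observation matched to grid[i]
    return min(
        itertools.permutations(range(n)),
        key=lambda p: sum(costs[i][p[i]] for i in range(n)),
    )


rng = random.Random(1)
grid = sample_uniform_sphere(6, rng)  # stand-in for the reference grid
data = sample_uniform_sphere(6, rng)  # stand-in for the observations
matching = optimal_matching(data, grid)
```

Each observation is matched to exactly one reference point, so the result is a permutation of the reference points whatever the distribution of the data. The map is only defined at the observations, which is why out-of-sample estimates call for a different approach.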

To this end, we propose a new algorithm for computing regularized spherical quantiles, and we illustrate the benefits of our estimator for data analysis.

2 Regularized estimation

Estimating the MK quantile function of a probability distribution $\nu$ requires solving an optimal transport (OT) problem between $\nu$ and a reference measure $\mu_{\mathbb{S}^2}$, the uniform probability distribution on $\mathbb{S}^2$. In the field of computational optimal transport [Cuturi and Peyré, 2019], it is well known that regularizing the OT problem by entropy (EOT) leads to faster algorithms. Consequently, we target EOT, both for smoothing and for computational purposes. For a regularization parameter $\varepsilon > 0$, the EOT problem between $\mu_{\mathbb{S}^2}$ and $\nu$ can be written as

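$$\max_{u \in L^{\infty}(\mathbb{S}^2)} \int_{\mathbb{S}^2} u(x)\, d\mu_{\mathbb{S}^2}(x) + \int_{\mathbb{S}^2} u^{c,\varepsilon}(y)\, d\nu(y), \tag{1}$$

where $u^{c,\varepsilon}$ is the smooth conjugate of $u$, defined by

$$u^{c,\varepsilon}(y) = -\varepsilon \log\left( \int_{\mathbb{S}^2} \exp\left( \frac{u(x) - c(x,y)}{\varepsilon} \right) d\mu_{\mathbb{S}^2}(x) \right). \tag{2}$$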
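To make the smooth conjugate in (2) concrete, here is a minimal Python sketch that approximates the integral by a Monte Carlo average over uniform draws on $\mathbb{S}^2$. The cost $c$ (half the squared geodesic distance), the sample size and the value of $\varepsilon$ are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math
import random


def sample_uniform_sphere(m, rng):
    """Draw m points uniformly on S^2 by normalizing Gaussian vectors."""
    points = []
    for _ in range(m):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(t * t for t in g))
        points.append(tuple(t / norm for t in g))
    return points


def cost(x, y):
    """Assumed cost: half the squared geodesic distance on S^2."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return 0.5 * math.acos(dot) ** 2


def smooth_conjugate(u, y, xs, eps):
    """Monte Carlo estimate of u^{c,eps}(y), with xs drawn from the uniform law."""
    exponents = [(u(x) - cost(x, y)) / eps for x in xs]
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(xs)
    return -eps * (shift + math.log(mean_exp))


rng = random.Random(0)
xs = sample_uniform_sphere(2000, rng)
north = (0.0, 0.0, 1.0)
value_zero = smooth_conjugate(lambda x: 0.0, north, xs, eps=0.1)
value_shifted = smooth_conjugate(lambda x: 1.5, north, xs, eps=0.1)
```

Adding a constant $k$ to $u$ subtracts $k$ from $u^{c,\varepsilon}$, so `value_shifted` equals `value_zero - 1.5` up to rounding. This is also why the objective in (1) is unchanged when a constant is added to $u$.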

Our proposal is to parameterize $u$ in (1) by its spherical harmonic coefficients and to perform stochastic optimization over these coefficients. From the resulting estimator $\widehat{\mathbf{u}}_{\varepsilon,n}$, we derive a regularized estimator of the MK quantile function $\mathbf{Q}_{\varepsilon}$. This yields, even empirically, smooth maps that are not constrained to take values in the set of observed data.
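The idea of optimizing over spherical harmonic coefficients can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithm of the paper: the expansion is truncated at degree 1 (three coefficients, with the constant term dropped since (1) is invariant to constants), the integrals against the uniform measure are replaced by a fixed Monte Carlo sample, the cost is half the squared geodesic distance, and the step sizes $\gamma_n = 0.5\, n^{-3/4}$ are an arbitrary choice. All function names are hypothetical.

```python
import math
import random


def normalize(v):
    norm = math.sqrt(sum(t * t for t in v))
    return tuple(t / norm for t in v)


def sample_uniform_sphere(m, rng):
    """Draw m points uniformly on S^2 by normalizing Gaussian vectors."""
    return [normalize([rng.gauss(0.0, 1.0) for _ in range(3)]) for _ in range(m)]


def sample_near_north(rng):
    """Toy target distribution nu, concentrated around the north pole."""
    return normalize((0.2 * rng.gauss(0, 1), 0.2 * rng.gauss(0, 1), 1 + 0.2 * rng.gauss(0, 1)))


def cost(x, y):
    """Assumed cost: half the squared geodesic distance on S^2."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return 0.5 * math.acos(dot) ** 2


def features(x):
    """Degree-1 real spherical harmonics, orthonormal for the uniform probability."""
    return [math.sqrt(3.0) * t for t in x]


def stochastic_gradient(theta, y, xs, feats, mean_feat, eps):
    """Gradient in theta of mean_x u(x) + u^{c,eps}(y) on the Monte Carlo sample xs."""
    exponents = [
        (sum(t * f for t, f in zip(theta, fx)) - cost(x, y)) / eps
        for x, fx in zip(xs, feats)
    ]
    shift = max(exponents)  # stabilizes the exponentials
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    gibbs_mean = [sum(w * fx[k] for w, fx in zip(weights, feats)) / total for k in range(3)]
    return [mean_feat[k] - gibbs_mean[k] for k in range(3)]


def fit_potential(sample_nu, xs, rng, eps=0.5, n_iter=300, step0=0.5):
    """Stochastic gradient ascent on the coefficients, one draw from nu per step."""
    feats = [features(x) for x in xs]
    mean_feat = [sum(fx[k] for fx in feats) / len(feats) for k in range(3)]
    theta = [0.0, 0.0, 0.0]
    for n in range(1, n_iter + 1):
        grad = stochastic_gradient(theta, sample_nu(rng), xs, feats, mean_feat, eps)
        step = step0 / n ** 0.75
        theta = [t + step * g for t, g in zip(theta, grad)]
    return theta


rng = random.Random(0)
xs = sample_uniform_sphere(400, rng)
theta_hat = fit_potential(sample_near_north, xs, rng)
```

In this toy setting one would expect the third coefficient to be negative, since the fitted potential should decrease toward the region where $\nu$ concentrates. A realistic implementation would use higher-degree harmonics and a more careful treatment of the integrals against $\mu_{\mathbb{S}^2}$.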

In addition, we define the directional MK depth, a companion concept for MK quantiles, following the Euclidean definitions of Chernozhukov et al. [2017]. We show that it enjoys desirable properties related to the Liu-Zuo-Serfling axioms for the statistical analysis of directional data. Building on our regularized estimators, we illustrate the benefits of our methodology for inference, from descriptive analysis to depth-based classification.

References

  • Chernozhukov et al. [2017] Victor Chernozhukov, Alfred Galichon, Marc Hallin, and Marc Henry. Monge–Kantorovich depth, quantiles, ranks and signs. The Annals of Statistics, 45(1):223–256, 2017. doi: 10.1214/16-AOS1450. URL https://doi.org/10.1214/16-AOS1450.
  • Cuturi and Peyré [2019] M. Cuturi and G. Peyré. Computational optimal transport. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.
  • Hallin [2023] Marc Hallin. Three applications of measure transportation in statistical inference. In Optimal Transport Statistics for Economics and Related Topics, pages 90–106. Springer, 2023.
  • Hallin et al. [2022] Marc Hallin, Hang Liu, and Thomas Verdebout. Nonparametric measure-transportation-based methods for directional data. arXiv, 2022.