Estimation of a multivariate von Mises distribution for contaminated torus data

G. Bertagnolli¹, L. Greco² and C. Agostinelli³

  • 1 Free University of Bozen-Bolzano, Italy [giulia.bertagnolli@unibz.it]
  • 2 University G. Fortunato, Benevento, Italy [l.greco@unifortunato.eu]
  • 3 University of Trento, Trento, Italy [claudio.agostinelli@unitn.i]

Keywords: Weighted likelihood – Circular data – ICORS – Robustness

1 Contamination in circular data

Observations of wind directions, arm movements and other angular measurements, as well as times of day on the 24-hour clock (with 0:00 corresponding to 0° and 24:00 to 360°), are periodic and can therefore be modelled as circular data; multivariate circular observations are points on the surface of a torus. The particular topology of the torus requires careful adaptation of classical linear statistical methods for the analysis of observations on its surface, as well as specific families of distributions to model them. The von Mises distribution is a well-known distribution for circular data. The $p$-variate density of the von Mises sine distribution is

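$$m(\boldsymbol{\theta};\boldsymbol{\mu},\boldsymbol{\kappa},\Lambda)=C^{-1}(\boldsymbol{\kappa},\Lambda)\exp\left[\boldsymbol{\kappa}^{\top}\cos(\boldsymbol{\theta}-\boldsymbol{\mu})-\frac{1}{2}\sin(\boldsymbol{\theta}-\boldsymbol{\mu})^{\top}\Lambda\sin(\boldsymbol{\theta}-\boldsymbol{\mu})\right] \tag{1}$$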

where $\boldsymbol{\theta}\in[0,2\pi)^p$ is a $p$-variate circular random variable, $\boldsymbol{\mu}\in[0,2\pi)^p$ is the location vector, $\boldsymbol{\kappa}=(k_1,\ldots,k_p)$, with $k_j>0$, is the concentration vector, and $\Lambda$ is a symmetric matrix with $\Lambda_{ij}=\lambda_{ij}=\lambda_{ji}$ and $\Lambda_{jj}=0$ for $i,j=1,\ldots,p$ [Mardia et al., 2008]. The normalising constant cannot be expressed in closed form for $p>2$, so we make the usual assumption that the concentration values $k_j$ are sufficiently large, and that the matrix $\Sigma$, defined by $\Sigma^{-1}_{ii}=k_i$ and $\Sigma^{-1}_{ij}=-\lambda_{ij}$ for $i\neq j$, is positive definite. In this case, the density in eq. (1) can be approximated by the concentrated multivariate sine distribution (CMS) of Mardia et al. [2012]

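$$m_c(\boldsymbol{\theta};\boldsymbol{\mu},\boldsymbol{\kappa},\Lambda)=(2\pi)^{-p/2}|\Sigma|^{-1/2}\exp\left[-\boldsymbol{\kappa}^{\top}\cos(\boldsymbol{\theta}-\boldsymbol{\mu})-\frac{1}{2}\sin(\boldsymbol{\theta}-\boldsymbol{\mu})^{\top}\Lambda\sin(\boldsymbol{\theta}-\boldsymbol{\mu})\right].$$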
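The positive-definiteness assumption on $\Sigma$ can be checked numerically before the CMS approximation is used. The following is a minimal sketch, not the authors' implementation: the function names `precision_matrix` and `is_positive_definite` and the parameter values are hypothetical, and the check simply attempts a Cholesky factorisation of $\Sigma^{-1}$, which is positive definite exactly when $\Sigma$ is.

```python
import math


def precision_matrix(kappa, lam):
    """Build Sigma^{-1}: k_i on the diagonal, -lambda_ij off the diagonal."""
    p = len(kappa)
    return [[kappa[i] if i == j else -lam[i][j] for j in range(p)]
            for i in range(p)]


def is_positive_definite(a):
    """Return True if the symmetric matrix `a` admits a Cholesky factorisation."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    # A non-positive pivot means the matrix is not positive definite.
                    return False
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


# Hypothetical bivariate examples (values chosen for illustration only).
kappa = [4.0, 6.0]
weak = [[0.0, 2.0], [2.0, 0.0]]    # lambda_12 = 2
strong = [[0.0, 5.0], [5.0, 0.0]]  # lambda_12 = 5

print(is_positive_definite(precision_matrix(kappa, weak)))
print(is_positive_definite(precision_matrix(kappa, strong)))
```

In the bivariate case the check reduces to $k_1 k_2 > \lambda_{12}^2$, so the first hypothetical configuration passes and the second does not.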
Figure 1: Sample of size $n=250$ from a bivariate von Mises distribution with $\boldsymbol{\mu}=(0,0)$, $\boldsymbol{\kappa}_0=(10,20)$ and $\lambda_0=15$ with the addition of 50 outliers. (Left) Data are displayed as points on $\mathbb{T}^2$. (Right) Fitted density contours obtained using approximate maximum likelihood estimates superimposed.

Circular data are not exempt from contamination, which can badly affect inference based on maximum likelihood estimation (MLE). Figure 1 shows, in the left panel, a sample of 250 observations from a bivariate von Mises distribution, contaminated with 50 outliers. The contours superimposed on the scatter plot in the right panel highlight how the estimates obtained from the contaminated sample differ from the original parameter values. The weighted likelihood methodology [Lindsay, 1994, Markatou et al., 1998] has proved both efficient and effective for robust estimation, and has recently been applied to univariate [Agostinelli, 2007] and multivariate [Greco et al., 2021, Saraceno et al., 2021, Agostinelli et al., 2024] circular data. However, a study of the p-variate von Mises distribution is still missing. In [Bertagnolli et al., 2024], we fill this gap.

1.1 Weighted likelihood estimating equations

Intuitively, robustness is achieved by down-weighting, in the score estimating equation, those observations that are unlikely under the assumed model. Let us assume $\boldsymbol{\mu}=\mathbf{0}$, and denote by $\boldsymbol{\tau}=(\boldsymbol{\kappa},\Lambda)$ the parameters of interest, by $m_c(\boldsymbol{\theta};\boldsymbol{\tau})$ the assumed CMS model, by $u(\boldsymbol{\theta};\boldsymbol{\tau})=\nabla_{\boldsymbol{\tau}}\log m_c(\boldsymbol{\theta};\boldsymbol{\tau})$ the corresponding score, by $\hat{F}_n$ the empirical distribution function of a random sample, and by $f^*$ a non-parametric kernel estimate of the true unknown density $f$ of $\boldsymbol{\theta}$. The agreement of the $i$-th observation with the assumed model is quantified through its Pearson residual:

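$$\delta(\boldsymbol{\theta}_i)=\delta(\boldsymbol{\theta}_i;\boldsymbol{\tau},\hat{F}_n)=\frac{f^*(\boldsymbol{\theta}_i)}{m_c^*(\boldsymbol{\theta}_i;\boldsymbol{\tau})}-1\in[-1,\infty). \tag{2}$$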

A value $\delta(\boldsymbol{\theta}_i)=0$ indicates perfect agreement with the model, while values away from zero label $\boldsymbol{\theta}_i$ either as an outlier, when the residual is large and positive, or as an inlier, when it is close to $-1$. To guarantee that the Pearson residuals converge to 0 with probability 1 when the model is correct, the model is also smoothed with the same kernel used for $f^*$, which gives $m_c^*$. The WL estimating equation (WLEE) then reads

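$$\sum_{i=1}^{n}w(\delta(\boldsymbol{\theta}_i);\boldsymbol{\tau},\hat{F}_n)\,u(\boldsymbol{\theta}_i;\boldsymbol{\tau})=0,\quad\text{with}\quad w(\delta)=\frac{A(\delta)+1}{\delta+1}, \tag{3}$$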

where A(δ) is a so-called residual adjustment function (RAF), whose aim is to down-weight observations with large or negative residuals, but which also provides a direct link with disparity minimisation and the robustness properties of the derived estimator [Lindsay, 1994]. We use a symmetric chi-square (SCHI) RAF.
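To make the link between Pearson residuals and weights concrete, the sketch below evaluates eq. (2) and the weight function of eq. (3) at a few points. It rests on assumptions not stated in the text: the symmetric chi-square RAF is taken in its usual form $A(\delta)=\delta(3\delta+4)/(\delta+2)^2$, for which the weight simplifies to $4(\delta+1)/(\delta+2)^2$; the function names and the density values are hypothetical, and the kernel estimate $f^*$ and smoothed model $m_c^*$ are supplied as plain numbers rather than computed.

```python
def pearson_residual(f_star, m_star):
    """Pearson residual of eq. (2): f*(theta) / m_c*(theta; tau) - 1."""
    if m_star <= 0.0:
        raise ValueError("smoothed model density must be positive")
    return f_star / m_star - 1.0


def schi_raf(delta):
    """Symmetric chi-square RAF (assumed form): delta (3 delta + 4) / (delta + 2)^2."""
    return delta * (3.0 * delta + 4.0) / (delta + 2.0) ** 2


def wl_weight(delta):
    """Weight of eq. (3): (A(delta) + 1) / (delta + 1)."""
    if delta <= -1.0:
        # At delta = -1 the ratio is 0/0; the limit of 4 (delta + 1) / (delta + 2)^2 is 0.
        return 0.0
    return (schi_raf(delta) + 1.0) / (delta + 1.0)


# Hypothetical (kernel estimate, smoothed model) density values at three points:
# agreement with the model, an inlier, and an outlier.
pairs = [(0.30, 0.30), (0.05, 0.10), (0.20, 0.002)]
residuals = [pearson_residual(f, m) for f, m in pairs]
weights = [wl_weight(d) for d in residuals]

for d, w in zip(residuals, weights):
    print(f"delta = {d:8.3f}   weight = {w:.4f}")
```

Under these assumptions the point in agreement with the model keeps weight 1, the inlier is mildly down-weighted (about 0.89), and the outlier receives a weight of about 0.04, so it contributes very little to the estimating equation.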

2 Results

Figure 2: 8TIM protein data. Left: fitted density contours from WLE. Right: fitted density contours from MLE. The point estimate is denoted by the symbol X.

The proposed technique proves satisfactory on both synthetic and real data examples. Figure 2 shows the results of estimation based on WL (left) and ML (right) for the well-known 8TIM protein bivariate data, consisting of n=490 pairs of backbone torsion angles (ϕ,ψ) [Chakraborty and Wong, 2021]. This dataset has multiple clusters, which are not detected by MLE. The WL methodology, instead, gives strong indications of the presence of such sub-structures, which can be unveiled by further inspection of the multiple roots of the WLEE.

References

  • Agostinelli [2007] C. Agostinelli. Robust estimation for circular data. Computational Statistics & Data Analysis, 51(12):5867–5875, 2007.
  • Agostinelli et al. [2024] C. Agostinelli, L. Greco, and G. Saraceno. Weighted likelihood methods for robust fitting of wrapped models for p-torus data. AStA Advances in Statistical Analysis, pages 1–36, 2024.
  • Bertagnolli et al. [2024] G. Bertagnolli, L. Greco, and C. Agostinelli. Estimation of a multivariate von Mises distribution for contaminated torus data, 2024.
  • Chakraborty and Wong [2021] S. Chakraborty and S. W. K. Wong. BAMBI: An R package for fitting bivariate angular mixture models. Journal of Statistical Software, 99(11):1–69, 2021.
  • Greco et al. [2021] L. Greco, G. Saraceno, and C. Agostinelli. Robust fitting of a wrapped normal model to multivariate circular data and outlier detection. Stats, 4(2):454–471, 2021.
  • Lindsay [1994] B. G. Lindsay. Efficiency versus robustness: The case for minimum Hellinger distance and related methods. The Annals of Statistics, 22:1081–1114, 1994. doi: 10.1214/aos/1176325512.
  • Mardia et al. [2008] K. V. Mardia, G. Hughes, C. C. Taylor, and H. Singh. A multivariate von Mises distribution with applications to bioinformatics. The Canadian Journal of Statistics, 1:99–109, 2008.
  • Mardia et al. [2012] K. V. Mardia, J. T. Kent, Z. Zhang, C. C. Taylor, and T. Hamelryck. Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. Journal of Applied Statistics, 39(11):2475–2492, 2012.
  • Markatou et al. [1998] M. Markatou, A. Basu, and B. G. Lindsay. Weighted likelihood equations with bootstrap root search. Journal of the American Statistical Association, 93(442):740–750, 1998. doi: 10.2307/2670124.
  • Saraceno et al. [2021] G. Saraceno, C. Agostinelli, and L. Greco. Robust estimation for multivariate wrapped models. Metron, 79(2):225–240, 2021.