Cellwise and Casewise Robust Covariance in High Dimensions
Section of Statistics and Data Science, Department of Mathematics, KU Leuven, Belgium; [fabio.centofanti@kuleuven.be, mia.hubert@kuleuven.be, peter@rousseeuw.net]
Keywords: High-dimensional data – Robust covariance estimation – Cellwise outliers – Regularization – Principal component analysis.
High-dimensional datasets often contain both cellwise and casewise outliers, which pose significant challenges to traditional covariance estimation methods. We introduce cellCov, a robust covariance estimator that handles both types of outliers while remaining computationally feasible for high-dimensional datasets. Our approach leverages the robustly estimated principal component subspace, obtained via the cellPCA method of Centofanti et al. [2024], to decompose the covariance structure, and applies regularization techniques to ensure a well-conditioned covariance matrix estimate. Our simulations demonstrate that cellCov outperforms state-of-the-art robust covariance estimators in contaminated and high-dimensional settings, offering a reliable alternative for multivariate statistical applications such as discriminant analysis, canonical correlation analysis, and clustering of variables.
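To make the idea of decomposing a covariance along a principal subspace and then regularizing it more concrete, the following is a minimal sketch and not the cellCov algorithm itself. It assumes classical (non-robust) PCA as a stand-in for cellPCA, a diagonal covariance for the residuals, and shrinkage toward a scaled identity as the regularizer. The function name `subspace_covariance` and the parameters `k` and `delta` are hypothetical and chosen only for illustration.

```python
import numpy as np

def subspace_covariance(X, k, delta=0.1):
    """Illustrative subspace-based covariance estimate (NOT cellCov).

    Assumptions: classical PCA replaces the robust cellPCA fit, the
    residual covariance is taken to be diagonal, and regularization is
    a simple shrinkage toward a scaled identity matrix.
    """
    n, p = X.shape
    center = X.mean(axis=0)
    Xc = X - center

    # Loadings of the k-dimensional principal subspace via SVD.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                    # p x k, orthonormal columns

    scores = Xc @ P                 # n x k coordinates in the subspace
    residuals = Xc - scores @ P.T   # n x p part outside the subspace

    # Covariance inside the subspace plus a diagonal residual part.
    cov_scores = np.cov(scores, rowvar=False).reshape(k, k)
    cov_resid = np.diag(residuals.var(axis=0, ddof=1))
    sigma = P @ cov_scores @ P.T + cov_resid

    # Shrink toward a scaled identity so all eigenvalues are positive.
    target = np.trace(sigma) / p * np.eye(p)
    return (1 - delta) * sigma + delta * target

# Example with more variables than cases (n = 20, p = 50).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
S = subspace_covariance(X, k=2)
```

Even with fewer cases than variables, the shrinkage step in this sketch keeps the smallest eigenvalue strictly positive, so the estimate can be inverted. A robust version would replace the mean, the SVD, and the variance computations with estimators that downweight outlying cells and cases.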
References
- Centofanti et al. [2024] F. Centofanti, M. Hubert, and P. J. Rousseeuw. Robust principal components by casewise and cellwise weighting. arXiv preprint arXiv:2408.13596, 2024.