Robust Statistics: Theory and Computation

Ispra (Varese), 15-17 May 2025

Aim

The School on Robust Statistics, which precedes the International Conference on Robust Statistics (ICORS 2025), is an educational event designed for students, early-career researchers, and experienced professionals seeking to deepen their expertise in robust statistics. The school offers an opportunity to learn from authoritative statisticians who are shaping the future of data science. The program combines theoretical insights with practical applications, giving participants a comprehensive understanding of robust methods and their relevance in modern data science.

The school will cover a broad range of topics central to robust statistics, starting with foundational principles and concepts that underpin robustness, such as breakdown points, influence functions, and the asymptotic properties of robust estimation techniques. These foundational elements are crucial for understanding why robust methods are indispensable in real data analysis, particularly in scenarios where standard approaches fail.

Building on these fundamentals, the program will delve into advanced robustness topics in regression, multivariate methods, cellwise outlier detection, divergence measures and high-dimensional data. Computational aspects will also play a significant role, with dedicated sessions on robust algorithms and their implementation in modern statistical software. Participants will gain hands-on experience with cutting-edge tools and frameworks, equipping them to apply robust methods effectively in diverse contexts. Finally, students will have the opportunity to discuss emerging areas in robust statistics with the lecturers, including the importance of robust methods in the era of big data and artificial intelligence.

Whether you are a student just beginning your journey into robust statistics or an experienced researcher looking to stay at the forefront of the field, this school provides an opportunity to learn, connect, and grow in an intellectually stimulating environment.

Lectures will be delivered in English.

Where

Joint Research Centre of the European Commission

Varese Convegni [on Saturday 17 May]

  • Via Don Guanella 43
  • 21027 Barza, Ispra, Italy

Participation

Lectures

Rik Lopuhaä

(Department of Mathematics, TU Delft)
Personal website

Multivariate Robust Estimation

In this series of lectures I will discuss popular methods for robust estimation in multivariate statistical models from a theoretical point of view. The main focus is on asymptotic properties of S-estimators for multivariate location and scatter and some of their improvements, such as MM-estimators and tau-estimators. We will introduce these methods, explain their relation with M-estimators, and discuss their robustness and asymptotic properties. The emphasis will be on explaining the mathematical techniques that can be used to derive these properties.

Furthermore, we will embed the multivariate location-scale model in a more general linear model, which also contains other multivariate statistical models of interest, such as multiple and multivariate linear regression and linear mixed effects models. We will explain how this leads to a unified approach for deriving theoretical properties of S-estimators and their improvements in the different multivariate statistical models mentioned above.

References

  • Maronna, R.A., Martin, R.D., Yohai, V.J., Salibian-Barrera, M. (2019). Robust Statistics: Theory and Methods (with R). Wiley & Sons.
  • Heritier, S., Cantoni, E., Copt, S. and Victoria-Feser, M.-P. (2009). Robust methods in biostatistics. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd., Chichester.
  • Lopuhaä, H. P. (1989). On the relation between S-estimators and M-estimators of multivariate location and covariance. Annals of Statistics 17 1662-1683.
  • Lopuhaä, H. P. (1991). Multivariate tau-estimators for location and scatter. Canadian Journal of Statistics 19 307-321.
  • Lopuhaä, H. P. (1992). Highly efficient estimators of multivariate location with high breakdown point. Annals of Statistics 398-413.
  • Lopuhaä, H. P. (2022). Highly efficient estimators with high breakdown point for linear models with structured covariance matrices. To appear in Econometrics and Statistics.
  • Lopuhaä, H.P., Gares, V. and Ruiz-Gazen A. (2023). S-estimation in linear models with structured covariance matrices. Annals of Statistics, 51(6):2415-2439.
   

Abhik Ghosh

(Indian Statistical Institute)
Personal website

The Minimum Divergence Approach to Robust Inference

This lecture will focus on the principles of minimum divergence estimation and testing procedures for robust statistical inference under possible data contamination. We will primarily discuss the popular density power divergence in detail and also briefly touch upon other related divergence measures that have been successfully used for robust estimation and for testing statistical hypotheses based on independent and identically distributed data. Asymptotic properties of the minimum density power divergence estimator will be derived, along with theoretical justification of its robustness through influence function analysis. Robust versions of the classical likelihood-ratio, Wald and Rao tests, constructed using the density power divergence, will be discussed with examples. Finally, extensions of minimum density power divergence estimation to independent non-homogeneous set-ups will be discussed, with applications to different regression models. If time permits, we will briefly touch upon penalized minimum divergence estimation procedures for high-dimensional regression models.
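As a small illustration of the idea, and not material taken from the lecture itself, consider estimating the mean of a normal model with known standard deviation. In that special case the minimum density power divergence estimating equation reduces to sum_i (x_i - mu) * exp(-alpha * (x_i - mu)^2 / (2 * sigma^2)) = 0, which can be solved by iterating a weighted mean. The sketch below assumes this setting; the function name `mdpde_location`, the tuning value alpha = 0.5, and the data are hypothetical choices made for the example.

```python
import math
import statistics


def mdpde_location(data, alpha=0.5, sigma=1.0, tol=1e-10, max_iter=500):
    """Minimum density power divergence estimate of a normal mean.

    Illustrative sketch only: sigma is assumed known, and the estimating
    equation is solved by a fixed-point iteration on a weighted mean.
    """
    mu = statistics.median(data)  # robust starting value
    for _ in range(max_iter):
        # Observations far from the current fit get exponentially small weight.
        weights = [math.exp(-alpha * (x - mu) ** 2 / (2 * sigma ** 2)) for x in data]
        new_mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Made-up sample: seven values centred at 0 plus two gross outliers.
sample = [-1.2, -0.7, -0.3, 0.0, 0.2, 0.6, 1.4, 10.0, 12.0]
sample_mean = statistics.mean(sample)
robust_mean = mdpde_location(sample, alpha=0.5)
print(f"sample mean: {sample_mean:.3f}, minimum DPD estimate: {robust_mean:.3f}")
```

The sample mean is pulled towards the two outliers, while the weighted fit stays close to the bulk of the data. Larger values of alpha downweight outliers more strongly at the cost of some efficiency under the model, and as alpha tends to 0 the weights tend to 1 and the estimate approaches the sample mean.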

References

  • Basu, A., Harris, I. R., Hjort, N. L., & Jones, M. C. (1998). Robust and efficient estimation by minimising a density power divergence. Biometrika, 85(3), 549-559.
  • Basu, A., Shioya, H., & Park, C. (2011). Statistical inference: the minimum distance approach. CRC press.
  • Ghosh, A., & Basu, A. (2013). Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electronic Journal of Statistics, 7, 2420-2456.
  • Basu, A., Chakraborty, S., Ghosh, A., & Pardo, L. (2022). Robust density power divergence based tests in multivariate analysis: A comparative overview of different approaches. Journal of Multivariate Analysis, 188, 104846.
   

Peter Rousseeuw

(Statistics and Data Science, KU Leuven)
Website

Cellwise outliers and how to handle them

Multivariate data are typically represented by a rectangular matrix (table) in which the rows are the cases (objects) and the columns are the variables (measurements). It is well-known that real data may contain outliers, which are pieces of data that behave differently from the overall pattern. Depending on the situation, outliers may be undesirable errors which can adversely affect the data analysis, or valuable nuggets of unexpected information. In statistics and data analysis the word outlier usually refers to a case, that is, a row of the data matrix. But in recent years also cellwise outliers are receiving attention. These are suspicious cells (entries) that can occur anywhere in the data matrix, and need not show up in the marginals. Even a relatively small proportion of outlying cells can contaminate over half the rows, which may cause casewise robust methods to break down. Therefore, other approaches are being developed that can deal with outlying cells.
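The claim that a small proportion of outlying cells can contaminate most rows can be checked with a short calculation. The sketch below assumes, purely for illustration, that each cell is contaminated independently with the same probability eps; a row with d variables is then free of outlying cells with probability (1 - eps)^d. The function names are hypothetical.

```python
def contaminated_row_fraction(eps, d):
    """Expected fraction of rows with at least one outlying cell,
    assuming cells are contaminated independently with probability eps."""
    return 1.0 - (1.0 - eps) ** d


def smallest_dimension(eps, threshold=0.5):
    """Smallest number of variables d for which the expected fraction of
    contaminated rows exceeds the threshold."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    d = 1
    while contaminated_row_fraction(eps, d) <= threshold:
        d += 1
    return d


for eps in (0.01, 0.05, 0.10):
    d = smallest_dimension(eps)
    frac = contaminated_row_fraction(eps, d)
    print(f"eps = {eps:.2f}: d = {d} variables give {frac:.1%} contaminated rows")
```

Under this independence assumption, 5% outlying cells already affect more than half of the rows once the data have 14 variables, which is beyond what a casewise method with a 50% breakdown point can tolerate.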

We will first describe the cellwise paradigm and address the detection of outlying cells, in combination with some practical data preprocessing techniques. Then we’ll look at some novel estimators designed to handle cellwise outliers, which can also deal with casewise outliers and missing values. One of these is a cellwise robust version of the minimum covariance determinant estimator for multivariate location and covariance matrices. In connection with this we will discuss some properties cellwise robust methods can have, and explain why some types of equivariance properties need to be abandoned. We then move on to high-dimensional data, and describe a cellwise robust principal component approach. Throughout we will provide examples and introduce implementations that are available in R and in Python.

Marco Riani

(Department of Economics and Management, University of Parma)
Personal website

Valentin Todorov

(United Nations Industrial Development Organization: Vienna, AT)

Applied Robust Statistics through the Monitoring Approach

In this course we illustrate the contents of the new open access Springer Verlag book “Applied Robust Statistics through the Monitoring Approach: applications in regression”. After introducing robust estimation for a simple sample and contrasting analyses from the traditional robust approach with the monitoring approach, the course addresses traditional robust estimators and the monitoring approach in multiple regression. It also provides a theoretical and practical comparison of different estimators on real and simulated data. Furthermore, the course presents robust transformation of the response in a regression model, non-parametric regression and several extensions of the robust multiple regression model, including Bayesian, heteroskedastic, time series and compositional regression, together with the clustering of regression models. Finally, it investigates several approaches to model selection and shows robust analyses of regression data that illustrate the use of techniques introduced earlier in the book. The course will appeal both to professional statisticians and researchers concerned with insightful data analysis, as well as to postgraduate students. It can also serve as material for a modern interactive robust regression course. Computer code in MATLAB and R is provided for all examples and exercises that are shown during the course.

References

  • Atkinson, A. C., Riani, M., Corbellini, A., Perrotta, D. and Todorov, V. (2025). Applied Robust Statistics through the Monitoring Approach: Applications in Regression. Heidelberg: Springer Nature.
   

Accommodation

We have pre-booked 40 rooms for 3 nights (IN 14/5 – OUT 17/5) at:

  • Hotel Villa Borghi
  • Piazza Borghi, 1
  • 21020 Varano Borghi (VA)
  • Phone: +39 0332 961515
  • info@hotelvillaborghi.it
  • https://www.hotelvillaborghi.it/en
  • Room rate: €77 breakfast included (extras such as minibar and other meals will be paid at check-out).
  • To book, you have to contact the hotel directly mentioning the code ICORS2025 and send this form with your credit card details.
  • The hotel will ask for a pre-authorisation for the amount of 1 night to check the validity of the credit card (no charges on your account).
  • CANCELLATION POLICY: All changes and/or cancellations have to be communicated to the hotel 15 days before the arrival day. Cancellations sent between 15 and 7 days before arrival are subject to a fee equal to 50% of the total amount due. Late cancellations (6 days before arrival) will trigger a penalty equal to 100% of the amount due.

Transport School 15-17 May 2025

  • All transport is organised by the Joint Research Centre.
  • Participants are kindly asked to communicate their detailed travel information using the form at the link below by 18 April 2025:
  • ec.europa.eu/eusurvey/runner/ICORS2025Transport
  • There will be a shuttle bus from Milan Malpensa Terminal 1 to Hotel Villa Borghi on May 14 at 18:00.
  • The JRC is also providing:
    • A shuttle bus from/to the hotel during the school, and transfer to Stresa in the afternoon of May 17.
    • Timetable will be communicated in due time.

Contacts

For any questions about logistics, you are welcome to contact Lorena Marcaletti
JRC-T5-EVENTS@ec.europa.eu
Phone: +393485457379