Robust marginal functional PCA for repeated functional measurements
Institute of Statistics and Mathematical Methods in Economics, TU Wien, Vienna, Austria [Peter.Filzmoser@tuwien.ac.at]
Chair of Statistics, School of Business and Economics, Humboldt-Universität zu Berlin
Keywords: Dependent Functional Data – Second-Generation Functional Data – Functional Outliers
We consider so-called second-generation functional data [Koner and Staicu, 2023], where functional data are sampled under assumptions that deviate from independence. More specifically, we consider functions sampled in longitudinal or repeated measurement designs. This form of functional data can be interpreted as realizations of a bivariate stochastic process $X(s,t)$, $s \in \mathcal{S}$, $t \in \mathcal{T}$. We assume the process has a mean function $\mu(s,t) = E[X(s,t)]$ and covariance function
$$C((s,t),(u,v)) = \mathrm{Cov}\big(X(s,t), X(u,v)\big).$$
One can think of the domain $\mathcal{S}$ of the first argument $s$ as the spatial domain, and of the domain $\mathcal{T}$ of the second argument $t$ as the time domain. The goal could then be to represent the functional data in a lower-dimensional space. However, instead of analyzing two-dimensional surfaces, it is simpler to separate the analysis of the dynamics in $s$ and $t$, as suggested in Park and Staicu [2015] and Chen et al. [2017], and to consider a decomposition of the form
$$X(s,t) = \mu(s,t) + \sum_{j=1}^{\infty} \xi_j(t)\,\phi_j(s).$$
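The basis functions $\phi_j(s)$ capture the dynamics in the first argument $s$ and are obtained from an eigendecomposition of the marginal covariance function, which integrates the covariance function $C$ of the process over the time domain $\mathcal{T}$:
$$G_{\mathcal{S}}(s,u) = \int_{\mathcal{T}} C((s,t),(u,t))\,dt. \tag{1}$$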
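As a rough numerical illustration of the marginal covariance and its eigendecomposition, the following sketch works on densely and regularly sampled toy data. It is the classical (non-robust) version and makes several assumptions that are not part of the abstract: a common equidistant grid in both arguments, a time domain of length one so that averaging over time points approximates the integral in (1), and an unsmoothed pointwise mean. The function name `marginal_fpca` and all simulated quantities are hypothetical.

```python
import numpy as np

def marginal_fpca(X, s_grid, n_comp):
    """Marginal FPCA for X of shape (n, T, S): n subjects, T times, S grid points in s."""
    n, T, S = X.shape
    ds = s_grid[1] - s_grid[0]                # equidistant grid in s (assumption)
    mu = X.mean(axis=0)                       # unsmoothed pointwise mean, shape (T, S)
    Xc = (X - mu).reshape(n * T, S)           # pool centred curves over subjects and times
    G = Xc.T @ Xc / (n * T)                   # average over t approximates the integral in (1)
    evals, evecs = np.linalg.eigh(G * ds)     # discretised covariance operator
    order = np.argsort(evals)[::-1][:n_comp]  # largest eigenvalues first
    lam = evals[order]
    phi = evecs[:, order].T / np.sqrt(ds)     # rows: eigenfunctions with unit L2 norm
    return mu, lam, phi

# Toy data (hypothetical): two components in s with time-varying scores
rng = np.random.default_rng(0)
n, T, S = 200, 20, 50
s_grid = np.linspace(0, 1, S)
t_grid = np.linspace(0, 1, T)
phi_true = np.sqrt(2) * np.vstack([np.sin(2 * np.pi * s_grid),
                                   np.cos(2 * np.pi * s_grid)])      # shape (2, S)
z = rng.normal(size=(n, 2))
xi = np.stack([2.0 * z[:, [0]] * np.cos(np.pi * t_grid),
               1.0 * z[:, [1]] * np.sin(np.pi * t_grid)], axis=2)    # shape (n, T, 2)
X = xi @ phi_true + 0.1 * rng.normal(size=(n, T, S))                 # shape (n, T, S)

mu_hat, lam_hat, phi_hat = marginal_fpca(X, s_grid, n_comp=2)
```

The estimated eigenfunctions are only identified up to sign, so any comparison with a reference function should use the absolute value of the inner product.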
The resulting eigenpairs $(\lambda_j, \phi_j)$, $j \ge 1$, are referred to as marginal eigenvalues and functional principal components. They are optimal in the sense that they minimize the mean squared reconstruction error averaged over the time domain. Thus, the time dynamics are captured by the functions $\xi_j(t)$, formed by the scores on the corresponding components:
$$\xi_j(t) = \int_{\mathcal{S}} \{X(s,t) - \mu(s,t)\}\,\phi_j(s)\,ds.$$
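The score functions $\xi_j(t)$ are random functions and can be treated as functional data as well. We denote their covariance functions by $\Theta_j(t,v) = \mathrm{Cov}(\xi_j(t), \xi_j(v))$, with eigenpairs $(\eta_{jk}, \psi_{jk})$, $k \ge 1$, for each $j$. Thus, the score functions admit a Karhunen-Loève expansion
$$\xi_j(t) = \sum_{k=1}^{\infty} \zeta_{jk}\,\psi_{jk}(t), \qquad \zeta_{jk} = \int_{\mathcal{T}} \xi_j(t)\,\psi_{jk}(t)\,dt.$$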
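The second-level expansion of a single score function can be sketched in the same way. The example below assumes that the score functions of one component are already available on a common equidistant time grid; the simulated score functions, the two basis functions and all variable names are hypothetical and serve only to show that projecting onto the eigenfunctions of the empirical covariance reproduces the centred score functions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 300, 40
t_grid = np.linspace(0, 1, T)
dt = t_grid[1] - t_grid[0]                      # equidistant grid in t (assumption)

# Hypothetical score functions of one component, built from two basis functions in t
psi_true = np.vstack([np.ones(T), np.sqrt(3) * (2 * t_grid - 1)])   # shape (2, T)
z = rng.normal(size=(n, 2)) * np.array([2.0, 0.7])
xi = z @ psi_true                               # shape (n, T), one row per subject

xi_c = xi - xi.mean(axis=0)                     # centre across subjects
Theta = xi_c.T @ xi_c / n                       # empirical covariance of the score function
evals, evecs = np.linalg.eigh(Theta * dt)       # discretised covariance operator
order = np.argsort(evals)[::-1][:2]             # keep the two leading eigenpairs
eta_hat = evals[order]
psi_hat = evecs[:, order].T / np.sqrt(dt)       # rows: eigenfunctions with unit L2 norm

zeta_hat = (xi_c @ psi_hat.T) * dt              # second-level scores, shape (n, 2)
xi_rec = zeta_hat @ psi_hat                     # truncated Karhunen-Loeve reconstruction
```

Because the toy score functions lie in a two-dimensional space, two terms reconstruct them up to rounding error; with real data the truncation level would have to be chosen, for example by the fraction of variance explained.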
The model estimation is performed in three steps:
- Step 1: Estimation of the mean function $\mu(s,t)$ by employing a bivariate smoothing algorithm.
- Step 2: Estimation of the marginal covariance function (1) and the eigenpairs $(\lambda_j, \phi_j)$.
- Step 3: Estimation of the scores $\xi_j(t)$ at the observed time points, followed by smoothing, to obtain estimates of the score functions at all points $t \in \mathcal{T}$.
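The smoothing parts of Steps 1 and 3 can be illustrated with a minimal sketch. The abstract does not name a particular smoother, so a simple Nadaraya-Watson smoother with a Gaussian kernel is used here as a stand-in. The bandwidths, the common visit times and the simulated data are assumptions, and the eigenfunction `phi1` is taken as known in place of an actual Step 2 estimate.

```python
import numpy as np

def nw_weights(x_obs, x_eval, h):
    """Nadaraya-Watson weights with a Gaussian kernel; each row sums to one."""
    d = (x_eval[:, None] - x_obs[None, :]) / h
    W = np.exp(-0.5 * d**2)
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
n, T, S = 100, 15, 50
s_grid = np.linspace(0, 1, S)
t_obs = np.linspace(0, 1, T)                     # common visit times (assumption)
t_fine = np.linspace(0, 1, 101)                  # grid on which score functions are wanted
ds = s_grid[1] - s_grid[0]

phi1 = np.sqrt(2) * np.sin(2 * np.pi * s_grid)   # stands in for a Step 2 estimate
mu_true = np.outer(1 + t_obs, np.cos(np.pi * s_grid))          # shape (T, S)
z = rng.normal(size=(n, 1))
xi_true = z * np.cos(np.pi * t_obs)                            # shape (n, T)
X = mu_true + xi_true[:, :, None] * phi1 + 0.2 * rng.normal(size=(n, T, S))

# Step 1: bivariate (tensor-product) smoothing of the pointwise average surface
W_t = nw_weights(t_obs, t_obs, h=0.1)
W_s = nw_weights(s_grid, s_grid, h=0.05)
mu_hat = W_t @ X.mean(axis=0) @ W_s.T            # shape (T, S)

# Step 3: raw scores at the visit times, then smoothing in t onto the fine grid
raw_scores = ((X - mu_hat) @ phi1) * ds          # shape (n, T)
W_fine = nw_weights(t_obs, t_fine, h=0.1)
xi_hat = raw_scores @ W_fine.T                   # shape (n, 101)
```

With irregular or subject-specific visit times, the weight matrix for Step 3 would have to be built per subject; the common grid here only keeps the sketch short.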
To achieve robust estimates, we propose to replace the estimators of these components suggested in Park and Staicu [2015] and Chen et al. [2017] by robust counterparts. Based on the robust estimates, outlier diagnostic tools are introduced that allow outlying observations to be identified. The proposed estimation procedure is outlined in detail in the presentation, and simulation results and real data examples illustrate the usefulness of this procedure.
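The abstract does not state which robust estimators or diagnostics are used. Purely as an illustration of what a robust counterpart could look like, the sketch below swaps in two generic techniques that are not claimed to be the authors' method: a pointwise median for centring and a spatial sign covariance (spherical PCA) in place of the classical marginal covariance. Outlying subjects are then flagged by an orthogonal distance with a median/MAD cutoff. The data, the contamination and the cutoff constant are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, S = 100, 10, 40
s_grid = np.linspace(0, 1, S)
ds = s_grid[1] - s_grid[0]
phi1 = np.sqrt(2) * np.sin(2 * np.pi * s_grid)      # true component of the clean data
bump = np.sqrt(2) * np.cos(2 * np.pi * s_grid)      # shape of the contamination

z = rng.normal(size=(n, 1, 1))
X = z * phi1 + 0.1 * rng.normal(size=(n, T, S))     # clean data, shape (n, T, S)
X[:5] += 8.0 * bump                                 # first five subjects are outlying

def leading_eigfun(G):
    """Leading eigenvector of a symmetric matrix, scaled to unit L2 norm on s_grid."""
    evals, evecs = np.linalg.eigh(G)                # eigenvalues in ascending order
    return evecs[:, -1] / np.sqrt(ds)

med = np.median(X, axis=0)                          # pointwise median surface, (T, S)
Xc = X - med
curves = Xc.reshape(n * T, S)                       # pool curves over subjects and times

# Classical marginal covariance versus spatial sign covariance
G_classical = curves.T @ curves / (n * T)
norms = np.linalg.norm(curves, axis=1, keepdims=True)
signs = curves / np.where(norms > 0, norms, 1.0)    # project curves onto the unit sphere
G_sign = signs.T @ signs / (n * T)

phi_classical = leading_eigfun(G_classical)
phi_robust = leading_eigfun(G_sign)

# Orthogonal distance: size of the residual after projecting on phi_robust
scores = (Xc @ phi_robust) * ds                     # shape (n, T)
resid = Xc - scores[:, :, None] * phi_robust
od = np.sqrt((resid**2).sum(axis=2).mean(axis=1) * ds)   # one value per subject
center = np.median(od)
mad = 1.4826 * np.median(np.abs(od - center))
flagged = od > center + 3.0 * mad                   # boolean outlier indicator
```

In this toy setting the classical leading eigenfunction is pulled toward the contamination, whereas the sign-based one stays close to `phi1`, which is the kind of behaviour a robust replacement is meant to deliver.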
References
- Chen et al. [2017] K. Chen, P. Delicado, and H.-G. Müller. Modelling function-valued stochastic processes, with applications to fertility dynamics. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(1):177–196, 2017.
- Koner and Staicu [2023] S. Koner and A.-M. Staicu. Second-generation functional data. Annual Review of Statistics and Its Application, 10(1):547–572, 2023.
- Park and Staicu [2015] S.Y. Park and A.-M. Staicu. Longitudinal functional data analysis. Stat, 4(1):212–226, 2015.