Quantile Integrated Depth and Applications
-
Department of Public Health and Health Sciences, Northeastern University, Boston, USA [s.lopez-pintado@northeastern.edu]
-
Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, San Diego, USA [maluo@ucsd.edu]
-
Department of Probability and Mathematical Statistics, Charles University, Prague, Czech Republic [nagy@karlin.mff.cuni.cz]
-
Department of Biostatistics, Columbia University, New York, USA [todd.ogden@columbia.edu]
1 Motivation and Summary
Functional data analysis involves data for which the basic unit of observation is a function or image. Robust exploratory tools and inferential methods are much needed, since few assumptions can be made about the generating process. Data depth, a well-known nonparametric tool for analyzing functional data, provides a rigorous method for ranking a sample of curves from the center outwards, allowing for robust inference and outlier detection. Several notions of depth for functional data have been introduced in the last few decades (e.g., Fraiman and Muniz, 2001; Lopez-Pintado and Romo, 2009; Narisetty and Nair, 2016, among others). Here, we develop a new family of depths, termed quantile integrated depth (QID), that are based on integrating up to the $\beta$-th quantile of the univariate depths. We show that this new family of depths satisfies all the desirable properties established in Gijbels and Nagy (2017), including a type of invariance, maximality at the center, and monotonicity with respect to the deepest point. QID generalizes the well-known integral and infimal depths and solves some of the drawbacks of these types of depths. In particular, since functional data are commonly observed with noise, we explore the effect of noise on different notions of depth. A visualization tool called the Spearman agreement depth (SAD) plot is introduced. The SAD plot compares the depth values of corresponding functional observations between two versions of a dataset: the original data and the same data with additive noise. Compared to alternatives, the proposed QID is shown to be robust and to perform well with noisy functional data. We also illustrate the advantages of using QID as a function of $\beta$ to identify potential hard-to-detect shape outliers.
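The comparison behind the SAD plot can be illustrated with a small sketch. The summary above does not say how the plot itself is constructed, so the code below only computes the kind of agreement statistic the name suggests: a Spearman rank correlation between the depths of the original curves and the depths of their noisy versions. The Fraiman and Muniz type integrated depth stands in for any functional depth, and the names `integrated_depth`, `ranks` and `spearman`, the simulated curves, and the noise level are illustrative assumptions, not taken from the paper.

```python
import math
import random


def integrated_depth(curves):
    """Integrated depth of each curve: grid average of 1 - |1/2 - F_n(x(t))|.

    curves: list of n curves, each a list of m values on a common grid.
    (Assumed stand-in depth; any functional depth could be used instead.)
    """
    n, m = len(curves), len(curves[0])
    depths = []
    for xi in curves:
        total = 0.0
        for t in range(m):
            # empirical distribution function of the sample at grid point t
            ecdf = sum(1 for xj in curves if xj[t] <= xi[t]) / n
            total += 1.0 - abs(0.5 - ecdf)
        depths.append(total / m)
    return depths


def ranks(values):
    """Ranks 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)


# Simulated example (assumption): vertically shifted sine curves plus additive noise.
rng = random.Random(0)
grid = [j / 29 for j in range(30)]
shifts = [rng.gauss(0.0, 1.0) for _ in range(20)]
clean = [[a + math.sin(2 * math.pi * t) for t in grid] for a in shifts]
noisy = [[v + rng.gauss(0.0, 0.2) for v in curve] for curve in clean]

# Agreement between the depth rankings of the clean and the noisy data.
agreement = spearman(integrated_depth(clean), integrated_depth(noisy))
```

A value of `agreement` close to 1 means that the noise hardly changes the center-outward ordering induced by the depth. Repeating the computation for several depths or noise levels gives the quantities that a plot of this kind could display.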
2 Quantile Integrated Depth
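Let $\mathcal{F}$ be a space of functions $x \colon T \to \mathbb{R}$ for $T \subset \mathbb{R}^d$ a set of positive finite Lebesgue measure, and $d \geq 1$. An example is the Banach space $\mathcal{C}([0,1])$ of continuous functions on $T = [0,1]$ equipped with the supremum distance. Let $P$ be a Borel probability measure on the space of functions $\mathcal{F}$. For $t \in T$ we write $P_t$ for the marginal distribution of the random variable $X(t)$, with $X \sim P$.
Suppose that a univariate depth $D$ with values in $[0,1]$ is given. For a function $x \in \mathcal{F}$ and a random function $X$ with probability distribution $P$, to each $t \in T$ we attach the depth of the functional value $x(t)$ with respect to the corresponding marginal distribution $P_t$. We obtain a mapping
$$D_x \colon T \to [0,1], \qquad t \mapsto D(x(t); P_t).$$
Consider now $T$ with its Borel subsets and the Lebesgue measure $\lambda$ on $T$ as a probability space. Without loss of generality, we may suppose that $\lambda(T) = 1$; otherwise we just consider a properly normalized Lebesgue measure instead of $\lambda$. The map $D_x$ induces a pushforward Borel probability measure on $[0,1]$. We denote that measure by $\mu_x$, and its distribution function by
$$F_x(u) = \mu_x([0,u]) = \lambda(\{t \in T : D_x(t) \leq u\}), \qquad u \in [0,1].$$
This measures the proportion of time that the pointwise depth of $x$ is at most $u$. We also write
$$Q_x(\alpha) = \inf\{u \in [0,1] : F_x(u) \geq \alpha\}, \qquad \alpha \in (0,1],$$
for the corresponding quantile function. For a given $\beta \in (0,1]$, the quantile integrated depth of $x$ w.r.t. $P$ is defined as
$$\mathrm{QID}_\beta(x; P) = \frac{1}{\beta} \int_0^\beta Q_x(\alpha)\, d\alpha. \tag{1}$$
Intuitively, $\mathrm{QID}_\beta(x; P)$ measures the normalized integral of the smallest pointwise depths of the function $x$, namely those that lie below the $\beta$-th quantile of $\mu_x$. For $\beta = 1$ it equals the integral of $D_x$ over $T$, which is an integrated depth; as $\beta \to 0$ it approaches the essential infimum of $D_x$, which corresponds to an infimal depth. The distinctive feature of the quantile integrated depth (QID) is the attention it gives to the shape of the lower (left) tail of the pointwise depth distribution, which makes QID suitable for identifying possible functional shape outliers. It can also be shown that QID satisfies desirable theoretical properties and is robust in the presence of noise.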
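A minimal sketch of how such a depth could be computed for a sample of curves observed on a common, equally spaced grid is given next. It rests on stated assumptions: the quantile integrated depth at level $\beta \in (0,1]$ is taken to be the quantile function of the pointwise depths integrated from 0 to $\beta$ and divided by $\beta$ (the average of the lowest $\beta$-fraction of pointwise depths), the Lebesgue measure is replaced by equal weights on the grid points, and the empirical univariate halfspace depth is used as the pointwise depth. The function names `pointwise_depths` and `qid` and the toy sample are hypothetical and not taken from the paper.

```python
import math


def pointwise_depths(curves):
    """Empirical univariate halfspace depth of x_i(t) in the sample, per grid point.

    curves: list of n curves, each a list of m values on a common grid.
    Returns an n-by-m nested list with entries
    min(#{j: x_j(t) <= x_i(t)}, #{j: x_j(t) >= x_i(t)}) / n.
    """
    n, m = len(curves), len(curves[0])
    out = []
    for xi in curves:
        row = []
        for t in range(m):
            below = sum(1 for xj in curves if xj[t] <= xi[t])
            above = sum(1 for xj in curves if xj[t] >= xi[t])
            row.append(min(below, above) / n)
        out.append(row)
    return out


def qid(depths, beta):
    """Average of the lowest beta-fraction of one curve's pointwise depths.

    With m equally weighted grid points, the quantile function is a step
    function taking the k-th smallest depth on the cell ((k-1)/m, k/m].
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    d = sorted(depths)
    m = len(d)
    total = 0.0
    for k, value in enumerate(d):
        # length of the overlap of the cell (k/m, (k+1)/m] with (0, beta]
        weight = min(max(beta - k / m, 0.0), 1.0 / m)
        total += weight * value
    return total / beta


# Toy sample (assumption): four vertically shifted sine curves and one curve
# that follows the central path except for a short spike.
grid = [j / 49 for j in range(50)]
sample = [[math.sin(2 * math.pi * t) + c for t in grid] for c in (-1.0, -0.5, 0.5, 1.0)]
spiky = [math.sin(2 * math.pi * t) + (3.0 if 0.4 < t < 0.5 else 0.0) for t in grid]
sample.append(spiky)

depths = pointwise_depths(sample)
# One row per level beta, one value per curve.
profile = {beta: [qid(d, beta) for d in depths] for beta in (0.1, 0.5, 1.0)}
```

In this toy sample the spiky curve is the deepest at `beta = 1.0` (value 0.56), because it is central on 90% of the grid, but at `beta = 0.1` its value drops to 0.2, the same as the two most extreme curves. Tracking the value across levels in this way is one reading of how the depth, viewed as a function of the quantile level, can flag a shape outlier that an integrated depth alone would rank as central.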
References
- Fraiman and Muniz [2001] R. Fraiman and G. Muniz. Trimmed means for functional data. Test, 10:419–440, 2001.
- Gijbels and Nagy [2017] I. Gijbels and S. Nagy. On a general definition of depth for functional data. Statist. Sci., 32(2):630–639, 2017.
- Lopez-Pintado and Romo [2009] S. Lopez-Pintado and J. Romo. On the notion of depth for functional data. J. Amer. Statist. Assoc., 104:718–734, 2009.
- Narisetty and Nair [2016] N.N. Narisetty and V.N. Nair. Extremal depth for functional data and applications. J. Amer. Statist. Assoc., 111:1705–1714, 2016.