Robust tests for equality of regression curves
based on characteristic functions
G. Boente
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina [gboente@dm.uba.ar]
J. C. Pardo-Fernández
Centro de Investigación e Tecnoloxía Matemática de Galicia (CITMAga), Universidade de Vigo, Spain [juancp@uvigo.gal]
Keywords: Hypothesis testing – Nonparametric regression models – Robust estimation – Smoothing techniques.
1 Abstract
Let us assume that the random vectors $(X_j, Y_j)^{\top}$, $j = 1, \dots, k$, follow the homoscedastic nonparametric regression models
(1)  $Y_j = m_j(X_j) + \sigma_j\,\varepsilon_j, \quad j = 1, \dots, k,$
where $m_j$ is a nonparametric smooth function and the error $\varepsilon_j$ is independent of the covariate $X_j$. The nonparametric nature of model (1) offers more flexibility than the standard linear model when modelling a complicated relationship between the response variable and the covariate. When second moments exist, as is the case in the classical approach, the usual assumption is that $E(\varepsilon_j) = 0$ and $\mathrm{Var}(\varepsilon_j) = 1$, which means that $m_j$ represents the conditional mean, while $\sigma_j^2$ equals the error variance, i.e., $\sigma_j^2 = \mathrm{Var}(Y_j - m_j(X_j))$. As is usual in a robust framework, we will avoid first moment conditions and we will require that the error distribution has scale $1$. Furthermore, to identify $m_j$ we will impose an identifiability assumption depending on the score function, which holds whenever the errors have a symmetric distribution. Henceforth, we assume that the covariates have the same support $\mathcal{R}$, even though they may have different densities.
In many situations, it is of interest to compare the regression functions $m_j$, $j = 1, \dots, k$, to decide whether the same functional form appears in all populations. In this talk, we focus on testing the null hypothesis of equality of the regression curves, at least in some region of the common support $\mathcal{R}$, versus a general alternative. The null hypothesis to be considered is
(2)  $H_0: m_1(x) = m_2(x) = \dots = m_k(x)$ for all $x \in \mathcal{R}$,
while the alternative hypothesis is $H_1$: $H_0$ is not true.
When second moments exist, the problem of testing equality of two regression curves has been considered by several authors, such as Dette and Munk [1998] and Neumeyer and Dette [2003]. Pardo-Fernández et al. [2007] proposed Kolmogorov–Smirnov and Cramér–von Mises type statistics to test (2). Later on, Pardo-Fernández et al. [2015] introduced a statistic based on the characteristic functions of the residuals, which can detect local alternatives converging to the null hypothesis at rate root-$n$ and whose $p$-values do not rely on bootstrap.
The main reason to provide a robust counterpart to the procedure described in Pardo-Fernández et al. [2015] is that their method is based on linear kernel regression estimators, which locally average the responses and are therefore sensitive to atypical observations. As has been extensively discussed, atypical responses in nonparametric regression may completely distort the estimates, which in turn may affect the test statistic and the conclusions of the testing procedure. Hence, robust estimators are needed to provide more reliable estimates and inferences.
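The sensitivity of local averages can be illustrated with a small sketch. It compares a kernel-weighted mean (the Nadaraya–Watson estimator) with a kernel-weighted median at a single point. The weighted median is only one simple robust local fit, chosen here for brevity; it is not necessarily the robust smoother used in the talk. The Epanechnikov kernel, the bandwidth `h`, the toy regression curve and the contamination pattern are all assumptions made for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def local_mean(x0, x, y, h):
    """Nadaraya-Watson estimate: kernel-weighted average of the responses."""
    w = epanechnikov((x - x0) / h)
    return np.sum(w * y) / np.sum(w)

def local_median(x0, x, y, h):
    """Kernel-weighted median of the responses (a simple robust local fit)."""
    w = epanechnikov((x - x0) / h)
    order = np.argsort(y)
    # Cumulative share of kernel weight, following the sorted responses.
    cum = np.cumsum(w[order]) / np.sum(w)
    # First sorted response at which the cumulative weight reaches one half.
    return y[order][np.searchsorted(cum, 0.5)]

# Toy data (illustrative): smooth curve plus small noise,
# with three gross outliers planted close to x0.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)
y[[98, 100, 102]] = 50.0          # contaminated responses

x0, h = 0.5, 0.1                  # the true curve equals 0 at x0
fit_mean = local_mean(x0, x, y, h)
fit_median = local_median(x0, x, y, h)
print(f"local mean: {fit_mean:.2f}, local median: {fit_median:.2f}")
```

In this toy setting the three contaminated responses carry roughly a tenth of the kernel weight in the window, which is enough to pull the local mean far from the true value, while the weighted median stays close to it.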
In the nonparametric setting, robust testing procedures are scarce. Robust tests based on the distance between non-crossing nonparametric estimates of the quantile curves were defined in Dette et al. [2011, 2013]. When the errors in both populations have the same distribution and the design points have equal densities, Koul and Schick [1997] defined a family of covariate-matched statistics. To extend their proposal to the situation of different error distributions and possibly different error densities, Koul and Schick [2003] developed a modified version of the previous procedure, but this statistic assumes the existence of second moments and may be affected by atypical data arising in the responses. Recently, Boente and Pardo-Fernández [2016] considered the problem of testing equality of two regression functions versus a one-sided alternative. Finally, Feng et al. [2015] considered a test for $H_0: m_1 = m_2$ versus $H_1: m_1 \neq m_2$ using a generalized likelihood ratio test incorporating a Wilcoxon likelihood function and kernel smoothers, which allows alternatives to be detected at a nonparametric rate. It is worth mentioning that, to obtain asymptotic results, Feng et al. [2015] assumed that the errors have second moments.
In this talk, to provide more reliable inferences, we introduce a test statistic that combines characteristic functions and residuals obtained from a robust smoother under the null hypothesis. The asymptotic distribution of the test statistic presented in this talk does not require second moment conditions on the regression errors. Results of a Monte Carlo study performed to compare the finite-sample behaviour of the proposed test with that of the classical one obtained using local averages will be described.
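As a rough illustration of how characteristic functions can be used to compare two sets of residuals, the sketch below computes a weighted $L^2$ distance between two empirical characteristic functions. The abstract does not specify the weight function or the exact form of the statistic, so the standard normal weight, the function name `cf_distance` and the simulated residuals are assumptions. With a standard normal weight, the integral has a closed form because $\int \cos(tu)\,\phi(t)\,dt = e^{-u^2/2}$, where $\phi$ denotes the standard normal density.

```python
import numpy as np

def cf_distance(a, b):
    """Integral of |ecf_a(t) - ecf_b(t)|^2 against the N(0, 1) density,
    where ecf_a and ecf_b are the empirical characteristic functions of
    the samples a and b.

    Uses the identity  int cos(t u) phi(t) dt = exp(-u^2 / 2),
    so no numerical integration is needed.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    def gauss(u, v):
        # Average of exp(-(u_i - v_l)^2 / 2) over all pairs (i, l).
        return np.exp(-0.5 * (u[:, None] - v[None, :]) ** 2).mean()

    return gauss(a, a) + gauss(b, b) - 2.0 * gauss(a, b)

# Hypothetical residuals for one sample (simulated, not real data):
# res_own mimics residuals from the sample's own fit, the other two mimic
# residuals from a fit computed under the null hypothesis.
rng = np.random.default_rng(1)
res_own = rng.standard_t(df=2, size=200)   # heavy tails, infinite variance
res_null_h0 = res_own.copy()               # both fits agree: distance is 0
res_null_h1 = res_own + 1.0                # null fit is off by a constant

d_h0 = cf_distance(res_own, res_null_h0)
d_h1 = cf_distance(res_own, res_null_h1)
print(f"distance when fits agree: {d_h0:.4f}, when they differ: {d_h1:.4f}")
```

Because characteristic functions are bounded, this distance is well defined even for errors without finite moments, which is in line with the aim of avoiding moment conditions. A full test would additionally need a robust smoother, a robust scale estimator and a way of calibrating critical values, none of which are sketched here.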
References
- Boente and Pardo-Fernández [2016] G. Boente and J. C. Pardo-Fernández. Robust testing for superiority between two regression curves. Computational Statistics and Data Analysis, 97:151–168, 2016.
- Dette and Munk [1998] H. Dette and A. Munk. Testing heteroscedasticity in nonparametric regression. Journal of the Royal Statistical Society, Series B, 60:693–708, 1998.
- Dette et al. [2011] H. Dette, J. Wagener, and S. Volgushev. Comparing conditional quantile curves. Scandinavian Journal of Statistics, 38:63–88, 2011.
- Dette et al. [2013] H. Dette, J. Wagener, and S. Volgushev. Nonparametric comparison of quantile curves: A stochastic process approach. Journal of Nonparametric Statistics, 25:243–260, 2013.
- Feng et al. [2015] L. Feng, C. Zou, Z. Wang, and L. Zhu. Robust comparison of regression curves. Test, 24:185–204, 2015.
- Koul and Schick [1997] H. L. Koul and A. Schick. Testing for the equality of two nonparametric regression curves. Journal of Statistical Planning and Inference, 65:293–314, 1997.
- Koul and Schick [2003] H. L. Koul and A. Schick. Testing for superiority among two regression curves. Journal of Statistical Planning and Inference, 117:15–33, 2003.
- Neumeyer and Dette [2003] N. Neumeyer and H. Dette. Nonparametric comparison of regression curves: An empirical process approach. Annals of Statistics, 31:880–920, 2003.
- Pardo-Fernández et al. [2007] J. C. Pardo-Fernández, I. Van Keilegom, and W. González-Manteiga. Testing for the equality of regression curves. Statistica Sinica, 17:1115–1137, 2007.
- Pardo-Fernández et al. [2015] J. C. Pardo-Fernández, M. D. Jiménez-Gamero, and A. El Ghouch. A non-parametric ANOVA-type test for regression curves based on characteristic functions. Scandinavian Journal of Statistics, 42:197–213, 2015.