Robust and efficient estimation in the presence of a randomly right-censored covariate
-
Department of Statistics, University of Seoul, South Korea [szl562@psu.edu]
-
Department of Biostatistics, University of North Carolina at Chapel Hill, USA [brian_richardson@med.unc.edu]
-
Department of Statistics, Pennsylvania State University, USA [yzm63@psu.edu]
-
Department of Neurology, Columbia University Medical Center, USA [ksm1@cumc.columbia.edu]
-
Department of Biostatistics, University of North Carolina at Chapel Hill, USA [tpgarcia@email.unc.edu]
Keywords: censored covariate, doubly robust, efficient, Huntington’s disease, semiparametric
1 Abstract
In Huntington’s disease research, a current goal is to understand how symptoms change prior to a clinical diagnosis. Statistically, this entails modeling symptom severity as a function of the covariate ‘time until diagnosis’, which is often heavily right-censored in observational studies. Existing estimators that handle right-censored covariates vary in statistical efficiency and in robustness to misspecified models for the nuisance distributions (those of the censored covariate and the censoring variable). At one extreme, complete case estimation, which uses uncensored data only, requires no nuisance distribution models but discards informative censored observations. At the other extreme, maximum likelihood estimation is maximally efficient but inconsistent when the covariate’s distribution is misspecified. We propose a new “SPARCC” estimator (for Semiparametric Right-Censored Covariate) that is both robust and efficient. When the nuisance distributions are modeled parametrically, the SPARCC estimator is doubly robust, i.e., consistent if at least one of the two distributions is correctly specified, and semiparametric efficient if both are correctly specified. When the nuisance distributions are estimated consistently via nonparametric or machine learning methods, the estimator is consistent and semiparametric efficient. We show empirically that the proposed estimator, implemented in the R package sparcc, attains these properties, and we apply it to study Huntington’s disease symptom trajectories using data from the Enroll-HD study.
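To make the complete case baseline concrete, the following minimal Python sketch simulates a right-censored covariate and fits a linear outcome model to the uncensored observations only. It is an illustration under assumed settings, not the SPARCC estimator and not the sparcc package interface: the function name `simulate_complete_case`, the exponential distributions for the covariate and the censoring variable, the linear mean model, and the parameter values are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_complete_case(n=20000, beta=(1.0, 2.0), seed=0):
    """Simulate a right-censored covariate and fit the complete case estimator.

    Hypothetical setup: X and C are independent unit-rate exponentials and
    Y = beta0 + beta1 * X + standard normal noise.
    """
    rng = np.random.default_rng(seed)

    # True covariate (e.g., time until diagnosis) and censoring variable.
    x = rng.exponential(scale=1.0, size=n)
    c = rng.exponential(scale=1.0, size=n)

    # Observed data: W = min(X, C) and the indicator that X is uncensored.
    w = np.minimum(x, c)
    delta = x <= c

    # The outcome depends on the true covariate, observed or not.
    y = beta[0] + beta[1] * x + rng.normal(size=n)

    # Complete case estimation: least squares on uncensored rows only,
    # where W equals X. Censored rows are discarded entirely.
    design = np.column_stack([np.ones(delta.sum()), w[delta]])
    beta_hat, *_ = np.linalg.lstsq(design, y[delta], rcond=None)

    return beta_hat, delta.mean()


if __name__ == "__main__":
    beta_hat, frac_uncensored = simulate_complete_case()
    print("complete case estimate:", beta_hat)
    print("fraction uncensored:", frac_uncensored)
```

In this assumed setup the censoring variable is independent of the outcome noise, so the complete case fit needs no model for either nuisance distribution, yet it uses only about half of the simulated sample. That is the loss of information from censored observations described above.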