Population size estimation using incomplete multiple systems data
-
Operations Management Group, Indian Institute of Management, Calcutta [prajamitra.bhuyan@gmail.com]
-
Department of Statistics, Bidhannagar College, Kolkata [kiranmoy07@gmail.com]
Abstract
Motivated by real applications in disease surveillance, we consider the problem of estimating population sizes from incomplete triple record systems across different geographical regions. Size estimation for such populations often relies on the multiplier method, a variant of the capture-recapture approach that assumes independence between lists or sources of information. However, the independence assumption is not valid in most realistic scenarios, and existing multiplier-method-based estimators perform unsatisfactorily when the sources of information are dependent. We propose a novel multivariate Bernoulli model to account for the correlations among the lists and develop a Bayesian estimation procedure based on data augmentation to produce a robust estimate of the population size. To extrapolate to areas without data, the method is extended to a hierarchical setup that combines the models for multiple regions, allowing us to borrow strength across regions. The performance of the proposed model is evaluated through an extensive simulation study, followed by several real data applications.