Efficient Data Augmentation for Fitting Stochastic Epidemic Models to Prevalence Data

Jonathan Fintzi, Jon Wakefield, Vladimir Minin


June 26, 2016


Stochastic epidemic models describe the dynamics of an epidemic as a disease spreads through a population. Typically, only a fraction of cases are observed at a set of discrete times. The absence of complete information about the time evolution of an epidemic gives rise to a complicated latent variable problem in which the state space size of the unobserved epidemic grows large as the population size increases. This makes analytically integrating over the missing data infeasible for populations of even moderate size. We present a data-augmentation Markov chain Monte Carlo (MCMC) framework for Bayesian estimation of stochastic epidemic model parameters, in which measurements are augmented with subject-level trajectories. In our MCMC algorithm, we propose each new subject-level path, conditional on the data, using a time-inhomogeneous continuous-time Markov process with rates determined by the infection histories of other individuals. The method is general, and may be applied, with minimal modifications, to a broad class of stochastic epidemic models. We present our algorithm in the context of a general stochastic epidemic model in which the data are binomially sampled prevalence counts, and apply our method to data from an outbreak of influenza in a British boarding school.
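To make the data-generating setup concrete, the following is a minimal sketch of how binomially sampled prevalence counts could arise from a stochastic epidemic. It assumes a simple SIR model with mass-action infection rate beta * S * I and recovery rate mu * I, simulated with the Gillespie algorithm. The function names, the SIR choice, and all parameter values are illustrative assumptions and are not taken from the paper. The sketch covers forward simulation and observation only, not the data-augmentation MCMC itself.

```python
import random


def simulate_sir(n, i0, beta, mu, t_end, rng):
    """Gillespie simulation of a stochastic SIR epidemic (assumed model).

    Returns the jump path as a list of (time, susceptible, infected) tuples.
    """
    s, i = n - i0, i0
    t = 0.0
    path = [(t, s, i)]
    while i > 0:
        infection_rate = beta * s * i   # assumed mass-action infection rate
        recovery_rate = mu * i          # assumed per-capita recovery rate
        total = infection_rate + recovery_rate
        t += rng.expovariate(total)     # exponential waiting time to next event
        if t > t_end:
            break
        if rng.random() < infection_rate / total:
            s, i = s - 1, i + 1         # infection event
        else:
            i -= 1                      # recovery event
        path.append((t, s, i))
    return path


def prevalence_at(path, t):
    """Number infected at time t; the path is piecewise constant between jumps."""
    infected = path[0][2]
    for time, _, i in path:
        if time > t:
            break
        infected = i
    return infected


def observe(path, obs_times, rho, rng):
    """Binomially sample prevalence: each infected subject is detected w.p. rho."""
    counts = []
    for t in obs_times:
        true_prevalence = prevalence_at(path, t)
        counts.append(sum(rng.random() < rho for _ in range(true_prevalence)))
    return counts


# Hypothetical usage: all values below are made up for illustration.
rng = random.Random(0)
path = simulate_sir(n=200, i0=2, beta=0.01, mu=0.5, t_end=14.0, rng=rng)
obs_times = [float(day) for day in range(1, 15)]
print(observe(path, obs_times, rho=0.9, rng=rng))
```

Only the observed counts would be available in practice; the jump path returned by `simulate_sir` plays the role of the latent epidemic that the inference has to account for.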