Data augmentation is required for the implementation of many Markov chain Monte Carlo (MCMC) algorithms. The inclusion of augmented data can often lead to conditional distributions from well-known probability distributions for some of the parameters in the model. In such cases, collapsing (integrating out parameters) has been shown to improve the performance of MCMC algorithms. We show how integrating out the infection rate parameter in epidemic models leads to efficient MCMC algorithms for two very different epidemic scenarios: final outcome data from a multitype SIR epidemic and longitudinal data from a spatial SI epidemic. The resulting MCMC algorithms give fresh insight into real-life epidemic data sets.

Scand J Statist 44

algorithm can be implemented and the significant efficiency gains that it offers. Finally, we briefly summarize the findings of the paper in Section 5.
Generic collapsing setup

In this section, we outline the generic collapsing approach taken in this paper. This allows us to highlight the key elements in choosing the data augmentation and implementing the collapsing for epidemic models.

Let θ = (λ, φ) and y = (v, w) denote the parameters of the model and the augmented data, respectively. The parameters and augmented data are each divided into two sets, with λ and v denoting the parameters and augmented data that are to be integrated out, and φ and w denoting the remaining parameters and augmented data. Throughout this section, we assume that λ is one dimensional, for ease of exposition and because this is the case in the examples in Sections 3 and 4 and is likely to be the case in practice. However, the following discussion extends straightforwardly to multidimensional λ, and even in the one-dimensional case, the effect on the performance of the MCMC algorithm can be dramatic, as we highlight in Section 3.

The joint posterior distribution of θ and y satisfies