Regression to the mean for the bivariate binomial distribution

Abstract

Regression to the mean (RTM) occurs when subjects with relatively high or low measurements are remeasured and found to be closer to the population mean. This phenomenon can lead to inaccurate conclusions in a pre-post study design. Expressions are available for quantifying RTM when the pre and post observations follow a bivariate normal or bivariate Poisson distribution. However, situations exist where the response variables are the number of successes in a fixed number of trials and follow the bivariate binomial distribution. In this article, expressions for quantifying RTM effects are derived when the underlying distribution is the bivariate binomial. Unlike the normal and Poisson distributions, the correlation between pre and post observations can be either negative or positive under the bivariate binomial distribution, and the severity of RTM is greater when the correlation is negative. The percentage relative difference is used to highlight how the quantified RTM differs between the bivariate binomial distribution and its normal and Poisson approximations. Expressions for estimating RTM by the method of maximum likelihood are derived, along with the asymptotic distribution of the estimator. A simulation study is conducted to empirically assess the statistical properties of the RTM estimator and its asymptotic distribution. Data examples using the number of obese individuals and the number of nonconforming cardboard cans are discussed. © 2019 John Wiley & Sons, Ltd.
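
As an illustration of the phenomenon summarised above, the following is a minimal simulation sketch, not the article's derivation. It assumes one common construction of the bivariate binomial: each of n trials yields a pair of binary outcomes with joint cell probabilities p11, p10, p01 and p00, and the pre and post counts are the sums of the first and second components. RTM is approximated here as the mean of (pre minus post) among subjects whose pre count exceeds a cutoff, with equal pre and post marginals so that no true change is present. The function name, parameters and numerical values are all hypothetical.

```python
import numpy as np


def simulate_rtm(n_trials, p11, p10, p01, cutoff, n_subjects=200_000, seed=0):
    """Monte Carlo estimate of E[pre - post | pre > cutoff].

    Assumed construction (not taken from the article): each trial is a
    pair of Bernoulli outcomes with joint probabilities
    p11 (both succeed), p10 (only pre), p01 (only post), p00 (neither).
    """
    p00 = 1.0 - p11 - p10 - p01
    if min(p11, p10, p01, p00) < 0:
        raise ValueError("cell probabilities must be non-negative and sum to 1")

    rng = np.random.default_rng(seed)
    # Each row holds the counts of the four outcome types over n_trials.
    counts = rng.multinomial(n_trials, [p11, p10, p01, p00], size=n_subjects)
    pre = counts[:, 0] + counts[:, 1]   # successes on the first measurement
    post = counts[:, 0] + counts[:, 2]  # successes on the second measurement

    selected = pre > cutoff             # subjects chosen for a high pre value
    if not selected.any():
        raise ValueError("no simulated subject exceeds the cutoff")
    return float(np.mean(pre[selected] - post[selected]))


# Both settings have marginal success probability 0.5 at pre and post,
# so any average drop among the selected subjects is due to RTM alone.
rtm_positive = simulate_rtm(20, p11=0.4, p10=0.1, p01=0.1, cutoff=12)
rtm_negative = simulate_rtm(20, p11=0.1, p10=0.4, p01=0.4, cutoff=12)
print(f"RTM, positive correlation: {rtm_positive:.2f}")
print(f"RTM, negative correlation: {rtm_negative:.2f}")
```

Under these assumed settings the second call, where pre and post are negatively correlated, should report the larger drop, in line with the abstract's remark that RTM is more severe under negative correlation.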

Publication
Statistics in Medicine