Missing Data Mechanisms

A missing data mechanism is a probabilistic rule that governs which data will be observed and which will be missing. Little and Rubin [7] and Rubin [4] distinguish three types of missing data mechanisms. Missing data are missing completely at random (MCAR) if missingness is independent of both the observed and the missing values of all variables, as if darts had been thrown at random at the data matrix. MCAR is the only missing data mechanism for which "complete-case" analysis (i.e., restricting the analysis to only those subjects with no missing data) is generally acceptable. Missing data are missing at random (MAR) if missingness depends only on observed values of variables and not on any missing values. For example, suppose the value of blood pressure at the end of a trial is more likely to be missing when some previously observed values of blood pressure are high. If, given those earlier values, the missingness is independent of the value of blood pressure at the end of the trial, then the missingness mechanism is MAR.
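The difference between MCAR and MAR can be illustrated with a small simulation of the blood pressure example. This is only a sketch under stated assumptions: the function name `simulate`, the normal distributions, the threshold of 140, and the missingness probabilities are all hypothetical choices made for illustration and do not come from the text.

```python
import random
import statistics


def simulate(mechanism, n=20000, seed=0):
    """Return (mean of all final values, complete-case mean of final values).

    All numeric settings below are illustrative assumptions.
    """
    rng = random.Random(seed)
    final_all = []       # final blood pressure for every subject
    final_observed = []  # final blood pressure for subjects where it is not missing
    for _ in range(n):
        baseline = rng.gauss(130, 15)        # earlier measurement, always observed
        final = baseline + rng.gauss(0, 10)  # end-of-trial value, correlated with baseline
        if mechanism == "MCAR":
            # Missingness ignores every value, observed or not.
            p_missing = 0.3
        elif mechanism == "MAR":
            # Missingness depends only on the observed baseline value.
            p_missing = 0.6 if baseline > 140 else 0.1
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        final_all.append(final)
        if rng.random() >= p_missing:
            final_observed.append(final)
    return statistics.fmean(final_all), statistics.fmean(final_observed)


for mechanism in ("MCAR", "MAR"):
    full_mean, complete_case_mean = simulate(mechanism)
    print(mechanism, round(full_mean, 1), round(complete_case_mean, 1))
```

In this sketch the complete-case mean should land close to the full-data mean under MCAR, while under MAR it should fall noticeably below it, because subjects with high baseline values (and therefore, on average, high final values) are lost more often.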

If missingness depends on the values that are missing, even after conditioning on all observed quantities, the missing data are not missing at random (NMAR). Missingness must then be modeled jointly with the data; that is, the missingness mechanism is "nonignorable." Nonignorable missing data present challenging problems because the observed data contain no direct evidence about how to model the missing values.
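The three mechanisms can also be stated compactly in symbols. The notation here is an assumption introduced for illustration and is not used in the text: $Y = (Y_{\mathrm{obs}}, Y_{\mathrm{mis}})$ denotes the observed and missing parts of the data, $R$ is the matrix of missingness indicators, and $\phi$ collects the parameters of the missingness mechanism.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = P(R \mid \phi), \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = P(R \mid Y_{\mathrm{obs}}, \phi), \\
\text{NMAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) \text{ depends on } Y_{\mathrm{mis}} \text{ even given } Y_{\mathrm{obs}}.
\end{align*}
```

Read this way, MCAR is a special case of MAR, and NMAR is whatever remains when the MAR condition fails.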

The specific imputation procedures described here are most appropriate when the missing data are MAR and ignorable (see Little and Rubin [7] and Rubin [4] for details). Multiple imputation can still be validly used with nonignorable missing data, although it is more challenging to use well. Even so, it remains more straightforward to use than other valid methods of handling the nonignorable situation.
