Missing data are a concern in any research project, and despite investigators’ best efforts they are often unavoidable. Missing outcomes have two effects: they reduce precision and power, and they may introduce bias. The statistician can do little about the loss of precision, beyond making the best use of the data that are available; for example, by making sure not to exclude from the analysis individuals who dropped out before the end of the study but who nevertheless reported intermediate values of the outcome. However, the statistician can aim to reduce bias through a suitable choice of analysis.

In randomized controlled trials (RCTs), outcome data are typically missing for some participants. Patient-reported outcomes such as health-related quality of life (QoL) are particularly prone to missing data, because patients may fail to complete follow-up questionnaires. All statistical analyses of data with missing values rest on assumptions. Some assumptions explicitly specify the values of the missing data, e.g. treating missing values as failures in smoking cessation trials; others, such as ‘last observation carried forward’, make implicit assumptions about the similarity of distributions. It is usually preferable to make assumptions about the missing data mechanism, defined as the probability of missing data given the observed and unobserved data.

For the primary trial analysis, studies are recommended to take an approach that is valid under plausible assumptions about the missing data. Rather than assuming that the data are ‘missing completely at random’ (MCAR), the primary analysis should usually assume they are ‘missing at random’ (MAR), that is, that the probability of missing data does not depend on the patient’s outcome after conditioning on the observed variables (e.g. the patient’s baseline characteristics). Even the MAR assumption may be implausible in many settings; for example, patients in relatively poor health may be less likely to complete the requisite questionnaires, and so these outcome data may be ‘missing not at random’ (MNAR). Since the true missing data mechanism for the data at hand is unknown, it is important to examine whether the study results are robust to alternative assumptions about the missing data.
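For reference, these assumptions can be stated formally in terms of the missing data mechanism. The sketch below uses standard textbook notation, introduced here purely for illustration: R is a missingness indicator, Y_obs and Y_mis are the observed and missing parts of the outcome data, and X the observed baseline covariates.

```latex
% R = 1 if the outcome is observed, 0 if missing;
% Y_obs, Y_mis = observed and missing parts of the outcome data;
% X = observed baseline covariates.
\begin{align*}
\text{MCAR:} \quad & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) = \Pr(R) \\
\text{MAR:}  \quad & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) = \Pr(R \mid Y_{\mathrm{obs}}, X) \\
\text{MNAR:} \quad & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) \ \text{still depends on } Y_{\mathrm{mis}}
\end{align*}
```

Under MAR, missingness may depend on baseline characteristics and previously observed outcomes, but not additionally on the unobserved outcome values themselves.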

The US National Research Council (NRC) report on missing data in clinical trials recommends sensitivity analyses that allow for the data being MNAR, in line with general methodological guidance on missing data and with previous specific advice for intention-to-treat analyses in RCTs. However, systematic reviews report that in practice many RCTs do not handle missing data appropriately. A simple approach to sensitivity analysis is to include in the statistical model parameters representing the outcome differences between individuals with complete versus missing data, and to explore how inferences vary as these ‘sensitivity parameters’ take specific values. The results and conclusions can then be compared over a reasonable range of values, possibly identifying a ‘tipping point’ at which the conclusions change. However, this approach has drawbacks, not least that it gives no indication of which values of the sensitivity parameters are plausible.
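To make the idea concrete, here is a minimal sketch of such a delta-based sensitivity analysis in Python. The simulated QoL data, the single-imputation shift of each arm’s missing outcomes by a sensitivity parameter delta, and the simple t-test are all illustrative assumptions for the sketch, not the analysis of any particular trial.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative data: QoL scores with roughly 20% dropout (np.nan = missing).
y_treat = np.where(rng.random(100) < 0.2, np.nan, rng.normal(0.70, 0.15, 100))
y_ctrl = np.where(rng.random(100) < 0.2, np.nan, rng.normal(0.65, 0.15, 100))

def delta_adjusted_effect(y_t, y_c, delta_t, delta_c):
    """Impute each arm's missing outcomes as that arm's observed mean plus a
    sensitivity parameter delta (a simple pattern-mixture-style shift), then
    return the estimated mean difference and a t-test p-value."""
    yt = np.where(np.isnan(y_t), np.nanmean(y_t) + delta_t, y_t)
    yc = np.where(np.isnan(y_c), np.nanmean(y_c) + delta_c, y_c)
    return yt.mean() - yc.mean(), stats.ttest_ind(yt, yc).pvalue

# Explore how the inference varies as the treatment arm's sensitivity
# parameter takes increasingly unfavorable values (control arm held at 0).
for delta in np.arange(0.0, -0.26, -0.05):
    diff, p = delta_adjusted_effect(y_treat, y_ctrl, delta, 0.0)
    print(f"delta_treat={delta:+.3f}  effect={diff:+.3f}  p={p:.3f}")
```

In practice the shift would be applied within a proper multiple-imputation or likelihood-based analysis so that the uncertainty from imputing missing values is propagated; the quantity of interest is the pattern of results over the grid of delta values and any tipping point within it.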

An alternative is to allow experts to quantify their own views about the missing data, rather than relying on the views of others. Not only is this likely to be more intuitive and attractive for them, but, combined with a fully Bayesian approach, it properly captures and reflects expert opinion (and the associated uncertainty) about the missing data in the posterior estimate of the treatment effect and its credible interval. This is particularly useful for those needing a quantitative summary of the trial, such as systematic reviewers, decision makers and health providers, because it shows how the experts involved in the study would interpret its results given the missing data. When reviewing the study, experts implicitly ‘fill in’ the gaps created by the missing data in order to arrive at their conclusions. The proposed elicitation approach, coupled with a Bayesian analysis, allows the study to coherently quantify the impact of incorporating expert knowledge about the missing data, through to the estimates of treatment effectiveness.

Sensitivity analyses using a Bayesian approach require practical tools to facilitate expert elicitation, and recent research has focused on elicitation approaches within group meetings. Group-level elicitation has advantages for training and clarification, and facilitates behavioral aggregation to achieve consensus. However, because of the ‘feedback’ loop, these approaches are costly in both time and money, and in many RCTs it may be infeasible to elicit opinion from a sufficient number and range of experts. Improving the uptake of the recommended approaches to sensitivity analysis for missing data in RCTs requires more accessible, practical tools for eliciting and synthesizing expert opinion to be developed and exemplified.

One recently suggested option is open source software, administered face-to-face or online, that elicits beliefs from reasonably large numbers of experts without imposing an undue burden. With such a tool, the elicited views can be converted into informative priors for the sensitivity parameters of a pattern-mixture model, allowing for correlation in the elicited values across the trial arms. The trial data can then be re-analyzed under different MNAR assumptions to explore the robustness of the results. The approach could also be used at the design stage, drawing on either previously collected priors or new priors elicited from the trial team. Combining these with the expected level of loss to follow-up could provide an improved estimate of the likely impact of missing data on the trial’s results. Hence, this approach could help improve trial design, so that the study results are more robust to anticipated levels of missing data.
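The sketch below illustrates how an elicited prior might be propagated through such a re-analysis. The bivariate normal prior on the two arms’ sensitivity parameters, the simulated outcome data and the crude mean-difference ‘analysis’ are all assumptions made up for illustration; a real application would embed the elicited prior in a full Bayesian pattern-mixture model (e.g. fitted by MCMC or via multiple imputation).

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative QoL data with roughly 20% dropout in each arm (np.nan = missing).
y_treat = np.where(rng.random(100) < 0.2, np.nan, rng.normal(0.70, 0.15, 100))
y_ctrl = np.where(rng.random(100) < 0.2, np.nan, rng.normal(0.65, 0.15, 100))

# Stand-in for an elicited prior on the sensitivity parameters: the mean
# outcome difference (missing minus observed) in each arm, with correlation
# between arms. These hyperparameters are purely illustrative.
prior_mean = np.array([-0.10, -0.05])            # [treatment, control]
prior_sd = np.array([0.05, 0.05])
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
prior_cov = np.diag(prior_sd) @ corr @ np.diag(prior_sd)

def effect_given_deltas(delta_t, delta_c):
    """One draw of the treatment effect: fill each arm's missing outcomes
    with the arm's observed mean plus its delta plus residual noise, then
    take the difference in arm means (a crude stand-in for a full model)."""
    def fill(y, d):
        out = y.copy()
        miss = np.isnan(y)
        out[miss] = np.nanmean(y) + d + rng.normal(0, np.nanstd(y), miss.sum())
        return out
    return fill(y_treat, delta_t).mean() - fill(y_ctrl, delta_c).mean()

# Propagate the elicited uncertainty about the deltas through to the effect.
deltas = rng.multivariate_normal(prior_mean, prior_cov, size=5000)
effects = np.array([effect_given_deltas(dt, dc) for dt, dc in deltas])
lo, hi = np.quantile(effects, [0.025, 0.975])
print(f"effect under elicited prior: mean={effects.mean():.3f}, "
      f"95% interval=({lo:.3f}, {hi:.3f})")
```

Allowing a positive correlation between the two arms’ sensitivity parameters reflects the view that the reasons for dropout are likely to be similar in both arms, which typically tempers their net impact on the estimated treatment difference.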
