Even if all the outcomes, participant-level covariates and, if relevant, interactions required for the analyses have been reported, they can be difficult to include or combine meaningfully in an aggregate data meta-analysis if they are not defined consistently across trials. At best, this could lead to findings that are difficult to interpret and, at worst, to findings that are unreliable. If this is a cause for concern, IPD might be sought to allow standardisation of the variables in readiness for analysis (Section 4.5).
2.6.3 Are IPD Needed to Improve the Information Size?
A major motivation for meta-analysis is to increase statistical power beyond that of a single trial. However, a meta-analysis may still not be sufficient to answer a particular research question reliably, because this depends on the potential absolute information size available from all existing trials. This potential absolute information size, and hence the statistical power (Chapter 12),69 should be determined in advance, and depends on the nature of the research question. For example, when using meta-analysis to examine the overall effect of a treatment on a binary or time-to-event outcome, the absolute information size depends on the number of trials potentially available for meta-analysis, as well as the number of participants and events in these trials. Between-trial heterogeneity is also important, though it is difficult to gauge in advance. When examining participant-level treatment-covariate interactions, the variability of covariate values in each trial also contributes towards the potential absolute information size (Chapter 12).70
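To make this concrete, the sketch below shows one simple way the potential absolute information size across trials might be translated into approximate power for a fixed-effect meta-analysis of a time-to-event outcome. It is not the formal approach of Chapter 12: it assumes 1:1 randomisation, uses the common var(log HR) ≈ 4/events approximation, and the event counts, target hazard ratio and heterogeneity value are purely illustrative assumptions.

```python
# Minimal sketch: approximate power of an inverse-variance meta-analysis of a
# time-to-event outcome, assuming var(log HR) ~= 4 / events per trial (1:1 allocation).
import numpy as np
from scipy import stats

def meta_analysis_power(events_per_trial, target_hr, alpha=0.05, tau2=0.0):
    """Approximate power to detect a pooled hazard ratio `target_hr`.

    events_per_trial : expected number of events in each eligible trial
    tau2             : assumed between-trial heterogeneity of log HRs
                       (tau2 > 0 gives a crude random-effects version)
    """
    var_i = np.array([4.0 / d for d in events_per_trial], dtype=float)  # within-trial variances
    weights = 1.0 / (var_i + tau2)                                      # inverse-variance weights
    pooled_se = np.sqrt(1.0 / weights.sum())
    z_crit = stats.norm.ppf(1.0 - alpha / 2.0)
    z = abs(np.log(target_hr)) / pooled_se
    return float(stats.norm.cdf(z - z_crit))

# Hypothetical example: five eligible trials with these expected event counts,
# aiming to detect a pooled hazard ratio of 0.85 at the 5% significance level.
print(meta_analysis_power([60, 90, 120, 45, 200], target_hr=0.85))
```

As the paragraph above notes, greater between-trial heterogeneity (a larger assumed tau2 here) inflates the pooled standard error and so reduces power, even when the total number of events is unchanged.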
If the potential absolute information size based on all trials and their participants is small, and so statistical power is low, any meta‐analysis will struggle to detect realistic and clinically meaningful effects of the treatment under investigation, regardless of whether aggregate data or IPD are ultimately used. In this situation, conducting a meta‐analysis based on IPD, in particular, may not be the best use of time, effort and resources,47 unless it is specifically needed to inform the rationale and design (e.g. sample size) of a new trial that is geared to increasing the absolute information size for a subsequent meta‐analysis (Chapter 12).71,72
The absolute information size of a meta‐analysis may differ depending on whether aggregate data or IPD can be obtained, as suitable aggregate data may not be available for all trials, and similarly IPD may not be obtainable for all trials. It has been shown that when the absolute information size represented by an aggregate data meta‐analysis is small, the overall results are less likely to agree with those of a corresponding IPD meta‐analysis project.47
Even when the absolute information size of the available aggregate data is large, and power is considered adequate, if these data represent only a small proportion of all eligible participants (for example, because aggregate data for particular trials, participants or outcomes are not reported, or the follow-up is limited), then the relative information size will be small. Any meta-analysis of such aggregate data would not only suffer from reduced precision, but could also be biased or otherwise unrepresentative.47 In this situation, if the collection of IPD were to bring about a substantial increase in the proportion of eligible trials, participants or events, thereby increasing the relative information size available for meta-analysis, then the approach could add considerable value, and may also give very different results to an equivalent aggregate data meta-analysis. For example, in the earlier example about the effects of ovarian ablation on survival in early breast cancer (Section 2.5), the collection of IPD brought about a substantial increase in the duration of follow-up, and the consequent number of events, compared to the aggregate data, increasing both reliability and precision.
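The distinction between absolute and relative information size can be made concrete with a purely illustrative calculation (the numbers below are hypothetical, not taken from any trial or from this book):

```python
# Hypothetical numbers, purely to illustrate absolute versus relative information size.
events_in_available_aggregate_data = 800   # events contributing to the aggregate data meta-analysis
events_in_all_eligible_trials = 2000       # events that exist across all eligible trials (e.g. with full follow-up)

relative_information_size = events_in_available_aggregate_data / events_in_all_eligible_trials
print(f"Relative information size: {relative_information_size:.0%}")   # 40%

# 800 events may give adequate power (a large absolute information size), but if the
# missing 60% of events are missing non-randomly (e.g. selective reporting, curtailed
# follow-up), the aggregate data result may be biased as well as imprecise.
```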
In contrast, if the absolute information size of the aggregate data is large (and so power is considered sufficient), and the relative information size is also large (i.e. it represents a high proportion of the total eligible participants or events available from all existing trials), a meta-analysis of these aggregate data would be expected to provide a reliable estimate. However, the aggregate data may be sufficient in some respects but not others. For example, whilst the absolute and relative information size of aggregate data may be sufficient when focussing on overall (unadjusted) treatment effects,47 they may be low when considering other measures (estimands) of interest, such as conditional treatment effects (i.e. adjusted for prognostic factors; see Chapters 5 and 6), subgroup results and treatment-covariate interactions (see Chapter 7), and time-dependent treatment effects (e.g. non-constant hazard ratios) and effects at multiple time-points (see Chapter 13). IPD may substantially increase the information size for estimating such nuanced measures, and so researchers need to decide whether they are a priority. For example, if an aggregate data meta-analysis has large absolute and relative information sizes and shows an overall benefit of treatment, this might provide a strong motivation for collecting IPD to examine subgroup effects or time-dependent effects. In contrast, if there is no evidence of an overall effect based on such aggregate data, there might be less justification for going to the trouble of collecting IPD, unless other reasons warrant it (Table 2.2).
2.6.4 Are IPD Needed to Improve the Quality of Analysis?
Even if the necessary aggregate data are available for a trial, they may have been obtained using an inappropriate analytic method. For example, a treatment effect measured by an odds ratio derived from a logistic regression analysis may have been reported for a trial, when a hazard ratio based on a Cox regression would have been more appropriate, due to the time-to-event nature of the data; or a cluster randomised trial may have been analysed without accounting for the clustering. Another issue is that trial analyses may not provide the estimate of interest. For example, when conditional treatment effects are of interest, a trial may have derived treatment effect estimates without adjustment for key prognostic factors (e.g. if continuous outcome values at the end of follow-up were analysed without adjustment for the continuous outcome values at baseline). Analyses may also not be sufficiently comprehensive; for example, a Cox regression model may have been fitted without examining the proportional hazards assumption for the treatment variable, even when non-constant hazard ratios are a concern (e.g. from overlapping Kaplan-Meier curves), or missing data may not have been handled appropriately (Chapter 18).
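Once IPD are available for a trial, such analyses can be redone appropriately. The sketch below is a minimal illustration, not the book's prescribed methods, of two of the re-analyses mentioned above: adjusting a final continuous outcome for its baseline value, and checking the proportional hazards assumption for the treatment effect. It assumes the statsmodels and lifelines packages and hypothetical file and variable names (trial_01_ipd.csv, sbp_final, sbp_baseline, time, event, treat).

```python
# Minimal sketch of two re-analyses made possible by having a trial's IPD.
import pandas as pd
import statsmodels.formula.api as smf
from lifelines import CoxPHFitter

ipd = pd.read_csv("trial_01_ipd.csv")   # hypothetical IPD file for one trial

# 1) Conditional treatment effect: adjust the final continuous outcome for its
#    baseline value (ANCOVA), rather than analysing the final value alone.
ancova = smf.ols("sbp_final ~ treat + sbp_baseline", data=ipd).fit()
print(ancova.params["treat"], ancova.conf_int().loc["treat"])

# 2) Time-to-event re-analysis: fit a Cox model and examine the proportional
#    hazards assumption for the treatment variable.
surv = ipd[["time", "event", "treat"]]
cph = CoxPHFitter()
cph.fit(surv, duration_col="time", event_col="event")
cph.check_assumptions(surv, p_value_threshold=0.05)
```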
Analyses may also be inconsistent across trials. For example, even where estimates of treatment-covariate interactions are reported for each trial, the trials may differ in their definitions of categorical covariates, their handling of continuous covariates (e.g. age might be dichotomised in some trials but not in others), and the relationships assumed (e.g. linear or non-linear trends).
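With IPD, such inconsistencies can be removed by analysing every trial in the same way. The two-stage sketch below is only an illustration of that idea, not the specific methods of Chapter 7: it assumes a hypothetical stacked IPD file (ipd_all_trials.csv with columns trial, y, treat and age), keeps age continuous, estimates the treatment-age interaction with the same model in each trial, and pools the estimates by inverse-variance weighting.

```python
# Two-stage sketch: consistent within-trial estimation of a treatment-age
# interaction (age kept continuous), followed by fixed-effect pooling.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

ipd = pd.read_csv("ipd_all_trials.csv")   # hypothetical stacked IPD: trial, y, treat, age

estimates, variances = [], []
for trial_id, d in ipd.groupby("trial"):
    d = d.assign(age_c=d["age"] - d["age"].mean())            # centre age within each trial
    fit = smf.logit("y ~ treat * age_c", data=d).fit(disp=0)  # same model in every trial
    estimates.append(fit.params["treat:age_c"])               # interaction on the log odds ratio scale
    variances.append(fit.bse["treat:age_c"] ** 2)

# Second stage: inverse-variance (fixed-effect) pooling of the interaction estimates.
w = 1.0 / np.array(variances)
pooled = np.sum(w * np.array(estimates)) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))
print(f"Pooled interaction: {pooled:.3f} (SE {pooled_se:.3f})")
```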
All of these issues give rise to concern that a meta‐analysis based on aggregate data would not be robust or adequate for answering the research question of interest, and that an IPD meta‐analysis would be more comprehensive and flexible. Again, this can be determined only by first appraising the individual trial analyses, and evaluating which methods trial investigators used and the extent of aggregate data available.
2.7 Concluding Remarks
While there are many similarities between IPD and aggregate data meta‐analysis projects, they differ substantially in collaborative, data management and analytical aspects. If done well,