Individual Participant Data Meta-Analysis. Group of authors
Figure 4.4 Excerpts from the website https://www.york.ac.uk/crd/research/epppic/ for an IPD meta‐analysis of the effects of progestogens for the prevention of preterm birth.
Source: Evaluating Progestogens for Prevention of Preterm birth International Collaborative, © University of York.
Although a lesser degree of de‐identification, or pseudonymisation, carries a risk that participants might be identified,99 this is mitigated by the data provider and the IPD project’s central research team entering into a data‐sharing agreement. The agreement should specifically prohibit any attempt to identify individuals, and stipulate that the team receiving the data should have data privacy training, will hold the data securely and will use it appropriately (Section 3.11). Data recipients still need to sign data‐sharing agreements to obtain IPD from more public sources, such as data‐sharing platforms or repositories (Section 3.2.2),100 but these data will usually have been subject to a greater degree of de‐identification, because the risks to participant confidentiality are greater with such quasi‐public data.99 Full de‐identification or anonymisation, involving the removal of all links between the de‐identified IPD and the original datasets, limits the utility of the IPD for meta‐analysis and is therefore not recommended.
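As an illustration of pseudonymisation that keeps a link to the original records, the following is a minimal Python sketch and not a prescribed method. The function name `pseudonymise`, the field name `participant_id` and the `P` prefix are hypothetical choices made for this example. It replaces a direct identifier with a random code and returns a key that the data provider would retain, and not share, in case queries arise.

```python
import secrets


def pseudonymise(rows, id_field="participant_id", prefix="P"):
    """Replace a direct identifier with a random pseudonym.

    Returns (pseudonymised_rows, key). The key maps each pseudonym back to
    the original identifier and is meant to stay with the data provider.
    All names here are hypothetical, chosen only for illustration.
    """
    pseudonym_for = {}  # original identifier -> pseudonym
    key = {}            # pseudonym -> original identifier
    pseudonymised = []
    for row in rows:
        original = row[id_field]
        if original not in pseudonym_for:
            pseudonym = prefix + secrets.token_hex(4)
            # Redraw in the unlikely event that the code is already in use.
            while pseudonym in key:
                pseudonym = prefix + secrets.token_hex(4)
            pseudonym_for[original] = pseudonym
            key[pseudonym] = original
        new_row = dict(row)  # copy, so the input rows are left unchanged
        new_row[id_field] = pseudonym_for[original]
        pseudonymised.append(new_row)
    return pseudonymised, key
```

Because the codes are drawn at random and are not derived from the identifier, they cannot be reversed without the key. The same participant always receives the same code within one call, so repeated records remain linked.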
Table 4.3 Example of items to include in a data transfer guide when requesting IPD
Source: Jayne Tierney and Larysa Rydzewska.
Item | Example |
---|---|
Preferred file format | Provide electronic data file(s) in Microsoft Excel (.xls or .xlsx), Stata (.dta), SAS (.sas7bdat) or a delimited plain‐text (.csv) format if possible. If you need to use another format, please let us know. |
Filename instructions | Include a clear trial identifier in the name of any data files provided. |
Specific trial population | Include all participants recruited to your trial, including those later classed as being ineligible, withdrawn, not evaluable, with protocol deviations, or lost to follow‐up. |
De‐identification instructions | Do not include any codes or labels that could potentially directly identify participants, such as names, locations, address details, hospital numbers or dates of birth. If only direct identifiers have been used for a particular trial variable (e.g. the participant ID is the participant name), then please replace with a pseudonymised version for each participant and retain the key in case queries arise. |
Data items and preferred coding | Extract the variables from your trial database that correspond most closely to those requested in the data dictionary. If this is not possible for some or all data items, please use your own codes, but define them clearly in your dataset documentation. |
Secure data transfer instructions | Transfer the trial data file to us using our secure file transfer service. If the trial data file is transferred by email, ensure that it is secure and end‐to‐end encrypted. |
Research team contact details | If you have any questions regarding the preparation and transfer of the IPD, please contact Jayne Bloggs ([email protected]). |
4.4.2 Providing Data Transfer Guidance
It is useful to provide trial investigators, or their data contact, with a data transfer guide outlining how their IPD should be prepared and transferred securely (Table 4.3). In addition, the guide can be used to make data providers aware of how the central research team will subsequently manage, check and verify their data, and to encourage them to get in touch should any queries arise. It may also highlight whether there are any specific funds available to facilitate the preparation of trial data, and how to access these. If helpful, the guide can be accompanied by a template data file in the trial investigators' preferred software package (such as Stata, R, SAS, or Excel), including all the variable names, but no actual data.
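A template data file of this kind can be produced in a few lines. The sketch below, in Python, writes a delimited plain‐text file that contains only a header row. The variable names and descriptions in `DATA_DICTIONARY` are hypothetical placeholders; in practice they would be taken from the project's own data dictionary.

```python
import csv
import io

# Hypothetical data dictionary: (variable name, description).
# These entries are illustrative assumptions, not a recommended variable set.
DATA_DICTIONARY = [
    ("trial_id", "Trial identifier"),
    ("participant_id", "Pseudonymised participant identifier"),
    ("treatment_arm", "0 = control, 1 = intervention"),
    ("age_years", "Age at randomisation, in years"),
]


def write_template(handle, dictionary=DATA_DICTIONARY):
    """Write a header-only CSV template: variable names, no data rows."""
    writer = csv.writer(handle)
    writer.writerow([name for name, _description in dictionary])


# Example: build the template in memory; a real file handle opened with
# open(path, "w", newline="") could be passed in the same way.
buffer = io.StringIO()
write_template(buffer)
template_text = buffer.getvalue()
```

The descriptions are kept alongside the names so that the same list could also be used to generate the accompanying documentation.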
4.4.3 Transferring Trial IPD Securely
Based on the relevant data transfer guidance supplied by the central research team, secure data transfer can be achieved using a suitable file transfer site, for example, an institutional or suitable commercial file share service. Where a data file needs to be sent via email, ideally it should be end‐to‐end encrypted or otherwise password protected, with passwords sent separately (and ideally via a different communication medium, such as by post or by telephone). Although rare these days, if data need to be transferred using physical storage media, such as external disc drives or memory sticks, these should also be encrypted and sent using a delivery service that allows the package to be tracked and signed for upon delivery.
When data files are first received, it is important to ensure that they can be read and loaded into the preferred storage or analysis system before proceeding. For example, if a data file arrives electronically, it should be checked to ensure that any passwords provided work, that the file can be opened and is for the correct trial, and that the data have not been corrupted during transfer. It is important to thank relevant trial personnel for providing their data, and either confirm that they are as expected, or relay any issues that have arisen, as soon as possible.
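One way to check that a file has not been corrupted during transfer is to compare a checksum computed on receipt with one computed by the sender. The Python sketch below assumes that the data provider has sent a SHA‐256 checksum separately. This is an assumption made for illustration and is not a requirement stated in the guidance above; the function names are also hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks avoids loading a large data file into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with the checksum supplied by the sender."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A matching checksum shows only that the bytes received are the bytes sent. It does not replace opening the file and confirming that it is for the correct trial.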
4.4.4 Storing Trial IPD Securely
Once transferred, IPD need to be stored securely by the central research team, with access limited to those members responsible for data management, checking and analysis, or overall conduct. These team members should have appropriate expertise in the handling of participant data, as well as training in data protection. They should also be prohibited from copying data to any mobile devices, laptops, memory sticks or cloud servers that have not been set up for the secure storage of confidential data.
It is recommended that the central research team retain the original version of each trial data file exactly as supplied. This can be used for cross checking, and provides a back‐up should any errors arise, for example, if the data file that the meta‐analysis team are working on becomes corrupted. A copy of the file should be made before any changes, such as re‐formatting, re‐coding or re‐defining variables (Section 4.5), are carried out. The copy should first be converted to the preferred software format (such as Stata, R or SAS) for the IPD meta‐analysis project. This is now fairly straightforward, because statistical packages can recognise or import data from multiple other sources; where they cannot, a specialist transfer package such as StatTransfer can be used. It is useful to add a numeric trial identifier and label to the trial data file, so that when IPD are subsequently checked, harmonised, merged and analysed, each trial can be easily identified in the outputs, analyses and forest plots.
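The workflow described above (keep the supplied file untouched, and work on a copy that carries a numeric trial identifier and label) can be sketched as follows. This Python example assumes that the supplied file is a CSV with a header row. The function name and the column names `trial_id` and `trial_label` are hypothetical.

```python
import csv


def make_working_copy(original_path, working_path, trial_id, trial_label):
    """Write a working copy of a supplied CSV with trial columns added.

    The original file is opened for reading only and is never modified.
    Column names 'trial_id' and 'trial_label' are illustrative assumptions.
    """
    with open(original_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        supplied = list(reader.fieldnames or [])
        clash = {"trial_id", "trial_label"} & set(supplied)
        if clash:
            raise ValueError(f"Supplied file already has columns: {sorted(clash)}")
        with open(working_path, "w", newline="", encoding="utf-8") as dst:
            writer = csv.DictWriter(dst, fieldnames=["trial_id", "trial_label"] + supplied)
            writer.writeheader()
            for row in reader:
                # Prepend the identifier and label to every participant row.
                writer.writerow({"trial_id": trial_id, "trial_label": trial_label, **row})
```

Raising an error when the supplied file already contains a column with one of these names avoids silently overwriting a variable defined by the trial investigators.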
4.4.5 Making Best Use of IPD from Repositories