Real World Health Care Data Analysis. Uwe Siebert
use continued/started at this visit
For the REFLECTIONS simulated data set, simulation was performed separately for each treatment cohort. First, the original data set was transformed from a vertical format (one observation per patient per time point) into a horizontal format (one record per patient). Next, a cohort-specific data set was created by random sampling (with replacement) from each original variable. The sample sizes were 240, 140, and 620 for the opioid, non-narcotic opioid, and other treatment cohorts, respectively. The Iman-Conover method was then implemented in the SAS/IML programming language, following the code of Wicklin (2013) as shown in Program 3.1, using the sampled data (A) and the desired between-variable rank correlations (C).
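The resampling step described above can be sketched in SAS/IML as follows. This is only an illustration under stated assumptions, not the authors' program: the toy matrix `orig`, the seed, and the sample size `nSim` are hypothetical, and the sketch assumes each variable is resampled independently of the others, which is one reading of "from each original variable".

```sas
proc iml;
call randseed(2020);               /* arbitrary seed (assumption) */

/* Hypothetical stand-in for one cohort in horizontal format:
   5 patients (rows) by 3 variables (columns) */
orig = {50 31.2 13,
        42 28.5 17,
        61 35.0  9,
        55 30.1 14,
        47 26.7 12};

nSim = 8;                          /* hypothetical size; the text uses 240, 140, and 620 */
A = j(nSim, ncol(orig), .);        /* matrix that will hold the sampled data */

do k = 1 to ncol(orig);
   /* draw nSim values with replacement from column k only */
   A[,k] = colvec( sample(orig[,k], nSim, "Replace") );
end;

print A;
quit;
```

Sampling each column on its own keeps every marginal distribution close to the original but discards the relationships between variables, which is why a step that restores the rank correlations is applied afterward.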
Program 3.1: Iman-Conover Method to Create a Simulated REFLECTIONS Data Set
/* Use Iman-Conover method to generate MV data with known marginals
   and known rank correlation. */
start ImanConoverTransform(Y, C);
   X = Y;
   N = nrow(X);
   R = J(N, ncol(X));
   /* compute scores of each column */
   do i = 1 to ncol(X);
      h = quantile("Normal", rank(X[,i])/(N+1));
      R[,i] = h;
   end;
   /* these matrices are transposes of those in Iman & Conover */
   Q = root(corr(R));
   P = root(C);
   S = solve(Q,P);   /* same as S = inv(Q)*P */
   M = R*S;          /* M has rank correlation close to target C */
   /* reorder columns of X to have same ranks as M.
      In Iman-Conover (1982), the matrix is called R_B. */
   do i = 1 to ncol(M);
      rank = rank(M[,i]);
      tmp = X[,i];
      call sort(tmp);
      X[,i] = tmp[rank];
   end;
   return( X );
finish;

X = ImanConoverTransform(A, C);
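The matrix algebra in the middle of Program 3.1 can be summarized in a short derivation. It is a sketch that assumes ROOT returns the upper-triangular Cholesky factor U with U'U equal to its argument, and that every column of the score matrix R contains the same set of normal scores, so all columns share one variance.

```latex
\begin{aligned}
Q^{\top} Q &= \operatorname{corr}(R), \qquad P^{\top} P = C, \\
S &= Q^{-1} P, \qquad M = R S, \\
\operatorname{corr}(M) &= S^{\top} \operatorname{corr}(R)\, S
  = P^{\top} Q^{-\top} \left( Q^{\top} Q \right) Q^{-1} P
  = P^{\top} P = C .
\end{aligned}
```

Under these assumptions the Pearson correlation of M equals C. The final loop then reorders each column of X to have the same ranks as the corresponding column of M, so the rank correlation of X is close to, but generally not exactly, C.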
The three cohort-specific simulated matrices (X) were concatenated, and dropout and missing data were then imposed at random to reflect the amount of dropout and missingness observed in the actual REFLECTIONS data. Finally, the structure of the simulated data was converted from horizontal back to vertical.
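The concatenation and random missingness steps can be illustrated with a small SAS/IML sketch. Everything in it is hypothetical: the toy cohort matrices, the seed, the missingness rate `pMiss`, and the choice of column. It also assumes that values are set to missing completely at random, which the text does not specify in detail.

```sas
proc iml;
call randseed(4321);                /* arbitrary seed (assumption) */

/* Hypothetical stand-ins for the three cohort-specific simulated matrices */
Xopioid = {1 50 13,
           1 42 17};
Xnn     = {2 61  9};
Xother  = {3 55 14,
           3 47 12,
           3 39 16};

/* vertical concatenation stacks the cohorts into one matrix */
X = Xopioid // Xnn // Xother;

pMiss = 0.15;                       /* hypothetical missingness rate */
u = j(nrow(X), 1, .);
call randgen(u, "Uniform");         /* one uniform draw per row */
idx = loc(u < pMiss);               /* rows selected to become missing */
if ncol(idx) > 0 then X[idx, 3] = .;   /* set column 3 to missing in those rows */

print X;
quit;
```

In practice the rate would be chosen per variable and per visit to match the proportions observed in the actual data.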
The distributions of the variables were almost identical for the real and simulated data, as displayed in Tables 3.3 and 3.4. This is expected because the Iman-Conover algorithm only rearranges the elements within each column of the data matrix, leaving each marginal distribution unchanged. The descriptive statistics for real and simulated data are presented below.
Table 3.3: Comparison of Actual and Simulated REFLECTIONS Data for One Observation per Patient Variables
Variable | Level | Statistic | Real | Simulated |
All | | N | 1575 | 1000 |
Cohort | NN opioid | ColPctN | 13.65 | 14.00 |
 | opioid | ColPctN | 24.00 | 24.00 |
 | other | ColPctN | 62.35 | 62.00 |
Gender | female | ColPctN | 94.54 | 93.20 |
 | male | ColPctN | 5.46 | 6.80 |
Race | Caucasian | ColPctN | 83.62 | 82.30 |
 | Other | ColPctN | 16.38 | 17.70 |
Insurance | private/combination | ColPctN | 78.10 | 75.70 |
 | public/no insurance | ColPctN | 21.90 | 24.30 |
Doctor Specialty | Other Specialty | ColPctN | 17.65 | 17.60 |
 | Primary Care | ColPctN | 15.87 | 15.70 |
 | Rheumatology | ColPctN | 66.48 | 66.70 |
Exercise | No | ColPctN | 10.03 | 11.00 |
 | Yes | ColPctN | 89.97 | 89.00 |
Inpatient hospitalization in last 12 months | No | ColPctN | 89.84 | 90.70 |
 | Yes | ColPctN | 10.16 | 9.30 |
Other missed paid work to help your care in last 12 months | No | ColPctN | 77.71 | 79.60 |
 | Yes | ColPctN | 22.29 | 20.40 |
Have you used an unpaid caregiver in last 12 months | No | ColPctN | 62.86 | 60.50 |
 | Yes | ColPctN | 37.14 | 39.50 |
Have you hired a caregiver in last 12 months | No | ColPctN | 95.56 | 95.70 |
 | Yes | ColPctN | 4.44 | 4.30 |
Have you received disability income in last 12 months | No | ColPctN | 70.86 | 72.30 |
 | Yes | ColPctN | 29.14 | 27.70 |
Age in years | | NMiss | 0 | 0 |
 | | Mean | 50.45 | 50.12 |
 | | Std | 11.71 | 11.56 |
BMI at Baseline | | NMiss | 0 | 0 |
 | | Mean | 31.30 | 31.36 |
 | | Std | 7.34 | 7.01 |
Duration (in years) of symptoms | | NMiss | 216 | 133 |
 | | Mean | 10.28 | 10.03 |
 | | Std | 9.26 | 9.02 |
Time (in years) since initial Dx | | NMiss | 216 | 133 |
 | | Mean | 5.73 | 5.29 |
 | | Std | 6.27 | 6.05 |
Time (in years) since initial Trtmnt | | NMiss | 216 | 133 |
 | | Mean | 5.22 | 5.26 |
 | | Std | 6.02 | 6.18 |
PHQ 15 total score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 13.81 | 14.03 |
 | | Std | 4.64 | 4.79 |
FIQ Total Score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 54.54 | 54.56 |
 | | Std | 13.43 | 13.47 |
GAD7 total score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 10.81 | 10.64 |
 | | Std | 5.77 | 5.67 |
MFI Physical Fatigue at Baseline | | NMiss | 0 | 0 |
 | | Mean | 13.09 | 13.00 |
 | | Std | 2.28 | 2.17 |
MFI Mental Fatigue at Baseline | | NMiss | 0 | 0 |
 | | Mean | 11.51 | 11.52 |
 | | Std | 2.38 | 2.49 |
CPFQ Total Score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 26.51 | 26.62 |
 | | Std | 6.44 | 6.43 |
ISIX total score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 17.64 | 17.91 |
 | | Std | 5.97 | 5.74 |
SDS total score at Baseline | | NMiss | 0 | 0 |
 | | Mean | 18.27 | 18.28 |
 | | Std | 7.50 | 7.56 |
Table 3.4: Comparison of Actual and Simulated REFLECTIONS Data for Visit-wise Variables
Variable | Level | Statistic | Real | Simulated |
Visit | 1 | N | 1575 | 1000 |
Opioids use | No | ColPctN | 76.00 | 76.00 |
 | Yes | ColPctN | 24.00 | 24.00 |
Satisfaction with Overall Fibro Treatment | . (missing) | ColPctN | 5.33 | 6.10 |
 | 1 | ColPctN | 12.13 | 12.10 |
 | 2 | ColPctN | 20.95 | 19.70 |
 | 3 | ColPctN | 25.27 | 24.20 |
 | 4 | ColPctN | 22.86 | 24.30 |
 | 5 | ColPctN | 13.46 | 13.60 |
Satisfaction with Prescribed Medication | . (missing) | ColPctN | 10.03 | 9.80 |
 | 1 | ColPctN | 7.43 | 6.80 |
 | 2 | ColPctN | 15.81 | 15.60 |
 | 3 | ColPctN | 31.68 | 31.90 |
 | 4 | ColPctN | 23.75 | 24.30 |
 | 5 | ColPctN | 11.30 | 11.60 |
PHQ8 |