Real World Health Care Data Analysis. Uwe Siebert


	missed paid work to help your care in last 12 months
UnPdCaregiver	Have you used an unpaid caregiver in last 12 months
PdCaregiver	Have you hired a caregiver in last 12 months
Disability	Have you received disability income in last 12 months
SymDur	Duration (in years) of symptoms
DxDur	Time (in years) since initial Dx
TrtDur	Time (in years) since initial Trtmnt
SatisfCare_B	Satisfaction with Overall Fibro Treatment over past month
BPIPain_B	BPI Pain score at Baseline
BPIInterf_B	BPI Interference score at Baseline
PHQ8_B	PHQ8 total score at Baseline
PhysicalSymp_B	PHQ 15 total score at Baseline
FIQ_B	FIQ Total Score at Baseline
GAD7_B	GAD7 total score at Baseline
MFIpf_B	MFI Physical Fatigue at Baseline
MFImf_B	MFI Mental Fatigue at Baseline
CPFQ_B	CPFQ Total Score at Baseline
ISIX_B	ISIX total score at Baseline
SDS_B	SDS total score at Baseline
BPIPain_LOCF	BPI Pain score LOCF
BPIInterf_LOCF	BPI Interference score LOCF

3.5.2 Simulated PCI Data

The primary objective in simulating a new PCI data set from the observational data was to produce a larger data set that allows us to illustrate more effectively the unsupervised, nonparametric Local Control alternative to conventional propensity score stratification (Chapter 7) and machine learning methods (Chapter 15). Starting from the observational data on 996 patients who received their initial PCI at Ohio Heart Health, Lindner Center, Christ Hospital, Cincinnati (Kereiakes et al. 2000), we generated this much larger data set via plasmode simulation. The simulated data set contains 11 variables on 15,487 patients with no missing values and is referred to as the PCI15K simulated data set. The key variables in the data set are described in Table 3.6. The treatment cohort for later analyses is represented by the variable THIN and the outcomes by SURV6MO (binary) and CARDCOST (continuous). Because the process for generating simulated data was described in detail for the REFLECTIONS example, only a brief summary and a listing of the final simulated data set variables are provided for the PCI15K data set.

Table 3.6: PCI Simulated Data Set Variables

Variable Name	Variable Label
patid	Patient ID number: 1 to 15487
surv6mo	Binary PCI Survival variable: 1 => survival for at least six months following PCI, 0 => survival for less than six months
cardcost	Cardiac related costs incurred within six months of patient’s initial PCI; numerical values in 1998 dollars; costs were truncated by death for the 404 patients with surv6mo = 0
thin	Numeric treatment selection indicator: thin = 0 implies usual PCI care alone; thin = 1 implies usual PCI care augmented by either planned or rescue treatment with the new blood thinning agent
stent	Coronary stent deployment; numeric, with 1 meaning YES and 0 meaning NO
height	Height in centimeters; numeric integer from 133 to 198
female	Female gender; numeric, with 1 meaning YES and 0 meaning NO
diabetic	Diabetes mellitus diagnosis; numeric, with 1 meaning YES and 0 meaning NO
acutemi	Acute myocardial infarction within the previous 7 days; numeric, with 1 meaning YES and 0 meaning NO
ejfract	Left ejection fraction; numeric value from 17 percent to 77 percent
ves1proc	Number of vessels involved in the patient’s initial PCI procedure; numeric integer from 0 to 5
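The ranges listed in Table 3.6 can be turned into a simple consistency check before any analysis is run. The following is a minimal Python sketch, not the book's SAS code: the name PCI15K_SCHEMA, the function check_record, and the example patient record are hypothetical and exist only for illustration. The bounds themselves are taken from Table 3.6; cardcost is left unbounded because the table states no range for it.

```python
# Hypothetical schema built from Table 3.6: variable -> (minimum, maximum).
# None means the table does not state a bound for that side.
PCI15K_SCHEMA = {
    "patid": (1, 15487),
    "surv6mo": (0, 1),
    "cardcost": (None, None),
    "thin": (0, 1),
    "stent": (0, 1),
    "height": (133, 198),
    "female": (0, 1),
    "diabetic": (0, 1),
    "acutemi": (0, 1),
    "ejfract": (17, 77),
    "ves1proc": (0, 5),
}


def check_record(record):
    """Return a list of problems found in one patient record (a dict)."""
    problems = []
    for name, (low, high) in PCI15K_SCHEMA.items():
        value = record.get(name)
        if value is None:
            # The simulated data set is described as having no missing values.
            problems.append(f"{name}: missing")
            continue
        if low is not None and value < low:
            problems.append(f"{name}: {value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}: {value} is above the maximum {high}")
    return problems


# Made-up record for illustration only; these are not values from PCI15K.
example = {
    "patid": 42, "surv6mo": 1, "cardcost": 12500.0, "thin": 1, "stent": 1,
    "height": 250, "female": 0, "diabetic": 0, "acutemi": 0,
    "ejfract": 55, "ves1proc": 1,
}
for problem in check_record(example):
    print(problem)
```

Here the only reported problem is the height, which lies above the 198 cm maximum given in the table.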

Tables 3.7 and 3.8 summarize the outcome data from the original Lindner data and from the simulated data, respectively. The two are similar, with narrower differences between treatment groups in the simulated data. In Chapters 7, 14, and 15, the PCI simulated data set is used for analysis and is named PCI15K.

Table 3.7: Lindner Study (Kereiakes et al. 2000)

	Patients	Number Surviving Six Months	Percent Surviving Six Months	Average Cardiac Related Cost
Trtm = 0	298	283	94.97%	$14,614
Trtm = 1	698	687	98.42%	$16,127

Table 3.8: PCI Blood Thinner Simulation

	Patients	Number Surviving Six Months	Percent Surviving Six Months	Average Cardiac Related Cost
Thin = 0	8476	8158	96.25%	$15,343
Thin = 1	7011	6925	98.77%	$15,643
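The comparison between Tables 3.7 and 3.8 can be reproduced with a few lines of arithmetic. The sketch below is in Python rather than SAS and uses only the counts and average costs printed in the two tables; the names tables, pct_surviving, and group_differences are illustrative assumptions.

```python
# Counts and average costs copied from Tables 3.7 and 3.8.
# Keys 0 and 1 are the untreated and treated groups (Trtm / Thin).
tables = {
    "Lindner study (Table 3.7)": {
        0: {"patients": 298, "survivors": 283, "avg_cost": 14614},
        1: {"patients": 698, "survivors": 687, "avg_cost": 16127},
    },
    "PCI15K simulation (Table 3.8)": {
        0: {"patients": 8476, "survivors": 8158, "avg_cost": 15343},
        1: {"patients": 7011, "survivors": 6925, "avg_cost": 15643},
    },
}


def pct_surviving(group):
    """Percent of a group surviving six months."""
    return 100.0 * group["survivors"] / group["patients"]


def group_differences(table):
    """Treated-minus-untreated differences: (survival in percentage points, cost in dollars)."""
    untreated, treated = table[0], table[1]
    survival_diff = pct_surviving(treated) - pct_surviving(untreated)
    cost_diff = treated["avg_cost"] - untreated["avg_cost"]
    return survival_diff, cost_diff


for name, table in tables.items():
    survival_diff, cost_diff = group_differences(table)
    print(f"{name}: survival difference {survival_diff:.2f} points, "
          f"cost difference ${cost_diff:,}")
```

The unadjusted survival advantage is roughly 3.5 percentage points in the original data and roughly 2.5 in the simulated data, while the cost difference shrinks from $1,513 to $300. The simulated counts also agree with Table 3.6: the two groups sum to 15,487 patients, of whom 404 did not survive six months.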

3.6 Summary

In this chapter, two observational studies were introduced: the REFLECTIONS one-year study of patients with fibromyalgia and the Lindner study of patients undergoing PCI. The concept of plasmode simulation, in which one builds a simulated data set that retains the same variables and correlation structure as the original data, was introduced and applied to the REFLECTIONS and Lindner data sets. SAS IML code for the application to the REFLECTIONS data was provided, and the simulated data were shown to retain the key characteristics of the original data. These two simulated data sets (simulated REFLECTIONS and PCI15K) are used throughout the remainder of the book to demonstrate the methods for real world data analysis presented in each chapter.
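To make the plasmode idea concrete, the following Python sketch shows one simple resampling-based variant. It is not the SAS IML code used in this chapter. Whole covariate rows are drawn with replacement from the original data, so the observed correlation structure carries over, and a new outcome is then generated for each row from a user-supplied model. The function plasmode_sample, the toy rows, and the outcome probabilities are all hypothetical and chosen only for illustration.

```python
import random


def plasmode_sample(original_rows, n_sim, outcome_model, seed=0):
    """Build a simulated data set of n_sim rows from observed rows.

    original_rows: list of dicts holding observed covariates and treatment
    outcome_model: hypothetical function mapping a row to P(outcome = 1)
    """
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_sim):
        # Copy a whole observed row so that covariates stay jointly realistic.
        row = dict(rng.choice(original_rows))
        # Draw a fresh binary outcome from the assumed outcome model.
        row["outcome"] = 1 if rng.random() < outcome_model(row) else 0
        simulated.append(row)
    return simulated


# Made-up toy rows; these are not values from the Lindner data.
toy_rows = [
    {"thin": 0, "female": 1, "ejfract": 45},
    {"thin": 1, "female": 0, "ejfract": 60},
    {"thin": 1, "female": 1, "ejfract": 35},
]


def toy_outcome_model(row):
    """Assumed survival probability, slightly higher under treatment."""
    return 0.95 + 0.03 * row["thin"]


simulated = plasmode_sample(toy_rows, n_sim=10, outcome_model=toy_outcome_model)
print(len(simulated), "simulated rows")
```

Because the outcome model is specified by the analyst, the true treatment effect in the simulated data is known, which is what makes such data useful for comparing analysis methods.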

References

Austin P (2008). Goodness-of-fit Diagnostics for the Propensity Score Model When Estimating Treatment Effects Using Covariate Adjustment With the Propensity Score. Pharmacoepi & Drug Safety 17: 1202-1217.

Conover WG and Iman RL (1976). Rank Transformations in Discriminant Analysis.

Franklin JM, Schneeweiss S, Polinski JM, Rassen J (2014). Plasmode simulation for the evaluation of pharmacoepidemiologic methods in complex healthcare databases. Comput Stat Data Anal 72:219-226.

Gadbury GL, Xiang Q, Yang L, Barnes S, Page GP, Allison DB (2008). Evaluating Statistical Methods Using Plasmode Data Sets in the Age of Massive Public Databases: An Illustration Using False Discovery Rates. PLoS Genet 4(6): e1000098.

Kereiakes DJ, Obenchain RL, Barber BL, Smith A, McDonald M, Broderick TM, Runyon JP, Shimshak TM, Schneider JF, Hattemer CH, Roth EM, Whang DD, Cocks DL, Abbottsmith CW (2000). Abciximab provides cost effective survival advantage in high volume interventional practice. American Heart J 140: 603-610.

Peng X, Robinson RL, Mease P, Kroenke K, Williams DA, Chen Y, Faries D, Wohlreich M, McCarberg B, Hann D (2015). Long-Term Evaluation of Opioid Treatment in Fibromyalgia. Clin J Pain 31: 7-13.

Robinson RL, Kroenke K, Mease P, Williams DA, Chen Y, D’Souza D, Wohlreich M, McCarberg B (2012). Burden of Illness and Treatment Patterns for Patients with Fibromyalgia. Pain Medicine 13:1366-1376.

Wicklin R (2013). Simulating Data with SAS®. Cary, NC: SAS Institute Inc.

Chapter 4: The Propensity Score

4.1 Introduction

4.2 Estimate Propensity Score

4.2.1 Selection of Covariates

4.2.2 Address Missing Covariates Values in Estimating Propensity Score

4.2.3 Selection of Propensity Score Estimation Model

A Priori Logistic Regression Model

Automatic Parametric Model Selection

Nonparametric Models

4.2.4
