Business Experiments with R. B. D. McCullough


in all respects except the number of newspapers per 1000 people. This sort of analysis leads into “the garden of forking paths,” a phrase used by the statistician Andrew Gelman (who specializes in causal inference) to describe the many decisions a researcher may make that can lead her, unknowingly, to spurious statistical conclusions.

For example, suppose an analyst working with the above data added and dropped variables until LN was insignificant and several other health‐related variables were all significant. She would have taken a trip through the garden of forking paths and come up with a useless model. The model would be useless because she tested many hypotheses on the same set of data, so her “results” are almost assuredly contaminated by false positives (i.e., type I errors): she thinks the coefficients are significant when they really are not.

To better illustrate this idea, suppose we have 10 covariates (independent variables) available for building a model of a particular dependent variable, and we may include anywhere from 1 to all 10 of them. For each variable there is a decision (a fork): include it or exclude it. There are therefore $2^{10} - 1 = 1023$ possible models to choose from. A researcher simply tries enough models, dropping and including variables, until the results are statistically significant (say, $p < 0.05$), at which point she freezes the model; the choice of variables is justified only after the variables have been included. Even a researcher who does not deliberately try all possible models will still make choices about including and excluding variables (“I thought X1 would be significant, but it wasn't, so I dropped it and tried X2”), which implies that her model is but one of many possible models that she just happened to select. Because of the garden of forking paths, a seemingly objective analysis is really quite subjective, and causality cannot be determined from subjective analyses.
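The forking-paths argument can be illustrated with a small simulation. The following is a minimal sketch in R, not an analysis from the text: the data are simulated pure noise, and the seed, the sample size `n`, and the names `X`, `y`, `subsets`, and `min_p` are all illustrative assumptions. It enumerates every non-empty subset of 10 covariates, fits a linear regression to each, and counts the models in which at least one covariate appears significant at the 5% level even though no covariate is related to the response.

```r
set.seed(1)   # arbitrary seed (assumption), for reproducibility only
n <- 100      # hypothetical sample size (assumption)
k <- 10       # ten candidate covariates, as in the text

# Pure noise: y is unrelated to every covariate by construction.
X <- as.data.frame(matrix(rnorm(n * k), nrow = n))
names(X) <- paste0("x", seq_len(k))
y <- rnorm(n)

# Each non-empty subset of the k covariates is one candidate model.
subsets <- unlist(
  lapply(seq_len(k), function(m) combn(k, m, simplify = FALSE)),
  recursive = FALSE
)
n_models <- length(subsets)   # 2^10 - 1 = 1023

# For each model, keep the smallest p-value among its slope coefficients.
min_p <- vapply(subsets, function(idx) {
  dat <- data.frame(y = y, X[, idx, drop = FALSE])
  fit <- lm(y ~ ., data = dat)
  min(summary(fit)$coefficients[-1, 4])
}, numeric(1))

# Models in which at least one pure-noise covariate looks "significant".
n_spurious <- sum(min_p < 0.05)
cat("Candidate models:", n_models, "\n")
cat("Models with a spurious p < 0.05:", n_spurious, "\n")
```

Any model counted in `n_spurious` is a false positive by construction, so a researcher who stops searching at the first “significant” specification would be reporting noise. The exact count depends on the seed and the sample size.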

The purpose of this example is to drive home the point that, in general, observational data simply are not up to the task of answering causal questions. In this book, we focus on an alternative approach to answering business questions, which is to conduct experiments.

We do not suggest that observational studies have no valid uses. To the contrary, there are many situations when experiments are not possible, and in such cases, there is no alternative to the use of observational data:

Sometimes it is impossible to run an experiment. For example, it would be unethical to randomly assign people to smoke versus not smoke, so our understanding of the causal relationship between smoking and cancer was built on observational data. (However, it took a long time to convince everyone, since observational data are easy to question.)

If you want to build a new store, it is foolish to construct several stores in random locations to test hypotheses about where to locate stores.

Establishing causality is not always necessary, and documenting correlations is sometimes sufficient for the purpose at hand. In fact, the whole field of “predictive analytics” focuses on prediction problems, where causality is not important. For example, if we are predicting defaults on mortgages, very often we only need to know the probability that a person will default, not the causal factors that determine the probability of default; correlation is sufficient, and causation is not necessary.

The outcome of interest is sufficiently rare that running an experiment with a large enough number of trials is expensive. Perhaps you can only afford a sample size of 100, but the response rate is 2%; you're never going to get a good estimate of the response rate with such a comparatively small sample.

Each trial is so costly that even a small experiment is unaffordable.

The population is too small to support an experiment. There is no point in remodeling a sample of stores to decide whether to remodel the entire population if the population is only 10 stores.

Hypothesis generation is an excellent use of observational data. The researcher explores the observational data looking for interesting ideas that might merit a follow‐up experiment.

Observational data can be used to shed light on causal questions, but the statistical machinery necessary to do so is very sophisticated and still not as good as actually running an experiment. These sophisticated analyses are usually performed when an experiment is impossible, not in lieu of an experiment. A standard reference for this type of analysis is Rosenbaum (2010). Using observational data to answer causal questions is hard, so we try to use experimental data, which is comparatively easy.
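The rare-outcome case in the list above (a sample of 100 with a 2% response rate) can be made concrete with a little arithmetic. The sketch below is illustrative only: the sample size and response rate come from the text, while the variable names and the choice of an exact binomial interval (via `binom.test`) are assumptions made for this example.

```r
n <- 100    # affordable sample size, from the text
p <- 0.02   # true response rate, from the text

expected_responders <- n * p       # only 2 responders expected
se <- sqrt(p * (1 - p) / n)        # standard error of the estimated rate: 0.014
relative_se <- se / p              # 0.7, i.e. 70% of the quantity being estimated

# Exact (Clopper-Pearson) 95% interval if exactly 2 of the 100 respond.
ci <- binom.test(x = 2, n = n)$conf.int

# Probability that the sample contains no responders at all.
p_zero <- dbinom(0, size = n, prob = p)

cat("Expected responders:", expected_responders, "\n")
cat("Standard error:", se, "(relative:", relative_se, ")\n")
cat("95% interval for the response rate:", ci[1], "to", ci[2], "\n")
cat("P(no responders):", p_zero, "\n")
```

With a standard error that is 70% of the rate itself, and a better than one-in-ten chance of seeing no responders at all, a sample of 100 cannot pin down a 2% response rate with any useful precision.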

Notwithstanding the above, people very commonly use observational data to make (mistaken) causal assertions outside the narrow range in which this can legitimately be done. As a result, “causal” findings from observational data are often later contradicted by experiments. Many instances can be found in the article by Young and Karr (2011) and the references therein. With respect to establishing causality, unless you are a very skilled statistician, analyzing observational data can only provide what researchers call hypothesis‐generating evidence. In other words, analyzing observational data can give you good ideas for experiments to conduct, but it can't tell you anything about causality.

It is important for the analyst to know whether she has an “umbrella problem” or a “rain dance problem.” If all she wants to know is whether or not she should carry an umbrella, then she has a pure prediction problem and causal questions are of secondary importance; she only needs to know whether the probability of rain is high or low. On the other hand, if there has been a long drought and she wants to end it, prediction is of little value: causal questions are of primary importance. If she wants to induce rainfall, she needs to know what variables cause rain and then try to manipulate those variables. We recall the experience of one analyst, trained in the field of predictive analytics, who had been given some observational data on donations from specific individuals and asked for insight as to whether or not attending fundraisers increased donations. She dropped the variable “number of fundraisers the person attended” because it had no predictive value. She just couldn't understand that her boss had given her a rain dance problem, not an umbrella problem.

Exercises

1.1.1 Find an example of an observational study. Answer the following questions: (i) What makes this an observational rather than an experimental study? (Be specific.) (ii) What is the purpose of the study? (iii) What is the primary variable of interest?

1.1.2 For each of the following, indicate whether the data are observational or experimental, and defend your answer.
(i) A broadcaster moves a popular television show from Tuesday to Thursday, and its viewership increases.
(ii) A psychologist wants to know how often students take a break from studying. To do this, he installs cameras in the library's reading room for one day. He notes that students take more breaks in the evening.
(iii) A sports manufacturer wonders if his quick‐dry exercise shirts dry more quickly than regular shirts. He finds some basketball players playing a game in the gym. He gives one team quick‐dry shirts, and the other team gets regular 100% cotton shirts.
(iv) A child psychologist asks several parents whether their children play violent video games. He also asks how many times a week their children display violent behavior. He finds that children who play violent video games display more violent behavior.

1.1.3 Give two situations where experiments can't be conducted.

1.2 Case: Credit Card Defaults

You work for a credit card company, and you want to figure out which customers might default. In the credit.csv dataset are 30 000 observations on six variables: credit limit (how much can be charged on the credit card), sex of the cardholder, education level of the cardholder (high school, undergrad, grad, other), whether the cardholder is married (single, married, other), the age of the cardholder in years, and whether or not the cardholder defaulted (1 = default, 0 = non‐default).

This problem raises the two fundamental questions facing every credit issuer: whether to grant credit to each potential customer and, if so, how much.
