Applied Univariate, Bivariate, and Multivariate Statistics Using Python. Daniel J. Denis

in our case will state that the number of individuals surviving in the control group will be the same as that in the experimental group after 30 days from the start of the experiment. Key to this is understanding that the null hypothesis is about population parameters, not sample statistics. If the drug is not working, we would expect, under the most ideal of conditions, the same survival rates in each condition in the population under the null hypothesis. The null hypothesis in this case happens to specify a difference of zero; however, it should be noted that the null hypothesis does not always need to be about zero effect. The “null” in “null hypothesis” means it is the hypothesis to be nullified by the statistical test. Having set up our null, we then hypothesize a statement contrary to the null, known as the alternative hypothesis. The alternative hypothesis is generally of two types. The first is the statistical alternative hypothesis, which is essentially and quite simply a statement of the complement to the null hypothesis. That is, it is a statement of “not the null.” Hence, if the null hypothesis is rejected, the statistical alternative hypothesis is automatically inferred. For our data, suppose after 30 days, the number of people surviving in the experimental group is equal to 50, while the number of people surviving in the control group is 20. Under the null hypothesis, we would have expected these survival rates to be equal. However, we have observed a difference in our sample. Since it is merely sample data, we are not really interested in this particular result specifically. Rather, we are interested in answering the following question:

What is the probability of observing a difference such as we have observed in our sample if the true difference in the population is equal to 0?

The above is the key question that repeats itself in one form or another in virtually every evaluation of a null hypothesis. That is, state a value for a parameter, then evaluate the probability of the sample result obtained in light of the null hypothesis. You might see where the argument goes from here. If the probability of the sample result is relatively high under the null, then we have no reason to reject the null hypothesis in favor of the statistical alternative. However, if the probability of the sample result is low under the null, then we take this as evidence that the null hypothesis may be false. We do not know if it is false, but we reject it because of the implausibility of the data in light of it. A rejection of the null hypothesis does not necessarily mean the null is false. What it does mean is that we will act as though it is false or potentially make scientific decisions based on its presumed falsity. Whether it is actually false usually remains unknown.
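The key question can be made concrete with a short calculation. What follows is only a sketch under stated assumptions, not the method used in this book: the group sizes are not given in the text, so 100 individuals per group is assumed purely for illustration, and a one-sided Fisher-style (hypergeometric) calculation is used as one of several reasonable ways to obtain the probability. The function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(observed, group_size, total_survivors, total_n):
    """P(X >= observed), where X is the number of survivors that land in
    the experimental group if survival is unrelated to group membership
    (hypergeometric distribution)."""
    all_splits = comb(total_n, group_size)
    largest_possible = min(group_size, total_survivors)
    favorable = sum(
        comb(total_survivors, k)
        * comb(total_n - total_survivors, group_size - k)
        for k in range(observed, largest_possible + 1)
    )
    return favorable / all_splits


# ASSUMPTION: 100 individuals per group (not stated in the text).
# Observed: 50 survivors in the experimental group, 20 in the control group.
p_value = prob_at_least(observed=50, group_size=100,
                        total_survivors=50 + 20, total_n=200)
print(f"P(50 or more experimental survivors | no true difference) = {p_value:.2e}")
```

Under these assumed group sizes the probability comes out very small, which is the sort of result that would lead us to reject the null. With different group sizes the number changes, but the logic of the calculation does not.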

For our example, if the number of people surviving in each group in our sample were equal to 50 spot on, then we definitely would not have evidence to reject the null hypothesis. Why not? Because a sample result of 50 and 50 lines up exactly with what we would expect under the null hypothesis. That is, it lines up perfectly with expectation under the null model. However, if the numbers turned up as they did earlier, 50 vs. 20, and we found the probability of this result to be rather small under the null, then it could be taken as evidence to possibly reject the null hypothesis and infer the alternative that the survival rates in each group are not the same. This is where the substantive or research alternative hypothesis comes in. Why were the survival rates found to be different? For our example, this is an easy one. If we did our experiment properly, it is hopefully due to the treatment. However, had we not performed a rigorous experimental design, then concluding the substantive or research hypothesis becomes much more difficult. That is, simply because you are able to reject a null hypothesis does not in itself lend credit to the substantive alternative hypothesis of your wishes and dreams. The substantive alternative hypothesis should naturally drop out or be a natural consequence of the rigorous approach and controls implemented for the experiment. If it does not, then drawing a substantive conclusion becomes much more difficult, if not impossible. This is one reason why drawing conclusions from correlational research can be exceedingly difficult, if not impossible. If you do not have a bullet-proof experimental design, then logically it becomes nearly impossible to know why the null was rejected. Even if you have a strong experimental design, such conclusions are difficult under the best of circumstances, so if you do not have this level of rigor, you are in hot water when it comes to drawing strong conclusions.
Many published research papers feature very little scientific support for purported scientific claims simply based on a rejection of a null hypothesis. This is due to many researchers not understanding or appreciating what a rejection of the null means (and what it does not mean). As we will discuss later in the book, rejecting a null hypothesis is, usually, and by itself, no big deal at all.

The goal of scientific research on a statistical level is generally to learn about population parameters. Since populations are usually quite large, scientists typically study statistics based on samples and make inferences toward the population based on these samples. Null hypothesis significance testing (NHST) involves putting forth a null hypothesis and then evaluating the probability of obtained sample evidence in light of that null. If the probability of such data occurring is relatively low under the null hypothesis, this provides evidence against the null and an inference toward the statistical alternative hypothesis. The substantive alternative hypothesis is the research reason for why the null was rejected and typically is known or hypothesized beforehand by the nature of the research design. If the research design is poor, it can prove exceedingly difficult or impossible to infer the correct research alternative. Experimental designs are usually preferred for this (and many other) reasons.

1.2 Statistics and Decision-Making

We have discussed thus far that a null hypothesis is typically rejected when the probability of observed data in the sample is relatively small under the posited null. For instance, with a simple example of 100 flips of a presumably fair coin, we would certainly reject the null hypothesis of fairness if we observed, for example, 98 heads. That is, the probability of observing 98 heads on 100 flips of a fair coin is very small. However, when we reject the null, we could be wrong. That is, rejecting fairness could be a mistake. Now, there is a very important distinction to make here. Rejecting the null hypothesis itself in this situation is likely to be a good decision. We have every reason to reject it based on the number of heads out of 100 flips. Obtaining 98 heads is more than enough statistical evidence in the sample to reject the null. However, as mentioned, a rejection of the null hypothesis does not necessarily mean the null hypothesis is false. All we have done is reject it. In other words, it is entirely possible that the coin is fair, but we simply observed an unlikely result. This is the problem with statistical inference: there is always a chance of being wrong in our decision to reject a null hypothesis and infer an alternative. That does not mean the rejection itself was wrong. It means simply that our decision may not turn out to be in our favor. In other words, we may not get a “lucky outcome.” We have to live with that risk of being wrong if we are to make virtually any decisions (such as leaving the house and crossing the street or going shopping during a pandemic).
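To see just how small the probability in the coin example is, the binomial tail can be computed directly. The sketch below is illustrative only. The helper name `binom_upper_tail` is hypothetical, and the rejection rule in the second half (reject fairness at 61 or more heads, or 39 or fewer) is an assumed cutoff chosen for demonstration, not one given in the text.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k), where X counts heads in n independent flips
    and each flip lands heads with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Probability of 98 or more heads in 100 flips of a fair coin.
p_98 = binom_upper_tail(98, 100)
print(f"P(at least 98 heads | fair coin) = {p_98:.2e}")

# ASSUMED rule for illustration: reject fairness if heads >= 61 or heads <= 39.
# By symmetry of the fair coin, the lower tail equals the upper tail.
false_rejection = 2 * binom_upper_tail(61, 100)
print(f"P(rule rejects | coin is actually fair) = {false_rejection:.3f}")
```

The first number is vanishingly small, which is why rejecting fairness after 98 heads is a sound decision. The second number is small but not zero: even a sensible rejection rule will occasionally reject a coin that is in fact fair, which is the risk of being wrong described above.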

The above is an extremely important distinction and cannot be emphasized enough. Many times, researchers (and others, especially media) evaluate decisions based not on the logic that went into them, but rather on outcomes. This is a philosophically faulty way of assessing the goodness of a decision, however. The goodness of the decision should depend on whether it was made using solid and efficient decision-making principles that a rational agent would apply under similar circumstances, not on whether the outcome happened to accord with what we hoped to see. Again, sometimes we experience lucky outcomes, sometimes we do not, even when our decision-making criteria are “spot on” in both cases. This is what the art of decision-making is all about. The following are some examples of popular decision-making events and the actual outcome of the given decision:

The Iraq war beginning in 2003. Politics aside, a motivator for invasion was presumably whether or not Saddam Hussein possessed weapons of mass destruction. We know now that he apparently did not, and hence many have argued that the invasion
