Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen
is the sample variance
A test based on a fixed significance level α, say α = 0.05, has the disadvantage that it gives the decision maker no indication of whether the observed value of the test statistic is barely inside the rejection region or far into it. The p-value can be used instead to indicate how strong the evidence against the null hypothesis H0 is. The p-value is the probability that the test statistic takes a value at least as extreme as the observed value when the null hypothesis is true. The smaller the p-value, the stronger the evidence for rejecting H0. If the p-value is smaller than α, H0 is rejected at significance level α. The p-value based on the t statistic in (3.18) can be found as
$$p\text{-value} = 2\Pr\left(T(n-1) > |t|\right),$$
where T(n − 1) denotes a random variable following a t distribution with n − 1 degrees of freedom.
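As a rough illustration of the p-value calculation, the sketch below computes the t statistic and its two-sided p-value for a small sample. It assumes SciPy is available and that the t statistic has the usual one-sample form; the data values, `mu0`, and the function name `t_test_p_value` are hypothetical.

```python
import math
from statistics import mean, stdev

from scipy import stats


def t_test_p_value(x, mu0):
    """Return (t, p) for the two-sided test of H0: mu = mu0."""
    n = len(x)
    xbar = mean(x)
    s = stdev(x)  # sample standard deviation (divisor n - 1)
    t = (xbar - mu0) / (s / math.sqrt(n))
    # Two-sided p-value: 2 * Pr(T(n-1) > |t|)
    p = 2.0 * stats.t.sf(abs(t), df=n - 1)
    return t, p


x = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1]  # made-up data
t, p = t_test_p_value(x, mu0=5.0)
print(f"t = {t:.3f}, p-value = {p:.4f}")
```

The survival function `stats.t.sf` gives the upper-tail probability directly, which avoids the loss of precision in `1 - cdf` when the p-value is very small.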
We can define the 100(1 − α)% confidence interval for μ as
$$\bar{x} \pm t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}}.$$
It is easy to see that the null hypothesis H0 is not rejected at level α if and only if μ0 is in the 100(1 − α)% confidence interval for μ. So the confidence interval consists of all those “plausible” values of μ0 that would not be rejected by the test of H0 at level α.
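The duality between the test and the confidence interval can be checked numerically. The sketch below is only an illustration under assumed inputs: the data and the candidate values of μ0 are made up, the helper names are hypothetical, and SciPy is assumed to be installed.

```python
import math
from statistics import mean, stdev

from scipy import stats


def t_confidence_interval(x, alpha=0.05):
    """Return (lower, upper) of the 100(1 - alpha)% confidence interval for mu."""
    n = len(x)
    xbar, s = mean(x), stdev(x)
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n - 1)  # upper alpha/2 percentile
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def rejects(x, mu0, alpha=0.05):
    """True if the two-sided t test rejects H0: mu = mu0 at level alpha."""
    n = len(x)
    t = (mean(x) - mu0) / (stdev(x) / math.sqrt(n))
    return bool(abs(t) > stats.t.ppf(1.0 - alpha / 2.0, df=n - 1))


x = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1]  # made-up data
lower, upper = t_confidence_interval(x)
for mu0 in (4.8, 5.0, 5.2, 5.4):
    inside = lower <= mu0 <= upper
    print(f"mu0 = {mu0}: in interval = {inside}, rejected = {rejects(x, mu0)}")
```

For every candidate value, "in interval" and "rejected" should come out as opposites, which is the statement made in the text.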
To see the link to the test statistic used for a multivariate normal distribution, we consider an equivalent rule to reject H0, which is based on the square of the t statistic:
$$t^2 = \frac{(\bar{x}-\mu_0)^2}{s^2/n} = n(\bar{x}-\mu_0)(s^2)^{-1}(\bar{x}-\mu_0). \tag{3.19}$$
We reject H0 at significance level α if $t^2 > (t_{\alpha/2,n-1})^2$.
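One way to see why the squared rule loses nothing, and why an F percentile appears in the multivariate test, is that the square of a t random variable with n − 1 degrees of freedom follows an F distribution with 1 and n − 1 degrees of freedom. Writing $F_{\alpha,1,n-1}$ for the upper (100α)th percentile of that distribution (notation assumed here for illustration), the two rejection rules coincide:

```latex
\[
|t| > t_{\alpha/2,\,n-1}
\;\Longleftrightarrow\;
t^2 > \left(t_{\alpha/2,\,n-1}\right)^2 = F_{\alpha,\,1,\,n-1}.
\]
```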
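For a multivariate normal distribution with unknown mean vector μ and unknown covariance matrix Σ, we consider testing the following hypotheses:
$$H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0 \quad \text{versus} \quad H_1: \boldsymbol{\mu} \neq \boldsymbol{\mu}_0.$$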
Let X1, X2,…, Xn denote a random sample from a multivariate normal population. The test statistic in (3.19) generalizes naturally to the multivariate case as
$$T^2 = n(\bar{\mathbf{X}}-\boldsymbol{\mu}_0)^T \mathbf{S}^{-1} (\bar{\mathbf{X}}-\boldsymbol{\mu}_0),$$
where X̄ and S are the sample mean vector and the sample covariance matrix of X1, X2,…, Xn. This statistic is called Hotelling’s $T^2$ in honor of Harold Hotelling, who first obtained its distribution. Assuming H0 is true, the $T^2$ statistic is distributed as
$$T^2 \sim \frac{(n-1)p}{n-p}\,F_{p,n-p},$$
where $F_{p,n-p}$ denotes the F-distribution with p and n − p degrees of freedom. Based on this distributional result, we reject H0 at significance level α if
$$T^2 > \frac{(n-1)p}{n-p}\,F_{\alpha,p,n-p},$$
where $F_{\alpha,p,n-p}$ denotes the upper (100α)th percentile of the F-distribution with p and n − p degrees of freedom. The p-value of the test based on the $T^2$ statistic is
$$p\text{-value} = \Pr\left(F(p,n-p) > \frac{n-p}{(n-1)p}\,T^2\right),$$
where F(p,n − p) denotes a random variable distributed as $F_{p,n-p}$.
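The Hotelling test described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes NumPy and SciPy are available, that n > p ≥ 2 so the sample covariance matrix is invertible, and it uses simulated data; the function name `hotelling_t2_test` is hypothetical.

```python
import numpy as np
from scipy import stats


def hotelling_t2_test(X, mu0):
    """One-sample Hotelling T^2 test of H0: mu = mu0.

    X is an (n, p) array with one observation per row (n > p >= 2).
    Returns (T2, F statistic, p-value).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    diff = X.mean(axis=0) - np.asarray(mu0, dtype=float)
    S = np.cov(X, rowvar=False)  # sample covariance matrix (divisor n - 1)
    # T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0); solve() avoids an explicit inverse
    T2 = n * diff @ np.linalg.solve(S, diff)
    # Rescale so that the statistic follows F(p, n - p) under H0
    F_stat = (n - p) / ((n - 1) * p) * T2
    p_value = stats.f.sf(F_stat, p, n - p)
    return T2, F_stat, p_value


rng = np.random.default_rng(0)  # simulated data, for illustration only
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.5], [0.5, 1.0]], size=30)
T2, F_stat, p_value = hotelling_t2_test(X, mu0=[0.0, 0.0])
print(f"T2 = {T2:.3f}, F = {F_stat:.3f}, p-value = {p_value:.4f}")
```

Rejecting when the p-value falls below α is equivalent to comparing $T^2$ with the scaled F percentile, so either form of the rule can be used.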
The $T^2$ statistic can also be written as