Evidence-Based Statistics. Peter M. B. Cahusac

use of verbal labels such as ‘large’ or ‘small’ can sometimes be misleading [33]. What may be considered a large effect in one area (e.g. epidemiology) may be considered small in another (e.g. a drug treatment for hypertension). A popular standardized measure of effect size for a difference in means is d. This is actually Hedges' standardized statistic using the sample standard deviation SD rather than Cohen's using the population parameter σ.⁴

      $$ d = \frac{\bar{x}_1 - \bar{x}_2}{SD} \tag{1.1} $$

      The relative effect sizes using d can be described as:

d      Description
0.2    Small
0.5    Medium
0.8    Large
1.3    Very large
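
      As a rough illustration of the standardized difference described above (a difference in means divided by the sample standard deviation), the following Python sketch computes d for two groups. The function name `standardized_d` and the use of a pooled standard deviation are assumptions made for this example; the text only states that the sample SD is used.

```python
import math
import statistics

def standardized_d(group1, group2):
    """Difference in means divided by a pooled sample SD.

    Assumption: the two groups share a common SD, estimated by pooling
    the two sample variances (n - 1 denominators).
    """
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance, n - 1 denominator
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Hypothetical data: means 4 and 3, both sample SDs equal to 2
d = standardized_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(f"d = {d:.2f}")  # d = 0.50
```

      With these made-up values d comes out at 0.5, which the table above would describe as a medium effect.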

      A more general measure is provided by the correlation coefficient r. However, the transform between r and d is not linear, since r is restricted to the range −1 to 1 while d varies between negative and positive infinity. For example, a medium effect r of 0.3 corresponds to a d of 0.63 (on the large side), and a large r of 0.5 corresponds to a very large effect in d of 1.15. Using d allows us to relate more naturally to the measurements that are made.
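
      The quoted values can be checked with the usual conversion between r and d, d = 2r / √(1 − r²). The text does not state which formula it uses; this one assumes two groups of equal size, and the function names below are chosen only for this sketch.

```python
import math

def r_to_d(r):
    """Convert a correlation r to a standardized difference d.

    Assumption: two groups of equal size, so d = 2r / sqrt(1 - r^2).
    """
    return 2 * r / math.sqrt(1 - r ** 2)

def d_to_r(d):
    """Inverse conversion under the same equal-group-size assumption."""
    return d / math.sqrt(d ** 2 + 4)

for r in (0.3, 0.5):
    print(f"r = {r:.1f} -> d = {r_to_d(r):.2f}")
# r = 0.3 -> d = 0.63
# r = 0.5 -> d = 1.15
```

      The denominator shrinks towards zero as r approaches ±1, which is why d grows without bound while r stays inside its fixed range.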

Graph depicting the effect size versus sample size for the 95% confidence intervals around means for two sets of data. For each interval, the same standard deviation is used and the same p value is obtained for the mean’s difference from 0.

      Giving decimal places during calculations is tricky. The decimal places given for values in the text are usually given to an accuracy that allows one to check formulae and equations, often given in stages. Occasionally, there will be mismatches with the final answer which will be based on the most accurate calculation possible. These can usually be checked from the raw data using Excel or R.

      The support S will generally be expressed to only one decimal place. The use of S is merely a guide to the strength of evidence. It is graded rather than thresholded.

      1 Choose a parameter value for the primary hypothesis H1: either a value of practical importance, the minimum value of importance, or the expected value. Otherwise use a medium effect size, e.g. d = ±0.5. Alternatively, use the MLE.

      2 Choose a secondary hypothesis H2 to compare with H1. Often this is the null hypothesis H0.

      3 Calculate the support: S12 in general, S10 when H2 is the null hypothesis H0, or SM when H1 is the MLE.

      4 Assess the relative evidence for the two hypotheses on the graded scale from −∞ to +∞.

      5 Always use likelihood intervals, typically for S-2 and S-3. Likelihood intervals are more flexible and may be more informative than examining S for particular hypotheses.

      6 If possible and convenient, plot the likelihood function.
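
      The steps above can be sketched in Python for the simplest case of a single mean. This is a minimal illustration, not the book's own calculation: it takes S to be the natural log of the likelihood ratio, uses a normal likelihood with the sample SD treated as known, and all function names and data values are hypothetical.

```python
import math

def support_normal_mean(xbar, sd, n, mu1, mu2):
    """Support S12 = ln L(mu1) - ln L(mu2) for a normal mean.

    Assumption: sd is treated as the known population SD.
    """
    se2 = sd ** 2 / n  # squared standard error of the mean
    return ((xbar - mu2) ** 2 - (xbar - mu1) ** 2) / (2 * se2)

def likelihood_interval(xbar, sd, n, m=2.0):
    """S-m interval: all means whose support relative to the MLE is >= -m."""
    half_width = math.sqrt(2 * m) * sd / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical data: sample mean 0.4, SD 1, n = 25 (so the MLE is d = 0.4)
xbar, sd, n = 0.4, 1.0, 25

# Steps 1-3: H1 is a medium effect (d = 0.5), H2 is the null (d = 0)
s10 = support_normal_mean(xbar, sd, n, mu1=0.5 * sd, mu2=0.0)
# Alternative for step 1: use the MLE as H1
s_m = support_normal_mean(xbar, sd, n, mu1=xbar, mu2=0.0)
print(f"S10 = {s10:.1f}, SM = {s_m:.1f}")  # S10 = 1.9, SM = 2.0

# Step 5: S-2 and S-3 likelihood intervals
for m in (2.0, 3.0):
    lo, hi = likelihood_interval(xbar, sd, n, m)
    print(f"S-{m:.0f} interval: {lo:.2f} to {hi:.2f}")
```

      Under these assumptions the S-2 interval is the sample mean plus or minus two standard errors. Positive S favours H1 over H2, negative S favours H2, and the magnitude is read on the graded scale rather than against a cut-off.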

A flow diagram illustrating the general procedure for calculating and assessing evidence. At the top, we start by defining the hypotheses of interest. The primary hypothesis H1 is specified by an effect size or by the sample statistic (maximum likelihood estimate [MLE]). The secondary hypothesis H2 specifies another value of interest, often the null hypothesis.

      1 Taper ML, Lele SR, editors. The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations. Chicago: University of Chicago Press; 2004.

      2 Pearson ES. ‘Student’ as statistician. Biometrika. 1939; 30(3/4):210–50.

      3 Edwards AWF. Likelihood. Baltimore: Johns Hopkins University Press; 1992.

      4 Royall RM. Statistical Evidence: A Likelihood Paradigm. London: Chapman & Hall; 1997.

      5 Hacking I. Logic of Statistical Inference. Cambridge: Cambridge University Press; 1965.

      6 Dienes Z. Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference. Basingstoke: Palgrave Macmillan; 2008.

      7 Baguley T. Serious