Statistics in Nutrition and Dietetics. Michael Nelson
course, we may just be interested in describing what is going on in physiological systems (what dietary factors are associated with low serum total cholesterol levels?) or in the population (are women aged 75 years and older at greater risk of osteoporosis‐related fracture of the hip if they have low levels of physical activity?). More often, we want to know if there is a causal relationship between these factors (does an increased level of physical activity protect against osteoporosis‐related hip fracture in women aged 75 and older?). Public health recommendations to improve nutrition and nutrition‐related outcomes need strong evidence of causality before they can be promoted to the general public. Confusion in the mind of the public is often caused by the media promoting a ‘miracle cure’ based on a single study (it makes good press but bad science). Food manufacturers are often guilty of using weak evidence of causality or vague terms about ‘healthiness’ to promote sales of their products.10
BOX 1.4 Bradford Hill hierarchy of causality
Strength of association | Is the evidence linking exposure and outcome strong? We shall see what we mean by ‘strong’ as we explore the different statistical tests used to evaluate associations. |
Consistency of association across studies | Are the same associations seen repeatedly in different groups or across different populations in different places and times? |
Specificity | Is there a specific link between exposure and outcome? |
Temporal association | Does A precede B? Evidence needs to show that cause (A) is followed by consequence (B). As we shall see, A and B may be associated in a cross‐sectional analysis of data, but unless a clear time‐sequence can be established, the evidence for causality is weak. |
Dose‐response | Does increased exposure result in increased likelihood of the outcome? If fruit and vegetable consumption is protective against heart disease, can it be shown that the more fruit and vegetables are eaten, the lower the risk of disease? |
Plausible mechanism and coherence | Is there a clear physiological explanation for the observed link between A and B? What is it in fruit and vegetables that affects the factors that determine risk of heart disease? Does the new evidence fit in with what is already known? If not, why not? Are there any animal models that support evidence in humans? |
Experimental evidence | Does experimental evidence based on intervention studies support the argument for causation? Is the experimental evidence consistent across studies? |
Analogy | Are there related exposures or conditions that offer insight into the observed association? |
We have seen earlier that the logic used to support notions of causality may be inductive or deductive. Whichever logical model is used, no single study in nutrition will provide conclusive evidence of the relationship between A and B. There is a hierarchy of evidence, first set out clearly by Bradford Hill [4, 5], which suggests that a clear picture of causality can only be built from multiple pieces of evidence (Box 1.4). Published over 50 years ago, these criteria have withstood the test of time [6].
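The dose‐response criterion in Box 1.4 can be illustrated with a small calculation. The sketch below assumes four hypothetical categories of fruit and vegetable intake, with invented counts of people and heart disease cases in each (none of these figures come from a real study). It computes the risk in each category and checks whether risk falls at every step as intake rises. The names `groups`, `risks_by_group` and `is_monotonic_decreasing` are illustrative only.

```python
# Hypothetical data: (intake category, number of people, heart disease cases).
# The counts are invented purely to illustrate the calculation.
groups = [
    ("<1 portion/day", 1000, 60),
    ("1-2 portions/day", 1000, 48),
    ("3-4 portions/day", 1000, 39),
    ("5+ portions/day", 1000, 30),
]


def risks_by_group(rows):
    """Return (label, risk) pairs, where risk = cases / number of people."""
    return [(label, cases / total) for label, total, cases in rows]


def is_monotonic_decreasing(values):
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


risks = risks_by_group(groups)
for label, risk in risks:
    print(f"{label}: risk = {risk:.3f}")
print("Risk falls at every step:", is_monotonic_decreasing([r for _, r in risks]))
```

A steadily falling risk of this kind would be consistent with a dose‐response relationship, but it is only one of the criteria in Box 1.4; a formal test for trend and evidence on the other criteria would still be needed.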
1.6.3 Types of Study Design
The summary below provides a brief overview of some of the types of study designs available. There are many more designs, of course, that address complex issues of multiple factors influencing multiple outcomes, with corresponding statistical analysis, but these are dealt with in more advanced textbooks on research design and analysis. The list below repeats some of the material covered in Section 1.2 on logic, but goes into more detail in relation to study design.
The principal aim is to conduct studies that are free from bias and based on relevant measures of exposure and outcome so that the hypothesis can be tested effectively.
Observational Studies
Observational studies usually focus on the characteristics or distribution of phenomena in the population that you are investigating. Such studies may analyze data at one point in time or explore time trends in the relevant variables. They may be based on observations of individuals within a sample, or they may consider the relationship between variables observed in groups of subjects (for example, differences in diet and disease rate between countries). They are often the basis for hypothesis generating, rather than hypothesis testing.
Case studies are reports of potentially generalizable or particularly interesting phenomena. Individually, a case study cannot provide evidence that will help you to establish the truth of your hypothesis. Consistent findings across several case studies may provide support for an idea, but cannot be used in themselves to test a hypothesis.
Descriptive studies are careful analyses of the distribution of phenomena within or between groups, or a study of relationships existing between two or more variables within a sample. Descriptive studies are often well suited to qualitative examination of a problem (e.g. examining the coping strategies used by families on low income to ensure an adequate diet for their children when other demands [like the gas bill] are competing for limited cash). But of course they also provide descriptions of quantitative observations for single variables (e.g. how much money is spent on food, fuel, etc. in families on low income), or multiple variables (e.g. how money is spent on food in relation to total income or family size). Many epidemiological studies fall into this category (see below). They are useful for understanding the possible links between phenomena, but cannot in themselves demonstrate cause and effect.
Diagnostic studies establish the extent of variation in disease states. They are helpful when selecting subjects for a study and deciding on which endpoints may be relevant when designing a study to explore cause and effect.
Experimental and Intervention Studies
These studies are designed to create differences in exposure to a factor which is believed to influence a particular outcome, for example, the effect of consuming oat bran on serum cholesterol levels, or the effect on birth weight of introducing an energy supplement during pregnancy. The aim is usually to analyze the differences in outcome associated with the variations in exposure which have been introduced, holding constant other factors which could also affect the outcome.
These types of studies are usually prospective or longitudinal in design. Alternatively, they may make use of existing data. Depending on how subjects are selected, they may use inductive or deductive logic to draw their conclusions (see Section 1.2).
Pre‐test–post‐test (Figure 1.1). This is the simplest (and weakest) of the prospective experimental designs. There is one sample. It may be ‘adventitious’ – subjects are selected as they become available (for example, a series of patients coming into a diabetic clinic, or a series of customers using a particular food shop); or it may be ‘purposive’ (the sample is drawn systematically from a population using techniques that support generalization of the findings). Each individual is measured at the start of the study (the ‘baseline’ or ‘time zero’). There is then an intervention. Each subject is measured again at the end of the study.
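Because each subject is measured twice, the natural unit of analysis in a pre‐test–post‐test design is the change within each subject. The sketch below is one possible way to summarize such data, not a prescribed method: it assumes six hypothetical subjects with invented serum cholesterol values (mmol/L) at baseline and at the end of the study, and computes the mean change, the standard deviation of the changes, and a paired t statistic. The function name `paired_summary` is illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical serum cholesterol (mmol/L) for the same six subjects,
# measured at baseline and again after the intervention. Invented values.
baseline = [6.1, 5.8, 6.4, 5.9, 6.7, 6.0]
follow_up = [5.7, 5.9, 6.0, 5.5, 6.2, 5.8]


def paired_summary(before, after):
    """Return the mean change, the SD of the changes, and the paired t statistic."""
    if len(before) != len(after):
        raise ValueError("each subject needs a baseline and a follow-up value")
    # Change for each subject: follow-up minus baseline.
    changes = [a - b for b, a in zip(before, after)]
    n = len(changes)
    mean_change = mean(changes)
    sd_change = stdev(changes)  # sample SD (n - 1 in the denominator)
    # Paired t statistic: mean change divided by its standard error.
    t_stat = mean_change / (sd_change / sqrt(n))
    return mean_change, sd_change, t_stat


mean_change, sd_change, t_stat = paired_summary(baseline, follow_up)
print(
    f"mean change = {mean_change:.2f} mmol/L, "
    f"SD of changes = {sd_change:.2f}, "
    f"t = {t_stat:.2f} (df = {len(baseline) - 1})"
)
```

Even a large t statistic here would show only that the measurements changed between baseline and follow‐up. With a single sample and no comparison group, the design cannot by itself show that the intervention caused the change.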