Population Genetics. Matthew B. Hamilton
Problem box 2.1 gives a 10‐locus genotype for the same individual in Table 2.2, allowing you to calculate the odds ratio for a realistic example. In Chapter 4, we will reconsider the expected frequency of a DNA profile with the added complication of allele frequency differentiation among human racial groups.
Problem box 2.1 The expected genotype frequency for a DNA profile
Calculate the expected genotype frequency and odds ratio for the 10‐locus DNA profile below. Allele frequencies are given in Table 2.3.
Locus | Genotype
D3S1358 | 17, 18
vWA | 17, 17
FGA | 24, 25
Amelogenin | X, Y
D8S1179 | 13, 14
D21S11 | 29, 30
D18S51 | 18, 18
D5S818 | 12, 13
D13S317 | 9, 12
D7S820 | 11, 12
What does the amelogenin locus tell us and how did you assign an expected frequency to the observed genotype? Is it likely that two unrelated individuals would share this 10‐locus genotype by chance? For this genotype, would a match between a crime scene sample and a suspect be convincing evidence that the person was present at the crime scene?
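The arithmetic behind this kind of problem can be sketched in Python. This is a minimal illustration and not a solution to the problem: the locus names and allele frequencies below are hypothetical placeholders (the Table 2.3 values are not reproduced here), the loci are assumed to be independent so that single‐locus frequencies can be multiplied, and the odds ratio is expressed as "one in 1/frequency", which is an assumption about the intended convention. The amelogenin locus is deliberately left out because assigning it a frequency is part of the question.

```python
from math import prod


def locus_frequency(genotype, freqs):
    """Hardy-Weinberg expected frequency of one diploid genotype.

    genotype: tuple of two allele labels, e.g. ("17", "18")
    freqs: dict mapping allele label -> allele frequency at this locus
    """
    a1, a2 = genotype
    if a1 == a2:
        return freqs[a1] ** 2          # homozygote: p^2
    return 2 * freqs[a1] * freqs[a2]   # heterozygote: 2pq


def profile_frequency(profile, allele_freqs):
    """Multiply single-locus expected frequencies across loci.

    Assumes the loci are independent of one another.
    """
    return prod(
        locus_frequency(genotype, allele_freqs[locus])
        for locus, genotype in profile.items()
    )


# Hypothetical allele frequencies for two made-up loci (NOT Table 2.3 values).
allele_freqs = {
    "LocusA": {"17": 0.2, "18": 0.1},
    "LocusB": {"17": 0.25},
}
profile = {"LocusA": ("17", "18"), "LocusB": ("17", "17")}

freq = profile_frequency(profile, allele_freqs)
odds = 1 / freq  # "one in `odds`" individuals expected to have this profile
print(f"expected frequency = {freq:.6f}, odds ratio = 1 in {odds:.0f}")
```

Each additional locus multiplies the expected frequency by a number less than one, which is why a profile with many loci quickly becomes very rare.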
Testing Hardy–Weinberg expected genotype frequencies
A common use of Hardy–Weinberg expectations is as a null model against which observed genotype frequencies can be tested. Genotype frequencies that do not fit Hardy–Weinberg expectations are evidence that one or more of the evolutionary processes embodied in the assumptions of Hardy–Weinberg are acting to determine genotype frequencies. Our null hypothesis is that genotype frequencies meet Hardy–Weinberg expectations within some degree of estimation error. Observed genotype frequencies that are not close to Hardy–Weinberg expectations allow us to reject this null hypothesis. The processes in the list of assumptions then become possible alternative hypotheses to explain the observed genotype frequencies. In this section, we will work through a hypothesis test for Hardy–Weinberg equilibrium.
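The mechanics of such a test for a locus with two alleles can be previewed with a short Python sketch. The genotype counts are hypothetical and chosen only to keep the arithmetic easy to follow; they are not the Table 2.4 data. The function name `hw_test` is illustrative, and the use of Pearson's chi‐square statistic to compare observed and expected counts is an assumption here, being one common choice for this comparison.

```python
def hw_test(n_AA, n_Aa, n_aa):
    """Compare observed genotype counts with Hardy-Weinberg expectations.

    Returns the estimated frequency of allele A, the expected genotype
    counts [AA, Aa, aa], and Pearson's chi-square statistic.
    """
    n = n_AA + n_Aa + n_aa                 # number of diploid individuals
    p = (2 * n_AA + n_Aa) / (2 * n)        # count A alleles out of 2N total
    q = 1 - p                              # two alleles, so q = 1 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Hypothetical counts (NOT the MN blood group data): 30 AA, 40 Aa, 30 aa.
p_hat, expected, chi2 = hw_test(30, 40, 30)
print(p_hat)     # 0.5
print(expected)  # [25.0, 50.0, 25.0]
print(chi2)      # 4.0

# Three genotype classes, minus one, minus one estimated allele frequency,
# leaves 1 degree of freedom. The 5% critical value for 1 df is 3.841.
```

In this made‐up sample there are fewer heterozygotes than expected (40 observed against 50 expected), and the statistic of 4.0 exceeds the 5% critical value of 3.841 for one degree of freedom.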
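The first example uses observed genotypes for the MN blood group, a single locus in humans that has two alleles (Table 2.4). First, we need to estimate the frequency of the M allele, using the notation that the estimated frequency of M is $\hat{p}$ and the estimated frequency of N is $\hat{q}$. A sample of $N$ diploid individuals contains $2N$ alleles, so each allele frequency can be estimated by counting the alleles of that type:
$$\hat{p} = \frac{2N_{MM} + N_{MN}}{2N} \qquad (2.5)$$
$$\hat{q} = \frac{2N_{NN} + N_{MN}}{2N} \qquad (2.6)$$
where $N_{MM}$, $N_{MN}$, and $N_{NN}$ are the observed numbers of individuals with each genotype.
Since $\hat{p} + \hat{q} = 1$, the frequency of the N allele can also be obtained by subtraction as $\hat{q} = 1 - \hat{p}$.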
Using these allele frequencies allows calculation of the Hardy–Weinberg expected genotype frequency and number of individuals with each genotype, as shown in Table 2.4. In Table 2.4, we can see that the match between the observed and expected numbers of genotypes is not perfect, but we need some method to ask whether the difference is actually large enough to conclude that Hardy–Weinberg equilibrium does not hold in the sample of 1066 genotypes. Remember that any allele frequency estimate
Box 2.1 DNA profiling
The loci used for human DNA profiling belong to a general class of DNA sequence marker known as simple tandem repeat (STR), simple sequence repeat (SSR), or microsatellite loci. These loci feature tandemly repeated DNA sequences of one to six base pairs (bp) and often exhibit many alleles per locus and high levels of heterozygosity. Allelic states are simply the number of repeats present at the locus, which can be determined by electrophoresis of polymerase chain reaction (PCR) amplified DNA fragments. STR loci used in human DNA profiling generally exhibit Hardy–Weinberg expected genotype frequencies; there is evidence that the genotypes are selectively “neutral” (i.e. not affected by natural selection), and the loci meet the other assumptions of Hardy–Weinberg. STR loci are employed widely in population genetic studies and in genetic mapping (see reviews by Goldstein and Pollock 1997; McDonald and Potts 1997).
Figure 2.8 The original data for the DNA profile given in Table 2.2 and Problem box 2.1 obtained by capillary electrophoresis. The PCR oligonucleotide primers used to amplify each locus are labeled with a molecule that emits blue, green, or yellow light when exposed to laser light. Thus, the DNA fragments for each locus are identified by their label color as well as their size range in base pairs. Panel A shows a simulation of the DNA profile as it would appear on an electrophoretic gel (+ indicates the anode side). Blue, green, and yellow label the 10 DNA profiling loci, shown here in grayscale. The red DNA fragments are size standards with a known molecular weight used to estimate the size in base pairs of the other DNA fragments in the profile. Panel B shows the DNA profile for all loci and the size standard DNA fragments as a graph of color signal intensity by size of DNA fragment in base pairs. Panel C shows a simpler view of trace data for each label color independently with the individual loci labeled above the trace peaks. A few shorter peaks are visible in the yellow, green, and blue traces of Panel C that are not labeled as loci. These artifacts, called “pull up” peaks, are caused by intense signal from a locus labeled with