Solving Hardy-Weinberg with 3 Alleles
Introduction and Background
In biology, a population of organisms of the same species is said to be in "Hardy-Weinberg equilibrium" if the allele and genotype frequencies remain stable over generations. Hardy-Weinberg equilibrium assumes all of the following conditions are met:
- Mating is random. That is, individuals of a genotype do not have a preference for or aversion to mating with individuals of certain other genotypes.
- There is no mutation.
- There is no gene flow into or out of the population.
- There is no genetic drift. Genetic drift is random change in gene frequencies due to random sampling when individuals produce gametes and reproduce.
- There is no natural selection.
- There is no meiotic drive. Meiotic drive is asymmetry in meiosis, when certain genes are more likely to end up in the gametes.
In other words, a population in Hardy-Weinberg equilibrium is not evolving. The Hardy-Weinberg Principle gives rise to the Hardy-Weinberg equation for a gene with two alleles B and b. For a population in H-W equilibrium, if the frequency of allele B in the population is f(B) = p, and the frequency of allele b in the population is f(b) = q, with p + q = 1, then the frequencies of the genotypes are given by the expansion of the binomial
1 = (p + q)^2 = p^2 + 2pq + q^2
That is, the frequencies of the genotypes BB, Bb, and bb are f(BB) = p^2, f(Bb) = 2pq, and f(bb) = q^2.
For a gene that has three alleles B, b, and c, let f(B) = p, f(b) = q, and f(c) = r, with p + q + r = 1. Then the H-W equation for three alleles is
1 = (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr
This means the frequencies of the genotypes are f(BB) = p^2, f(bb) = q^2, f(cc) = r^2, f(Bb) = 2pq, f(Bc) = 2pr, and f(bc) = 2qr. You can use the three-allele version of the H-W equation to test whether a population is in H-W equilibrium for a gene with three alleles.
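As a minimal sketch of the expansion above, the following Python function returns the six genotype frequencies for given allele frequencies p, q, and r. The function name genotype_frequencies and the example values 0.5, 0.3, and 0.2 are illustrative assumptions, not part of the worked example that follows.

```python
def genotype_frequencies(p, q, r):
    """Expected H-W genotype frequencies for alleles B, b, c
    with frequencies p, q, r (which must sum to 1)."""
    if abs(p + q + r - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {
        "BB": p * p,
        "bb": q * q,
        "cc": r * r,
        "Bb": 2 * p * q,
        "Bc": 2 * p * r,
        "bc": 2 * q * r,
    }

# Hypothetical allele frequencies, chosen only for illustration.
freqs = genotype_frequencies(0.5, 0.3, 0.2)

# The six frequencies sum to (p + q + r)^2, which is 1.
total = sum(freqs.values())
```

Checking that the six values sum to 1 is a quick way to catch a mistyped term in the expansion.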
Example Part 1
In a population of snails, the gene for shell color has three alleles T, t, and d, where T is the dominant allele, while t and d are co-recessive. Snails with the genotypes TT, Tt, or Td have a dark brown shell. Homozygous recessive tt snails have a red shell, homozygous recessive dd snails have a yellow shell, and heterozygous recessive td snails have a light brown shell.
In a sample of 225 snails you count 25 red snails and 16 yellow snails. Under the assumption that the population is in Hardy-Weinberg equilibrium, what are the frequencies of the three alleles T, t, and d, and what are the frequencies of the six genotypes TT, Tt, Td, tt, td, and dd?
All red snails and only red snails have the genotype tt, so we start by computing the frequency of allele t, which we will call f(t) = q. Since the frequency of the genotype tt is f(tt) = 25/225 = q^2, we solve the equation 25/225 = q^2. This gives us q = 1/3 = 0.333.
Next, we use the fact that all yellow snails and only yellow snails have the genotype dd. We let f(d) = r be the frequency of allele d. We know that the frequency of the genotype dd is f(dd) = 16/225 = r^2. Solving this for r gives us r = 4/15 = 0.267.
Now we let f(T) = p be the frequency of allele T. Since p + q + r = 1, we have p = 1 - 1/3 - 4/15 = 6/15 = 2/5 = 0.400.
With the frequencies of the three alleles taken care of, we can now find the assumed frequencies of the six genotypes. Using the Hardy-Weinberg equation for three alleles, we have (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr. This means the assumed frequencies of the six genotypes are
f(TT) = p^2 = (2/5)^2 = 0.160
f(Tt) = 2pq = 2(2/5)(1/3) = 0.267
f(Td) = 2pr = 2(2/5)(4/15) = 0.213
f(tt) = q^2 = (1/3)(1/3) = 0.111
f(td) = 2qr = 2(1/3)(4/15) = 0.178
f(dd) = r^2 = (4/15)^2 = 0.071
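The Part 1 calculation can be sketched in Python as follows. This is only an illustration of the steps above; the variable names are assumptions made for this sketch, and the approach works only because the tt and dd genotypes can be identified directly from shell color.

```python
import math

sample_size = 225
red_count = 25      # red shells: genotype tt
yellow_count = 16   # yellow shells: genotype dd

# Under H-W equilibrium, f(tt) = q^2 and f(dd) = r^2.
q = math.sqrt(red_count / sample_size)     # f(t)
r = math.sqrt(yellow_count / sample_size)  # f(d)
p = 1 - q - r                              # f(T), since p + q + r = 1

# Assumed genotype frequencies from the three-allele expansion.
assumed = {
    "TT": p ** 2,
    "Tt": 2 * p * q,
    "Td": 2 * p * r,
    "tt": q ** 2,
    "td": 2 * q * r,
    "dd": r ** 2,
}

for genotype, freq in assumed.items():
    print(f"f({genotype}) = {freq:.3f}")
```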
Example Part 2
You take the 225 snails back to the lab and test their genes to discover the true frequencies of the shell color genotypes in this sample. You find that the 225 snails are distributed among the six genotypes as follows:
TT = 42 snails
Tt = 73 snails
Td = 52 snails
tt = 25 snails
td = 17 snails
dd = 16 snails
Color-wise, there are 167 with a dark brown shell, 25 with a red shell, 17 with a light brown shell, and 16 with a yellow shell. The frequencies of the genotypes in this sample are
f(TT) = 42/225 = 0.187
f(Tt) = 73/225 = 0.324
f(Td) = 52/225 = 0.231
f(tt) = 25/225 = 0.111
f(td) = 17/225 = 0.076
f(dd) = 16/225 = 0.071
What are the frequencies of the alleles T, t, and d? Now that the genotype counts are known, we can count the alleles directly, without assuming H-W equilibrium. Since the 225 snails in the sample are diploid, together they carry 2*225 = 450 copies of the gene. Each homozygote contributes two copies of its allele, and each heterozygote contributes one copy of each of its two alleles. Of these 450 copies, the proportions of T, t, and d are given by the formulas
f(T) = [2*TT + Tt + Td]/450
f(t) = [2*tt + Tt + td]/450
f(d) = [2*dd + Td + td]/450
Thus, the frequencies of T, t, and d in this sample are
f(T) = [2*42+73+52]/450 = 0.4644 = p
f(t) = [2*25 + 73 + 17]/450 = 0.3111 = q
f(d) = [2*16 + 52 + 17]/450 = 0.2244 = r
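The allele-counting step can be sketched in Python as below. The function name allele_frequencies is a hypothetical helper for this illustration; it relies on each genotype label being two characters long, one per allele copy, so a homozygote such as TT is counted twice for T.

```python
counts = {"TT": 42, "Tt": 73, "Td": 52, "tt": 25, "td": 17, "dd": 16}

def allele_frequencies(genotype_counts):
    """Tally each allele once per letter in the genotype label,
    then divide by the total number of allele copies (2 per snail)."""
    total_alleles = 2 * sum(genotype_counts.values())
    tally = {}
    for genotype, n in genotype_counts.items():
        for allele in genotype:
            tally[allele] = tally.get(allele, 0) + n
    return {allele: c / total_alleles for allele, c in tally.items()}

allele_freqs = allele_frequencies(counts)

for allele, freq in allele_freqs.items():
    print(f"f({allele}) = {freq:.4f}")
```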
Example Part 3
Given that the allele frequencies in the sample are f(T) = 0.4644, f(t) = 0.3111, and f(d) = 0.2244, and given that the observed genotype numbers are
TT = 42
Tt = 73
Td = 52
tt = 25
td = 17
dd = 16
the natural question to ask at this point is "Is this population in Hardy-Weinberg equilibrium?" In other words, do the observed genotype frequencies match the theoretically expected H-W equilibrium frequencies? To answer this, we need to compute the expected frequencies using the three-allele equation (p+q+r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr.
f_expected(TT) = p^2 = 0.4644^2 = 0.216
f_expected(Tt) = 2pq = 2*0.4644*0.3111 = 0.289
f_expected(Td) = 2pr = 2*0.4644*0.2244 = 0.208
f_expected(tt) = q^2 = 0.3111^2 = 0.097
f_expected(td) = 2qr = 2*0.3111*0.2244 = 0.140
f_expected(dd) = r^2 = 0.2244^2 = 0.050
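The expected number of each genotype is its expected frequency times the sample size of 225. The values below are rounded to one decimal place, so the last digit can differ slightly depending on whether the frequencies are rounded before multiplying: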
ExpNum(TT) = 0.216*225 = 48.6
ExpNum(Tt) = 0.289*225 = 65.0
ExpNum(Td) = 0.208*225 = 46.9
ExpNum(tt) = 0.097*225 = 21.8
ExpNum(td) = 0.140*225 = 31.4
ExpNum(dd) = 0.050*225 = 11.3
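A short Python sketch of the expected-number calculation follows. It carries the allele frequencies as exact fractions of 450 rather than rounded decimals; the variable names are assumptions made for this illustration.

```python
n = 225

# Allele frequencies from the sample: f(T), f(t), f(d).
p, q, r = 209 / 450, 140 / 450, 101 / 450

# Expected H-W genotype frequencies from the three-allele expansion.
expected_freq = {
    "TT": p ** 2,
    "Tt": 2 * p * q,
    "Td": 2 * p * r,
    "tt": q ** 2,
    "td": 2 * q * r,
    "dd": r ** 2,
}

# Expected numbers of snails of each genotype in a sample of n.
expected_num = {g: f * n for g, f in expected_freq.items()}

for genotype, num in expected_num.items():
    print(f"ExpNum({genotype}) = {num:.1f}")
```

Because the expected frequencies sum to 1, the expected numbers should sum to the sample size, which is a useful sanity check.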
The expected numbers do not match the observed numbers. Is the discrepancy in this set of frequencies due to chance and/or sampling error, or is it due to the fact that the population is not in H-W equilibrium? To answer this question, we need to use Pearson's Chi-Squared Test. We compute the test statistic with the equation
χ^2 = Σ [(Exp - Obs)^2 / Exp]
for all six pairs of expected and observed numbers (the actual counts, not the frequencies). The number of degrees of freedom is the number of genotypes minus the number of alleles, or 6 - 3 = 3. Normally, six categories would give 6 - 1 = 5 degrees of freedom, but two more are lost here because two independent allele frequencies were estimated from the same data (the third follows from p + q + r = 1). Calculating χ^2 with the summation formula or a calculator gives us
χ^2 = 11.46
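The test statistic can be computed with a few lines of Python. This sketch recomputes the allele frequencies and expected numbers from the observed counts without intermediate rounding; the variable names are assumptions made for this illustration.

```python
observed = {"TT": 42, "Tt": 73, "Td": 52, "tt": 25, "td": 17, "dd": 16}
n = sum(observed.values())

# Allele frequencies by direct counting (2n allele copies in total).
p = (2 * observed["TT"] + observed["Tt"] + observed["Td"]) / (2 * n)
q = (2 * observed["tt"] + observed["Tt"] + observed["td"]) / (2 * n)
r = (2 * observed["dd"] + observed["Td"] + observed["td"]) / (2 * n)

# Expected numbers under H-W equilibrium.
expected = {
    "TT": n * p ** 2,
    "Tt": n * 2 * p * q,
    "Td": n * 2 * p * r,
    "tt": n * q ** 2,
    "td": n * 2 * q * r,
    "dd": n * r ** 2,
}

# Pearson's chi-squared statistic: sum of (Exp - Obs)^2 / Exp.
chi_squared = sum(
    (expected[g] - observed[g]) ** 2 / expected[g] for g in observed
)
print(f"chi-squared = {chi_squared:.2f}")
```

With unrounded expected numbers the statistic comes out at about 11.43. The small difference from 11.46 is attributable to rounding the expected numbers to one decimal place before summing.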
Looking this value up in a table of the chi-squared distribution with three degrees of freedom, we find that 11.46 exceeds the critical value of 11.345 for a significance level of 0.01, so p < 0.01. This means that if the population really were in H-W equilibrium, a discrepancy at least this large would arise from chance or sampling error less than 1% of the time. Therefore we can reject the assumption that the population is in H-W equilibrium. In other words, there are evolutionary forces driving the genotype frequencies away from H-W equilibrium values.
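Instead of a printed table, the tail probability can be computed directly. The sketch below uses a closed-form expression for the upper tail of the chi-squared distribution that is valid only for three degrees of freedom, so it needs nothing beyond the Python standard library. The function name chi2_sf_df3 is a hypothetical helper introduced for this illustration.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X >= x) for a chi-squared
    variable with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

chi_squared = 11.46  # test statistic from the text
p_value = chi2_sf_df3(chi_squared)

# Reject H-W equilibrium at the 1% level if the p-value is below 0.01.
reject_at_1_percent = p_value < 0.01
print(f"p-value = {p_value:.4f}, reject: {reject_at_1_percent}")
```

For other degrees of freedom this formula does not apply, and a statistics library or a table would be needed instead.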
The biggest difference is between the observed and expected numbers of light brown-shelled snails, those with genotype td. One possible explanation is that the light brown-shelled snails are being selected against by experiencing higher levels of predation. Another explanation is that red-shelled and yellow-shelled snails prefer not to mate with each other, which would drive down the number of td offspring.