The Hardy–Weinberg law, named after Wilhelm Weinberg (1862–1937) and Godfrey Harold Hardy (1877–1947), who introduced it in the same year (Hardy, 1908; Weinberg, 1908), is included and explained in most genetics textbooks. Weinberg was a prominent member of the German ‘school’ of genetics in the first third of the 20th century, which also included Fritz Lenz and the mathematician Felix Bernstein (Baur et al., 1931; Bernstein, 1925). Kallmann (1938) and Früh (1996) give accounts of Weinberg's lifework (see also Stark & Seneta, 2013). Crow (1999) provides a synthesis and also discusses the Hardy–Weinberg setting. Hardy spent many years as a professor of mathematics at Cambridge and Oxford Universities (Edwards, 2008; Fletcher, 1980). Fletcher's (1980) paper gives a mathematician's perspective on the involvement of a pure mathematician (Hardy) in a biological conundrum.
Several accounts have been given of how a geneticist, Punnett, asked the mathematician Hardy to explain why a dominant trait would not eventually displace a recessive trait in a population. This was a simple problem to Hardy, who showed by straightforward algebra, which Diaconis (2002) calls ‘back of the envelope calculation’, that by assuming random mating (RM), genetic variability is maintained. Hardy did not notice that the assumption ignores a deeper mathematical fact, namely that variability can also be maintained by forms of non-random mating (NRM). In the interests of completing the ‘story’, we explain how this is so. As discussed below, Weinberg used the property of stability to explore the question of a possible genetic basis for the occurrence of twins in humans (Stark, 2006a).
Several aspects of the law require examination and we follow the usual practice of abbreviating Hardy–Weinberg to HW, Hardy–Weinberg proportions to HWP, and Hardy–Weinberg equilibrium to HWE. By HWP, we shall mean the trio of frequencies (proportions) of the form $\{q^2, 2pq, p^2\}$ for the corresponding genotypes {AA, AB, BB}. In the traditional (RM) setting of the HW law, q is the frequency of the first allele A and p that of the second allele B at an autosomal locus, but we shall use the notion of HWP more generally, as defined. HWE relates to the connection between sets of HWP over successive generations and the possibility, or otherwise, of their stationarity. We ignore the fact that there are no clearly distinct generations in human populations.
Misunderstandings arise because of the ways in which the phrase ‘RM’ and its alternative ‘NRM’ are used in the literature. Mathematicians and geneticists devise models (that is, mathematical constructs) as an aid to thought in exploring the properties of the models, with a view to planning experiments or making observations for tests of validity. In setting up a model, it is easy to propose RM as one of the assumptions; for example, to specify what proportion of couplings occurs between genotypes AA and BB. One simply says that this is the product of the frequencies of types AA and BB. But it is a step too far to project this to an actual population where it is hardly possible for such a state to apply: this would assume that mates are chosen in the way that winners in a lottery are chosen. However, there are many instances in the literature of such facile thinking. It is very tempting to say, because a set of genotype frequencies is not significantly different from HWP, that mates are chosen randomly.
The standard textbook treatment, such as in Hartl and Jones (2006), is to show that, starting from any distribution of genotype frequencies, that is of {AA, AB, BB}, HWP $\{q^2, 2pq, p^2\}$ are achieved in one round of RM. It follows that if RM is continued, the same HWP will be maintained. We show later that other mating regimes achieve the same outcomes. The reason why this is important is that ‘pure’ RM is a highly restrictive assumption and the other models allow some relaxation of the conditions and perhaps help to explain why empirical data often contain distributions (sets of frequencies) close to HWP.
Li's (1955) somewhat older textbook treats HW early and extensively. In fact, the geometric design on its hardcover is motivated by an HW property. We shall use some of Li's (1955) clear treatment in the sequel.
Table 1, which comes from Johannsen (1913), gives a concise derivation of the HW law in a form that avoids the misleading statements of some presentations. The notation has been modified slightly to conform to that used here.
Note to Table 1: p + q = 1.
We do not attempt to give a comprehensive overview of the HW law, which has been done in capable fashion with much useful advice by Mayo (2008). Our aim is to present what in some quarters may be seen as a contrary view, synthesizing much of the past work on NRM and HW by one of the present authors.
We show that Hardy's recourse to an identity relating to the distribution of types among offspring following RM, rather than an identity relating to the mating matrix, may be the reason why he did not realize that HWE can be reached and sustained with NRM. Others, such as Diaconis (2002), were similarly misled.
The sections that follow give, first, the standard derivation of the HW law, although in a more modern format that clearly reveals the stages in the derivation. This permits generalization to incorporate NRM. Then follow sections on a numerical example of NRM leading to HWP, and a model of NRM sustaining HWP; a general model sustaining HWP; whether it matters that HWP can be sustained by NRM; testing for HWP; and whether RM is virtual rather than real. Finally, some closing remarks are followed by references.
Deriving HWP and Properties of the Stationary State
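Taking account of sex, there are nine mating combinations, to go from the parental to offspring generation, as identified by the matrix (rows for the genotype of one parent, columns for that of the other):

$$\begin{pmatrix} AA \times AA & AA \times AB & AA \times BB \\ AB \times AA & AB \times AB & AB \times BB \\ BB \times AA & BB \times AB & BB \times BB \end{pmatrix}$$

The mating frequencies are summarized by the matrix:

$$C = \begin{pmatrix} f_{00} & f_{01} & f_{02} \\ f_{10} & f_{11} & f_{12} \\ f_{20} & f_{21} & f_{22} \end{pmatrix}. \quad (1)$$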
A ‘child’ who has one of the three types arises from each coupling and the aggregate of children form the new generation, later to become parents, in their turn.
Below, we impose the condition that C is symmetric, ensuring that males and females have the same frequencies, which are denoted by the vector $\{f_0, f_1, f_2\}$. These are obtained by summing the rows and columns of C.
The typical introduction to the HW model uses a tabular presentation of the kind taken from Johannsen (1913) and shown in Table 1. This can be streamlined with the use of some basic matrix algebra (Stark & Seneta, 2012). We feel that it is useful in clarifying key elements of the HW law, although we recognize that the approach will not please all tastes. The whole aim is to go from a parental distribution of genotype frequencies to the offspring distribution. First, we rearrange the elements of mating matrix C in a vector of nine elements:
Next, we need the matrix that defines the zygotic output from the respective mating pairs:
We refer to M as ‘Mendel's coefficients of heredity’.
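As a minimal sketch of this matrix formulation, the Python fragment below builds a 3 × 9 matrix of Mendelian coefficients and applies it to a flattened mating matrix. The row-by-row ordering of U and the names `M`, `flatten`, and `offspring` are assumptions made for illustration; the ordering used in the original presentation may differ.

```python
# Genotype order: 0 = AA, 1 = AB, 2 = BB.
# Assumption: U lists the entries of C row by row (f00, f01, ..., f22).
M = [
    [1, 0.5, 0, 0.5, 0.25, 0,   0, 0,   0],  # probability of an AA child
    [0, 0.5, 1, 0.5, 0.5,  0.5, 1, 0.5, 0],  # probability of an AB child
    [0, 0,   0, 0,   0.25, 0.5, 0, 0.5, 1],  # probability of a BB child
]

def flatten(C):
    """Rearrange the 3 x 3 mating matrix C into a vector of nine elements."""
    return [C[i][j] for i in range(3) for j in range(3)]

def offspring(C):
    """Offspring genotype frequencies, computed as the product M U."""
    U = flatten(C)
    return [sum(m * u for m, u in zip(row, U)) for row in M]

# Hypothetical check: if every mating is AA x BB, every child is AB.
C_example = [[0, 0, 0.5], [0, 0, 0], [0.5, 0, 0]]
print(offspring(C_example))
```

Each column of `M` sums to one, because every mating pair produces a child of exactly one of the three genotypes.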
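The composition of the offspring generation is simply T′ = (MU)′:

$$T' = \left\{ f_{00} + \tfrac{1}{2}(f_{01} + f_{10}) + \tfrac{1}{4}f_{11},\;\; \tfrac{1}{2}(f_{01} + f_{10}) + f_{02} + f_{20} + \tfrac{1}{2}f_{11} + \tfrac{1}{2}(f_{12} + f_{21}),\;\; \tfrac{1}{4}f_{11} + \tfrac{1}{2}(f_{12} + f_{21}) + f_{22} \right\}. \quad (3)$$

If, in addition to the conditions that C is symmetric and that its elements sum to unity, we assume that

$$f_{11} = 4f_{02}, \quad (4)$$

then the offspring frequencies given by (3) coincide with the parental frequencies $\{f_0, f_1, f_2\}$.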
The usual introduction to HWE takes a form of which the following is a matrix version:
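Suppose the initial population has frequencies $\{f_0, f_1, f_2\}$; then RM is expressed by

$$C_0 = \{f_i f_j\}, \quad i, j = 0, 1, 2.$$

Then, applying T′ = (MU)′ yields

$$T' = \{q^2,\, 2pq,\, p^2\}, \quad \text{where } q = f_0 + \tfrac{1}{2}f_1 \text{ and } p = 1 - q. \quad (5)$$

If RM is continued, the mating matrix is

$$C = \begin{pmatrix} q^4 & 2pq^3 & p^2q^2 \\ 2pq^3 & 4p^2q^2 & 2p^3q \\ p^2q^2 & 2p^3q & p^4 \end{pmatrix}. \quad (6)$$

The frequencies among offspring from (6) are

$$T' = \{q^2,\, 2pq,\, p^2\}.$$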
We note that, in addition to having all non-negative elements summing to 1 and symmetry, C in (6) also has the critical property $f_{11} = 4f_{02}$, that is (4), which ensures that the frequencies $\{f_0, f_1, f_2\}$ are maintained, and specifically in HW form (5).
Thus, starting from an arbitrary initial distribution of genotype proportions, stationary genotype proportions are achieved after one generation of mating, through application of two successive mating matrices C0 and C.
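The two-step RM argument can be traced numerically. The sketch below is illustrative only: the starting frequencies {0.2, 0.4, 0.4} and the helper names `random_mating` and `offspring` are assumptions chosen for the example.

```python
def random_mating(f):
    """RM mating matrix: each entry is the product of parental frequencies."""
    return [[fi * fj for fj in f] for fi in f]

def offspring(C):
    """Offspring genotype frequencies (AA, AB, BB) from a mating matrix C."""
    aa = C[0][0] + (C[0][1] + C[1][0]) / 2 + C[1][1] / 4
    ab = ((C[0][1] + C[1][0]) / 2 + C[0][2] + C[2][0]
          + C[1][1] / 2 + (C[1][2] + C[2][1]) / 2)
    bb = C[2][2] + (C[1][2] + C[2][1]) / 2 + C[1][1] / 4
    return [aa, ab, bb]

parents = [0.2, 0.4, 0.4]          # arbitrary start, not in HWP
q = parents[0] + parents[1] / 2    # frequency of allele A

gen1 = offspring(random_mating(parents))  # first round of RM
gen2 = offspring(random_mating(gen1))     # second round of RM

print(gen1)  # approximately q^2, 2pq, p^2 with q = 0.4
print(gen2)  # unchanged, up to floating-point rounding
```

The second round uses a mating matrix of the form (6), whose middle entry is four times its corner entry, so the frequencies no longer move.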
A general result can be achieved by taking an arbitrary initial genotypic distribution and choosing any $C_0 = \{f_{ij}\}$ whose row and column sums equal that distribution; using (3), one round of mating then produces genotypic frequencies that we denote $\{f_0, f_1, f_2\}$. Now choose $C = \{f_{ij}\}$ to have row and column sums $\{f_0, f_1, f_2\}$, and additionally to satisfy (4), that is, $f_{11} = 4f_{02}$. Thus, stationary genotype frequencies are achieved after one round of mating. These are not necessarily in HWP form.
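To illustrate that stationarity under (4) does not require HWP, the sketch below uses a hypothetical symmetric mating matrix (invented for this example) with marginals {0.3, 0.4, 0.3}, which are not in HWP, and with the middle entry equal to four times the corner entry.

```python
def offspring(C):
    """Offspring genotype frequencies from a symmetric mating matrix C."""
    aa = C[0][0] + C[0][1] + C[1][1] / 4
    ab = C[0][1] + 2 * C[0][2] + C[1][1] / 2 + C[1][2]
    bb = C[2][2] + C[1][2] + C[1][1] / 4
    return [aa, ab, bb]

# Hypothetical mating matrix: symmetric, entries sum to 1, f11 = 4 * f02.
C = [
    [0.15, 0.10, 0.05],
    [0.10, 0.20, 0.10],
    [0.05, 0.10, 0.15],
]

parents = [sum(row) for row in C]   # marginal (parental) frequencies
children = offspring(C)

# Parental and offspring frequencies agree, although f1^2 != 4 * f0 * f2.
print(parents, children)
print(parents[1] ** 2, 4 * parents[0] * parents[2])
```

Here the parental and offspring distributions agree, while the squared heterozygote frequency differs from four times the product of the homozygote frequencies, so the stationary state is not in HW form.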
In deriving the HW law, Hardy (1908) pointed out that if the initial parental frequencies satisfy the relation

$$f_1^2 = 4f_0 f_2, \quad (7)$$

then under RM the genotype frequencies are unchanged in the following generation.
However, we see that any genotype structure $T' = \{f_0, f_1, f_2\}$ satisfying (7) is already in HWP, since HWP can be expressed as $\{f_0,\, 2\sqrt{f_0 f_2},\, f_2\}$ if we write $f_0 = q^2$ and $f_2 = p^2$, where p = 1 − q, whether mating is RM or not. We remind the reader that p and q will not necessarily denote gene frequencies in HWP of genotypes. In the case of RM, they will denote gene frequencies; and in this case (6) reveals that (4) holds as well as (7).
HWP may describe equilibrium genotype frequencies, when (4) holds, even under NRM, as we shall show in the following sections.
Equation (4) is a condition for stationarity over time of genotype frequencies, while (7) just describes genotype frequencies in HWP. Under RM, (4) and (7) coincide.
To summarize: RM implies HWP (and then (4) and (7) hold), but it is not true that HWP imply RM (even though (4) and (7) hold).
The invalid but still ubiquitous belief that stationary HWP imply RM has developed and hardened since Hardy's (1908) paper, which toward its end has the misleading statement: ‘Hypotheses other than that of purely random mating will give different results . . .’ (p. 49). This has morphed, for example, in Cavalli-Sforza and Bodmer (1971) into ‘If the probabilities of a given mating are different from those expected under random mating, the expected frequencies of genotypes do not follow the Hardy–Weinberg law’ (p. 537).
Our aim in this article is to demonstrate that stationary HWP over time do not imply RM, and indeed, more realistically, may quite plausibly result from NRM.
Example of an NRM Path to HW Form and HWE With NRM
The textbooks show how HWP are reached in one round of RM. Stark (2006b, 2007) shows that HWP can be produced from an arbitrary parental population with a single round of NRM. For example, mating matrix $C_0$ in (8), converted to U by (2), and then applied by T = MU gives HWP {0.16, 0.48, 0.36} with q = 0.4, from parental population {0.2, 0.4, 0.4}:
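Matrix (8) is not reproduced in this text. As a stand-in, the sketch below uses a hypothetical NRM matrix, constructed for this illustration and not taken from Stark, whose marginals are {0.2, 0.4, 0.4} and whose offspring distribution is {0.16, 0.48, 0.36}.

```python
def offspring(C):
    """Offspring genotype frequencies from a symmetric mating matrix C."""
    aa = C[0][0] + C[0][1] + C[1][1] / 4
    ab = C[0][1] + 2 * C[0][2] + C[1][1] / 2 + C[1][2]
    bb = C[2][2] + C[1][2] + C[1][1] / 4
    return [aa, ab, bb]

# Hypothetical stand-in for (8): symmetric, sums to 1, rows sum to the parents.
C0 = [
    [0.04, 0.06, 0.10],
    [0.06, 0.24, 0.10],
    [0.10, 0.10, 0.20],
]

parents = [sum(row) for row in C0]
children = offspring(C0)

# Mating is non-random: f01 differs from the RM product f0 * f1.
rm_f01 = parents[0] * parents[1]

print(parents)           # approximately 0.2, 0.4, 0.4
print(children)          # approximately 0.16, 0.48, 0.36
print(C0[0][1], rm_f01)  # 0.06 against roughly 0.08 under RM
```

Any symmetric matrix with these marginals and with the middle entry equal to four times the corner entry minus 0.16 would serve equally well; the RM matrix is only one member of that family.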
The next point that we want to stress is that (5) can be sustained by NRM, if the mating regime satisfies (4). This allows considerable scope to choose C. For example, taking q = 0.4 to give HWP {0.16, 0.48, 0.36}, put $f_{00} = 0.0352$ (supposing $q \le 1/2$); then $f_{01}$ can be chosen arbitrarily, using the single remaining degree of freedom available, say, $f_{01} = 0.0576$. Then, $f_{02} = f_0 - (f_{00} + f_{01}) = 0.0672$, $f_{10} = f_{01}$ by symmetry, $f_{11} = 4f_{02}$ from (4), $f_{12} = f_1 - (f_{10} + f_{11})$, $f_{20} = f_{02}$ and $f_{21} = f_{12}$ by symmetry, and $f_{22} = f_2 - (f_{20} + f_{21})$, as shown in the following NRM matrix:

$$C = \begin{pmatrix} 0.0352 & 0.0576 & 0.0672 \\ 0.0576 & 0.2688 & 0.1536 \\ 0.0672 & 0.1536 & 0.1392 \end{pmatrix}.$$
Finally, notice that the NRM mating matrix
The main point that emerges from this and the previous section is that appeal to identity (4), which relates to the mating matrix, rather than to identity (7), which relates to the distribution among offspring following RM, reveals the fact that HWE can be sustained by NRM.
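The construction described above for q = 0.4, with the stated choices of 0.0352 and 0.0576 for the first two entries, can be checked with a short sketch. The function name `nrm_matrix` and its argument layout are illustrative assumptions.

```python
def nrm_matrix(f, f00, f01):
    """Fill a symmetric mating matrix with marginals f and f11 = 4 * f02."""
    f0, f1, f2 = f
    f02 = f0 - (f00 + f01)   # first row must sum to f0
    f11 = 4 * f02            # stationarity condition (4)
    f12 = f1 - (f01 + f11)   # second row must sum to f1
    f22 = f2 - (f02 + f12)   # third row must sum to f2
    return [[f00, f01, f02], [f01, f11, f12], [f02, f12, f22]]

def offspring(C):
    """Offspring genotype frequencies from a symmetric mating matrix C."""
    aa = C[0][0] + C[0][1] + C[1][1] / 4
    ab = C[0][1] + 2 * C[0][2] + C[1][1] / 2 + C[1][2]
    bb = C[2][2] + C[1][2] + C[1][1] / 4
    return [aa, ab, bb]

hwp = [0.16, 0.48, 0.36]
C = nrm_matrix(hwp, f00=0.0352, f01=0.0576)

for row in C:
    print([round(x, 4) for x in row])
print([round(x, 4) for x in offspring(C)])  # HWP is reproduced
print(round(hwp[0] ** 2, 4))                # RM would need f00 = 0.0256
```

Since the first entry is 0.0352 rather than the RM value 0.0256, the matrix describes NRM, yet the offspring distribution equals the parental HWP.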
The preceding section shows how a mating system embodied in (4) maintains a stationary state and a population in HW form is just one such state. Mating regimes with particular characteristics can be set up to conform with this structure.
A Generator of HWP With NRM
We start by assuming that the population is in HW form and take (Stark, 2005) the mating matrix (1) to be
Note that $f_{11} = f_1^2(1 + \nu) = 4p^2q^2(1 + \nu)$ and $f_{02} = f_0 f_2(1 + \nu) = p^2q^2(1 + \nu)$, so that (4) is satisfied and HWP are maintained, but mating is NRM unless ν = 0. In (9), ν can be any point in the interval $(-q^2/p^2,\, q/p)$, provided, without loss of generality, that $q \le 1/2$ (Stark, 2005).
A General Mating Model That Maintains Genetic Variability
In this model, mating frequencies are specified by
In (10), we make use of the fact that any trio of genotypic frequencies can be expressed in terms of a gene frequency q and a deviation from HWP, here either F or G. First, choose s and t positive, and large enough to make all entries of (10) positive. Summing the elements of (10) by rows and columns shows that the parental genotypic frequencies are:
The distribution of genotypes among offspring is, according to (3):
Having chosen q and F to define the composition of the initial population, a judicious choice of parameters s and t can then be made to produce the composition of the next generation defined by the same q and arbitrary G. In particular, HWP can be produced in the distribution of offspring by taking G = 0, by NRM, if required (Stark, 2007).
Another model that produces HWP among offspring from an arbitrary parental population is given by Stark (2006b).
NRM Maintaining HWP: Does It Matter?
The best-known early instance of an NRM mating model is due to Li (1988), whose mating matrix is
Note that the population specified by (11) is in HWE, since (4) is satisfied. Stark (2006b) followed up on this NRM direction by producing a model, which shows that HWP can be produced in a single round of NRM.
Weinberg (1908) in his original paper used the RM version of the mating matrix (11), that is, the case where a = b = 0, in looking for detectable genetic influence on the rate of twinning in humans, comparing two models, dominant and recessive inheritance. Stark (2006a), in the final section of his paper, describes Weinberg's procedure and generalizes it using (11), to make the calculated hypothetical twinning rates depend on the values of a, b. Weinberg's procedure consists in comparing the hypothetical with the observed twinning rates, so the assumption of NRM will explicitly affect such a comparison.
In this sense, Weinberg's (1908) original paper provides a positive response to the question in the above subsection heading.
Bulmer (1970) is an indispensable source of information on, and analysis of, the inheritance of twinning in man and devotes a whole chapter to it. He says: ‘The first, and the most extensive, data on this subject were obtained by Weinberg from family registers in Stuttgart at the beginning of this century’ (Bulmer, 1970, p. 114). Bulmer has citations to Weinberg's research, published in 1901, 1909, and 1928. Curiously, he does not refer anywhere to Weinberg's celebrated paper of 1908, mentioned above. But in the main, he agrees with Weinberg's conclusions about the inheritance of twinning.
Do Statistical Procedures Test for RM or for HWP?
Test criteria for ‘RM’ are usually based on a random sample of size n, with counts $\{n_0, n_1, n_2\}$ of the respective genotypes {AA, AB, BB}. In reality, such criteria merely test the hypothesis that a population in genotypic equilibrium has genotypic structure in HWP: $\{q^2, 2pq, p^2\}$, where p = 1 − q, for some q, 0 < q < 1. Thus, consistency of the data with the hypothesis does not preclude NRM, as our discussion above shows.
We justify our statement about test criteria by adapting the standard genetics textbook test procedure presented in Li (1955, Chapter 2, Section 1). At the heart of this test is that each of the n individuals chosen for the sample will fall into one of three classes, according to genotype, with the numbers in the three classes being $\{n_0, n_1, n_2\}$. The joint distribution of the numbers $\{n_0, n_1, n_2\}$ is multinomial with probabilities of falling into the respective classes being $\{p_0, p_1, p_2\}$. Under the null hypothesis being tested, the probabilities of falling into the respective classes are $\{p_0, p_1, p_2\} = \{q^2, 2pq, p^2\}$.
Here, q is unknown and must be estimated from the data. Asymptotic hypothesis testing theory dictates that we substitute the maximum likelihood estimator $\hat q$ (calculated from the sample) of q into $\{q^2, 2pq, p^2\}$, to enable us to calculate the expected values corresponding to each class assuming the null hypothesis is true.
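The likelihood function for which the maximizing value of q is to be found is

$$L(q) \propto (q^2)^{n_0}\,(2pq)^{n_1}\,(p^2)^{n_2} \propto q^{2n_0 + n_1}(1 - q)^{n_1 + 2n_2}.$$

Setting $\mathrm{d}\log L(q)/\mathrm{d}q = 0$ gives

$$\hat q = \frac{2n_0 + n_1}{2n}.$$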
The illusion that the test is one for RM arises since $\hat q$ is the proportion of A alleles in the sample, so $q^2$ is to be estimated by $\hat q^2$. This gives the impression that the mating probabilities come about from random union of gametes. But the nature of $\hat q$ is just a mathematical artifact, and cannot be used to infer such a causal reason for the structure $\{q^2, 2pq, p^2\}$ of genotype frequencies in the population.
It is thus one thing to say that such and such data are not significantly different from HWP but quite another to extend this to the conclusion that the relevant population is practising RM.
The above test, a slight variant of Karl Pearson's $\chi^2$ goodness of fit test, is asymptotic: that is, it is a large sample size test, and so needs n to be large.
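A minimal sketch of this goodness-of-fit calculation is given below, assuming the usual convention of one degree of freedom (three classes, less one, less one estimated parameter). The function name and the genotype counts in the example calls are hypothetical.

```python
import math

def hwp_chi_square(n0, n1, n2):
    """Pearson chi-square statistic for agreement of counts with HWP.

    Assumes both alleles are present in the sample, so that no expected
    count is zero.
    """
    n = n0 + n1 + n2
    q = (2 * n0 + n1) / (2 * n)   # sample proportion of A alleles
    p = 1 - q
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [n0, n1, n2]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return q, stat, p_value

# Hypothetical counts: the first sample is exactly in HWP, the second is not.
print(hwp_chi_square(16, 48, 36))
print(hwp_chi_square(30, 40, 30))
```

A small statistic says only that the counts are compatible with HWP; as argued above, it carries no information about whether the mating matrix is of RM or NRM type.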
Haldane (1954) based his own ‘exact test’, that is, for not necessarily large n, on the characteristic of HWE expressed by the identity, in our notation:
By a complex calculation, Haldane shows that the quantity $4n_0n_2 - n_1(n_1 - 1)$ has expectation zero, leading to a test criterion D, which he calls a measure of divergence, defined by
To test D, he needs its sampling variance, which after more calculation he finds to be
The argument gets even more complicated when it comes to using D and its variance in practical settings. Haldane gives an example using data from a population of the scarlet tiger moth Panaxia dominula; $n_0$ is the number of the homozygous type dominula, $n_1$ the number of the heterozygote medionigra, and $n_2$ the number of the homozygous bimacula. Haldane concludes ‘. . . there is no evidence for a systematic tendency . . . from random mating’ (p. 634).
Mayo (2008) gives a wide-ranging discussion of applications of the HW principle, such as Haldane's example.
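Haldane's calculation is not reproduced here, but the zero-expectation property of the quantity formed from four times the product of the homozygote counts less the heterozygote count times itself minus one can be checked by brute force. The sketch below assumes multinomial sampling from a population in HWP and enumerates every possible sample; the function name and the values n = 10, q = 0.3 are illustrative choices.

```python
from math import comb

def expected_haldane_quantity(n, q):
    """Exact expectation of 4*n0*n2 - n1*(n1 - 1) under multinomial HWP."""
    p = 1 - q
    probs = (q * q, 2 * p * q, p * p)
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            # Multinomial coefficient n! / (n0! n1! n2!)
            ways = comb(n, n0) * comb(n - n0, n1)
            prob = ways * probs[0] ** n0 * probs[1] ** n1 * probs[2] ** n2
            total += prob * (4 * n0 * n2 - n1 * (n1 - 1))
    return total

print(expected_haldane_quantity(10, 0.3))  # zero up to rounding error
```

The result follows from standard multinomial moments: both terms have expectation proportional to n(n − 1), with factors that cancel exactly when the class probabilities are in HWP.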
So, Is Random Mating Virtual Rather Than Real?
We are clearly inclined to answer in the affirmative to the question posed by the heading of this section, despite statements like the following by Hartl and Jones (2006, p. 503), taken from their introduction to the Hardy–Weinberg law:
When a local population undergoes random mating, it means that organisms in the local population form mating pairs independently of genotype. Each type of mating pair is formed as often as would be expected by chance encounters. Random mating is by far the most prevalent mating system for most species . . . .
They acknowledge that this cannot be true of self-fertilizing plants.
Our preceding discussion requires a comment on the biological significance of the condition (4), which appears in our examples of NRM systems in which, nevertheless, HWE is maintained. This condition on the mating matrix is required for the stationarity of genotypic frequencies over time, whether these frequencies are in HWP or not. It always holds under RM, in which case it is just an expression of HWP. Experimental verification of (4) and HWE would therefore not preclude RM.
The biological issue is, rather: how, under HWE of observed genotypes (established, say, by the test in our previous section), can one test for NRM? Given the HWP, so (7) is satisfied, and observed mating frequencies $f_{ij}$, a test could be devised for the null hypothesis $H_0: f_{ij} = f_i f_j$, i, j = 0, 1, 2, that is, that mating is random. If on the basis of the observed mating frequencies, $H_0$ is rejected, we would have evidence in support of HWE being maintained by NRM. In the event, we would expect both (4) and (7) to hold.
The key issue here is the availability of experimental observations on mating frequencies, $f_{ij}$.
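One way such a test might be set up, offered only as a sketch and not as a procedure from the literature cited here, is Pearson's chi-square test of independence applied to a 3 × 3 table of observed mating-pair counts. It estimates row and column margins separately, giving four degrees of freedom. The counts below are hypothetical, obtained by scaling an NRM matrix with HWP marginals {0.16, 0.48, 0.36} to 10,000 couples.

```python
import math

def mating_independence_test(counts):
    """Chi-square test of independence on a 3 x 3 table of mating-pair counts."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    stat = 0.0
    for i in range(3):
        for j in range(3):
            expected = row_sums[i] * col_sums[j] / total
            stat += (counts[i][j] - expected) ** 2 / expected
    # Upper tail of chi-square with 4 degrees of freedom (closed form).
    p_value = math.exp(-stat / 2) * (1 + stat / 2)
    return stat, p_value

# Hypothetical counts of couples by genotype pair (AA, AB, BB).
nrm_counts = [[352, 576, 672], [576, 2688, 1536], [672, 1536, 1392]]
rm_counts = [[256, 768, 576], [768, 2304, 1728], [576, 1728, 1296]]

print(mating_independence_test(nrm_counts))  # large statistic: RM rejected
print(mating_independence_test(rm_counts))   # statistic 0: consistent with RM
```

Both tables have the same genotype marginals, so a genotype-count test for HWP could not tell them apart; only the mating-pair counts distinguish the two regimes.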
Some small progress in this practical direction has been made.
Leach and Mayo (2005) point out that in some forest trees there appears to be quite a high level of inbreeding, which nonetheless might yield HWE. Neel et al. (1964) and Fraser et al. (1969) have discussed the remarkable presence of HWE in situations where populations contravene the usual postulates under which equilibrium might be expected.
Sebro et al. (2010) acknowledge right from the start that a population that they are studying may be structured. They set up the following model: ‘Consider a stratified population comprised of G separate subpopulations, where G, as well as the actual members of each subpopulation, are unknown . . . . We assume that there is random mating and HWE within each subpopulation, but no mating between subpopulations’ (p. 674). The first part of the title of their paper is ‘Testing for non-random mating’. Having found evidence of ‘ancestrally related positive assortative mating’, the final sentence of the abstract says: ‘This non-random mating likely affects genetic structure seen more generally in the North American population of European descent today, and decreases the rate of decay of linkage disequilibrium for ancestrally informative markers’ (p. 674).
The analysis of Sebro et al. requires that they calculate the mating-type frequencies [our italics] in the presence of population stratification. They develop a method and suggest that it is simpler than that of Yasuda (1968). Stark (2008) gives mating frequencies for a model that has similar characteristics to that of Sebro et al., but like Yasuda’s, involves third and fourth moments of the distribution of gene frequencies over the structured population, so is rather complicated. The lesson from the paper of Sebro et al. is that researchers may have to ‘bite the bullet’ of population stratification if they want to make realistic analyses. Differences in gene frequencies, sometimes quite large, over ethnic groups, have been reported many times.
Closing Remarks
Identities (4) and (7) have been shown to be crucial to the understanding of HWE. Experience has shown that HWP are a satisfactory basis for interpretation in many situations, but there is less justification for appeal to RM as an explanation.
Various experts have written more generally about HW. Weir (1996) states ‘neutrality is one of the sufficient conditions for HWE, but not a necessary condition. Li (1988) showed that HWE proportions can also be found for various non-random mating situations’ (p. 276). He does not pursue this further.
What seems to us as nearly a metaphor for our intentions is from Orr (1966):
We evolutionists have a long track record of preferring fancy over simple theories, dating from our infamous reluctance to surrender GALTON in the face of Mendelism. Surely something as déclassé as 3:1 ratios was not to be preferred to GALTON's sophisticated and seductive mathematics. (p. 1333)
But we leave the last word to Mayo (2008):
Li (1988), followed and elaborated by Stark (2006a, 2006b), showed that panmixia is not the only breeding structure that can yield HW proportions, so that panmixia is a sufficient but not a necessary condition for HWE. However, no natural population is known to manifest the other possible breeding structures so that it appears unlikely that they need to be considered in data collection and analysis. HWE continues to be an important starting point for any population analysis. This will indeed be true even when what is being analyzed is something that must initially disrupt the regularity of the meiotic processes that provide the basis for HWE . . . . (p. 253)
We leave it to the reader to judge.
Acknowledgments
We are grateful for a referee's constructive comments, and for pointing out a number of lines of research to be followed. More specifically, if Hardy–Weinberg proportions are maintained in an actual situation, but random mating is an unrealistic assumption, how does one account for the observations?