Whether they believe in IQ or not, most people sense that individual differences in intelligence are substantial and at least partly ‘genetic’. The nature–nurture debate about the origins of such differences goes back a long way, at least as far as the philosopher-scientists of Ancient Greece. And most people have probably adopted common-sense views about it for just as long. It is evident today in popular clichés: our genetic blueprints set levels of potential, while nurture determines how much of that potential is reached; individual differences result from both genes and environments; genes and environments interact to determine individual differences; and so on.
Most of this chapter is about how biologists and psychologists have studied the origins of such differences. I hope to show that reaching out to biology has been far more concerned with vindicating a particular view of individual differences than with understanding what intelligence really is. In consequence, intelligence has been researched biologically rather as the agriculturalist studies crop growing, or the animal breeder seeks to boost specific traits. Individual differences have been treated as variation in a physical ‘strength’ or ‘power’, like the speed of greyhounds or the spring of thoroughbred hurdlers. The approach has left us with narrow views of both genes and environments, and little understanding or agreement about what intelligence really is.
At the present time, there is something almost evangelical about the new faith in genes. For several decades many IQ testers have called themselves ‘behaviour geneticists’. Now they’re swinging DNA at us like a demolition ball, and creating a frenzy in the popular media: ‘Being rich and successful is in your DNA’ (Guardian, 12 July 2018); ‘A new genetic test could help determine children’s success’ (Newsweek, 10 July 2018); ‘Our fortune telling genes’ predict our future intelligence (several places); and so on.
There is little modesty about it. Authors marshal every hyperbole: breakthrough, game-changer, exciting advances, and so on. Promises of a ‘new genetics’ of intelligence are thrown like confetti. Critics are condemned as ‘gene deniers’, ‘science-deniers’, or even (in a further hint of ideology) ‘race-deniers’, while the message has caught fund-raisers’ eyes, pervaded institutions, and reached government advisers and prime ministers.
Why the exuberance? ‘Mountains of data’, say the IQ champions, from lots of new studies. What that means, in fact, is mountains of correlations, derived from questionable measures and statistical models based on unlikely assumptions.
In this chapter I look at the real nature of those data and the statistical models through which they are filtered and formed. They are debatable, but they are also instructive about the understanding of intelligence underlying them. In Chapter 3 I will lay the foundations of a real theory of real intelligence. Its many ramifications fill the rest of the book.
An Agricultural Model
That ‘something’ is transmitted from parents to offspring – something that can make differences in us, in particular traits – is obvious. There is also some predictability about it. Farmers and animal and crop breeders have exploited this fact for millennia, and we have much to thank them for. The apparent transmission is not straightforward, like the social inheritance of wealth. Offspring bear similarities to, and differences from, their parents and each other. She may have her mother’s nose, but maybe not her mother’s eyes or mouth, rather her father’s – or even features different from both parents. We settle for terms of approximation like ‘family resemblance’. Sometimes the effects of environment on some traits, as with nutrition and exercise on growth, are equally obvious.
The question for farmers and breeders – and what has also dominated the subject of human intelligence – was how to distinguish the effects of different genes from those of different environments. In agricultural circles in the early twentieth century the issue had become a pressing economic one, and the quantity at stake came to be called ‘heritability’. Suppose variation in milk yield between cows is largely associated with genetic variation (there is a high ‘heritability’). Then, it was argued, selecting for breeding those individuals who already exhibit high milk yield may boost the average yield in offspring. If the variation is estimated to be mainly due to environmental variation (there is low heritability), selective breeding will not make much difference.
The solution to the issue was proposed by the statistician Sir Ronald Fisher in a paper of 1918. Genes could not actually be ‘seen’, but their effects, Fisher argued, could be inferred from the patterns of resemblance among known relatives. That seems logical enough, but such patterns can also arise from environmental similarities and differences. How do we distinguish between them? Fisher’s proposal was exceedingly bold because of the assumptions it entailed.
Gregor Mendel’s experiments (rediscovered around 1900) led to the realisation that genes come in pairs, one from each parent. The members of each pair are called ‘alleles’ and, as received from parents, may be the same or different. Fisher’s solution was basically as follows. Let’s pretend that each allele has a small, fixed effect on variation, positive or negative. Also assume that individual differences are due to the small effects of many genes and their respective alleles. Then, he suggested, the total ‘genetic’ effect is simply all those independent effects added together. That means assuming that genes vary like little electrical charges, positive or negative. Individual differences in a trait then reflect different sums of those charges (Figure 2.1).
That makes a very simple picture. The ‘strength’ of an individual’s intelligence, say, lies in the particular permutation of strong or weak alleles (at least of those alleles that make a difference). So behaviour geneticists speak of ‘intelligence-enhancing alleles’ and ‘intelligence-depleting alleles’. More importantly, the picture can be used to determine heritability: how much different permutations are associated with differences in the trait. We now know that members of any species (humans or other animals) share the vast majority of their genes and their alleles. And most of those that vary don’t make a difference to the trait anyway. What matters is those few that do and how much difference they make. That’s heritability, usually expressed as a ratio between 0.0 and 1.0, or 0–100 per cent.
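For readers who like to see the logic in symbols, here is a minimal sketch of the additive scheme in modern textbook notation (not Fisher’s own):

```latex
% A simplified rendering of the additive model. An individual's trait value P
% is a population mean plus a sum of small, independent allelic effects a_i
% (positive or negative), plus an environmental deviation e:
P = \mu + \sum_{i=1}^{n} a_i + e
% Trait variance is then assumed to split into independent parts,
V_P = V_G + V_E ,
% and 'heritability' is simply the genetic share of the total variance:
h^2 = \frac{V_G}{V_P} = \frac{V_G}{V_G + V_E}
```

Everything that follows in this chapter depends on those two assumptions: that the little effects add up, and that the genetic and environmental parts are independent.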
Of course, at that time, Fisher could not ‘see’ the genes, nor the degree to which individuals share them. But the degree of sharing could be estimated by comparing known relatives. In crops and animals, the degree of genetic sharing (or correlation) is known from their pedigrees, or breeding histories. In humans it is known from family relations. It was already known that identical twins (monozygotic, or MZ) come from one egg. So each pair can be presumed to share all their genes. If trait variation in a population is due only to different genes, MZ pairs aren’t expected to differ at all: correlations in trait measures should be around 1.0. Non-identical twins (dizygotic, or DZ) develop from two eggs. In consequence, they will share, on average, only half of the variable genes, so the trait correlation should be around 0.5. The same applies to parent–offspring pairs, and to pairs of non-twin siblings.
In summary, Fisher argued that the total genetic effect on variation – the heritability – can be estimated from genetic correlations among relatives. To be clear, if the estimated heritability for any trait is 1.0 (or 100 per cent), all the trait variation is associated with genetic variation. Anything less than 1.0 is attributed to environmental effects. Fisher himself was quite sure that ‘figures for human measurements shows that there is little or no indication of non-genetic causes’.
In that way an agricultural model of individual differences descended on the understanding of humanity, including (as we shall see) our understanding of intelligence. It all sounds highly plausible, if not brilliant. But, even then, Fisher noted possible complications. One of those was the assumption that genes exist as autonomous agents, dictating their difference-making effects. Another is the possibility that the gene effects might not just add together in that neat way. The effects of one gene might depend on the effects of others (called gene–gene interaction). Also, their effects may be different in different environments (called gene–environment interaction).
The other complication is that trait differences are also affected by different environments. That’s not a problem with cows in uniform fields, wheat in field plots, and so on. In those cases, effects of any environmental differences are assumed to be spread randomly across individuals; they don’t sway the estimates one way or the other. But it’s not so easy for humans in socially structured environments.
Fisher’s heritability estimations did turn out to be useful in agricultural settings. More crucially, they laid the foundations of ‘quantitative’ genetics that swept biology and psychology. This still dominates today’s thinking about sources of individual differences and how to estimate them, including recent DNA sequencing efforts. In the human context it has been epoch-making in drawing conclusions about nature and nurture, even guiding social and educational policy, as mentioned in Chapter 1.
Before taking you further down that highway, and seeing the busy traffic on it, I want to warn you of some more bumps. Readers should be aware that Fisher’s model is still what passes for the ‘genetics’ of intelligence today. More recent research, however, has shown the foundations of the model – Fisher’s assumptions – to be quite unsound, especially with more complex traits and functions. To give you some flavour of those doubts, here are some recent assessments by today’s geneticists.
[A]lthough quantitative genetics has proved highly successful in plant and animal breeding, it should be remembered that this success has been based on large pedigrees, well-controlled environments, and short-term prediction. When these methods have been applied to natural populations, even the most basic predictions fail, in large part due to poorly understood environmental factors … Once we leave an experimental setting, we are effectively skating on thin ice.
Quantitative genetics traces its roots back through more than a century of theory, largely formed in the absence of directly observable genotype data, and has remained essentially unchanged for decades … the available molecular evidence indicates that biological systems are anything but additive and that we need to evaluate alternative ways of utilizing the new data to understand the function of the genome.
Petter Portin and Adam Wilkins, in 2017, said that recent developments in genetics ‘raise questions about both the utility of the concept of a basic “unit of inheritance” and the long implicit belief that genes are autonomous agents’. They noted that ‘the classic molecular definition [is] obsolete’, and that we need ‘a new one based on contemporary knowledge’.
Significantly, too, reflecting on it all in 1951, Fisher himself questioned ‘the so-called co-efficient [i.e. measure] of heritability, which I regard as one of those unfortunate short-cuts, which have often emerged in biometry for lack of a more thorough analysis of the data’.
Note that such realities – especially the now well-known interactions between gene products, and between them and different environments – compromise the very idea of heritability in complex traits. In particular, they indicate how ‘heritability’ is not a fixed aspect of individuals or of populations. But I will return to that later. Unfortunately, the idea has dominated the genetic study of human intelligence, and it still does. The IQ test was built around it. It pervades popular thinking about nature and nurture in intelligence. And nearly all research in the genetics of IQ has consisted of a grim spinning out of its flawed preconceptions. What follows is a brief narrative on the methodological contortions needed to prop it up.
Cyril Burt’s Twin Correlations
It was the determined educational psychologist Cyril Burt who brought Fisher’s ‘solution’ into the domain of human mental abilities. Burt had already been active in IQ testing and was our ‘adviser’ on that game-changing education report mentioned in Chapter 1. He worked hard to apply Fisher’s model to human intelligence, where control of environmental effects is not so easy. The simplest approach, he realised, would be to compare identical (MZ) twins who had been reared apart. Because such twins do not share the same homes, neighbourhoods, and so on, any correlation between them due to environmental effects should – or so it seemed – be eliminated. The average correlation between such pairs of twins could, in theory, be a direct estimate of heritability.
In an influential set of papers in the 1950s, Burt claimed to have done just that and to have measured the IQs of twins reared apart. He arrived at an estimate of the heritability of IQ of 0.83. This means that, according to Burt, 83 per cent of individual differences in IQ are associated with genetic variation; only 17 per cent with differences in experience.
Here, however, things get murky. Separated identical twins are relatively rare, and Burt seems to have been suspiciously lucky in finding so many. Starting with 15 in 1943, he added 6 pairs for a total of 21 in 1955, finishing with a total of 53 pairs in 1966. Strangely enough, the correlations he reported across all three sets remained exactly the same at 0.77. When psychologist Leon Kamin checked Burt’s data and uncovered this anomaly in 1971, it was soon being described as ‘The most sensational charge of scientific fraud this century.’ Burt, by then, was dead and so unable to answer the charge. But his attempts to prove his strong hereditarian views are no longer recognised, even by supporters.
Other Twins Reared Apart
The Minnesota Study of Twins Reared Apart (MISTRA) was started in the late 1970s by psychologist Thomas Bouchard and colleagues. It ended with 81 pairs. The results, reported in prestigious journals and books, along with sensationalised magazine stories, caused a stir among scientists and general publics around the world. Bouchard and colleagues consistently reported high average correlations between pairs of twins in IQ and other tests, even though reared apart. That, they claimed, proves a heritability for IQ of 0.75. In 1993, psychologist Robert Plomin wrote about ‘the powerful design of comparing twins reared apart’, suggesting ‘a heritability for g of 80%’. For psychologists, the results challenged some cherished views about human development. They inspired the controversial theory of Arthur Jensen and others on the causes of ‘racial’ and social class differences in intelligence.
However, there are many dubious aspects of these data, too. They have been subjected to forensic dissection by investigators like Leon Kamin and Jay Joseph. I focus on a few of the main problems, just to illustrate the scientific latitude common to this field.
Really Reared Apart?
The first and biggest problem concerns whether those twins were truly separated. The twins obviously shared an environment in the womb before birth, which can have lasting effects on development throughout life. But even after birth, this was no rigorous research design. Authors of the MISTRA study have said that arrangements for separation ‘were sometimes informal’, and that at least some of the pairs had spent as much as four years together before being separated. Many of the twins were simply brought up by grandparents, aunts, or cousins, and some spent considerable amounts of time together even after separation. Further analyses reveal how the reports stretch the usual meaning of the word ‘separated’ beyond credibility.
There is also bias in other ways. In the age of opinion polls, we all know the importance of samples being representative. Twins generally tend to be self-selecting in any twin study. Many of the MISTRA twins were recruited through media appeals. Others were prompted to volunteer by friends or family, on the grounds that they were alike. And some knew each other prior to the study. As Jay Joseph has suggested, twins that volunteer may well be more similar than the average.
There have been other problems. They include inadequate or incomplete reporting of important details, and denying other investigators access to data (data sharing being a fundamental tenet of research). I recommend Jay Joseph’s famous blog for an exhaustive exposé of these problems.
‘Classical’ Twin Studies
Separated twins are rare, so the main approach has been to compare pairs of MZ twins with pairs of DZ twins reared together. The degree to which their trait resemblances correspond with their genetic resemblances (i.e. 1.0 versus 0.5) has also been used as an index of heritability. This is the ‘classical twin method’. Since the 1930s, it has been the workhorse of those who want to estimate the heritability of IQ. There have been dozens of such studies, and a consistent pattern of correlations has been reported for IQ: usually higher for MZ twins than for DZ twins. Some have combined correlational data into more complex statistical models, but these make the same assumptions and end up saying much the same thing. The estimated heritability is reported to be around 0.5 – meaning that 50 per cent of differences in IQ are associated with genetic differences.
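The arithmetic behind that estimate is worth seeing, because it is so simple. In its plainest textbook form (often credited to the geneticist Douglas Falconer) it runs as follows – a sketch, assuming the correlations and the equal-environments logic described below:

```latex
% The classical twin calculation in its simplest textbook form
% ('Falconer's formula'). Given observed trait correlations r_MZ and r_DZ:
h^2 = 2\,(r_{MZ} - r_{DZ})   % 'heritability'
c^2 = 2\,r_{DZ} - r_{MZ}     % shared ('common') environment
e^2 = 1 - r_{MZ}             % non-shared environment
% Illustrative figures: r_MZ = 0.85 and r_DZ = 0.60 give
% h^2 = 2(0.85 - 0.60) = 0.5, the kind of value reported for IQ.
```

Note that everything hangs on the difference between two correlations, and on the assumptions buried in them.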
Those are the results producing fanfares about genes and IQ, almost always inferring causal connections, and much hyperbole. ‘The genetic contribution is not just statistically significant’, says Plomin, ‘it is massive. Genetics is the most important factor shaping who we are. It explains more of the psychological differences between us than everything else put together.’ Other leading authors have followed in his wake. ‘They show’, says Kevin Mitchell in Innate, ‘that genetic differences contribute substantially to individual differences in psychological traits’. In his book Human Diversity, Charles Murray says, ‘no one who accepts the validity of twin studies finds reasons to dispute them’. Surely, the chorus enjoins, so many studies cannot be wrong?
More False Assumptions
But they are wrong, critics say: so wrong that the classical twin study can be described as one of the most misleading methods in the history of science. The main problem is this. Simply taking the difference between MZ and DZ correlations as an index of genetic effects makes a big assumption: that the environments experienced by MZ pairs are no more similar during development than those experienced by DZ pairs. This is called the equal environments assumption (EEA). If it is false, the higher MZ correlation could be due to more similar environments, not genes.
The assumption is demonstrably false, and the problem insurmountable. There is overwhelming evidence that parents, friends, and teachers treat MZ twins more similarly than DZ twins. MZ twins are more likely to dress alike, share bedrooms, friends, activities, and so on. In his major review of the EEA, Jay Joseph cites dozens of findings revealing very large differences between MZ and DZ pairs in experiences such as identity confusion, being brought up as a unit, being inseparable as children, and having an extremely strong level of closeness. It is also known that parents hold more similar expectations for their MZ than DZ twins. ‘[T]win researchers and their critics don’t have to argue anymore about whether MZ and DZ environments are different, since almost everyone now agrees that they are different’, Joseph says.
These effects will, of course, contribute to correlations between MZ twins. What is assumed to be ‘genetic’ may, in fact, be ‘environmental’ in origin. Some investigators claim to have assessed the importance of those treatment effects. But how can they? They would first need to know what the ‘trait relevant’ environments are. And they don’t know that because they don’t know what intelligence really is, nor how it develops. So the problem remains. Those who continue with twin studies just tend to ignore it.
But that’s not the only problem. Usually overlooked is that parents often do the opposite with DZ twins. They tend to exaggerate any differences, leading to stereotypes taken up by family and friends, and ‘lived up to’ by the twins themselves. More crucially, many DZ twins consciously strive to make themselves different from one another. Studies show how they cooperate less and exhibit more competition or rivalry. DZ twins also tend to be physically more distinct than MZ twins, and it is known that appearances affect others’ perceptions and reactions. Moreover, the development of polarised identities can mean different interests and activities, which in turn can influence schoolwork and IQ test performance.
Such effects are clearly evident in published twin correlations, even if they’re brushed aside. What is being reported as ‘genetic’, with high heritability, can be explained by difference-making interactions between real people. In other words, parents and children are sensitive, reactive, living beings, not hollow mechanical or statistical units. The consequences are likely to pervade all twin studies, and, as ever, they can grossly mislead. For example, behaviour geneticists, using the simple agricultural model, have been unable to explain even why children in the same family are so different from one another. Furthermore, these difference-making effects are likely to increase with age. In his book Blueprint (and many other places), Robert Plomin makes a big deal of this, declaring it ‘one of the big findings from behavioural genetic research … genetic influences become more important as we grow older’. The findings are, however, more likely to be illusions arising from poor methodology and its unlikely assumptions.
In sum, there are so many unlikely assumptions and biased data interpretations entailed in twin studies as to render them unsuitable as a basis for scientific conclusions. But there is one other aspect of the research that warrants mention here.
Make-Do Data
Unfortunately, trying to describe the genetics of intelligence using the Fisher/Burt model of genetics has also encouraged a make-do research culture. As Plomin’s team put it about their Twins Early Development Study: ‘In-person cognitive testing of the large, geographically dispersed TEDS sample has not been feasible.’ Instead, the study used a mixture of shortened forms of tests, administered by parents in the home using posted booklets, testing by telephone, and via the internet.
Other twin studies have been just as bad, if not worse, in that respect. It is no use saying that these have been ‘validated’ through correlation with other tests: those tests have questionable validity, too. The problem is that the less controlled the testing, the more correlations can be explained by non-cognitive factors, or other errors, and the more open they are to misinterpretation.
Adopted Children’s IQs
A lesser, but still much cited, device of behaviour geneticists has been the study of adopted children. As with twin studies, the logic is beguilingly simple. It consists of comparing the IQs of adopted children with IQs of both adopted parents and their biological parents. As the theory goes, adopted children share nature with their biological parents, but nurture only with their adoptive parents, so the effects of the two can be disentangled. What a brilliant idea, you might think. Indeed, results have shown some consistency. Adopted children tend to resemble their biological parents more than their adoptive parents. The correlations are mostly very small on either side (around 0.2–0.3), and they vary from study to study and test to test. But the pattern is there.
Psychologists say they’re amazed by this proof of the role of genetics. But interpretation is not so straightforward. The idealised view of the design, based on the agricultural model, is one problem. As developmental psychologist Jacquelyne Faye Jackson put it in the journal Child Development, the model is ‘engaging because of its simplicity. However, it is critically incomplete as a model of what actually happens in family life.’
Within families there can be many uncontrolled effects – conscious or unconscious – that lead to adoptees being treated differently from other family members. Some adoptive parents worry about the personalities, ‘bloodlines’, and social histories of the child’s biological parents, and how that might affect them. Adoptive parents also tend to hold stronger beliefs in the influence of heredity compared with non-adoptive parents. Incredibly, in some studies, prior contact between biological and adoptive families occurred. As Sandra Scarr and colleagues put it in a 1980 article, ‘Adoptive parents, knowing that there is no genetic link between them and their children, may expect less similarity and thus not pressure their children to become like their parents.’ Other parents may make deliberate efforts to enhance the differences. Again, we are dealing with reactive people, in dynamic contexts, not the passive circumstances the researchers assume.
There are also many reasons why IQs between adopted children and their biological parents can correlate. One of these is that placement of children for adoption has rarely been random. From their knowledge of the biological parents, adoption agencies tend to have preconceived ideas about the future intelligence of the child, and place him or her in what they think will be a compatible family environment. There is strong evidence of such ‘selective placement’ in adoption studies for IQ. Even within the crude model, what is environmental is again being read as ‘genetic’.
Most obviously, adopted children spend the first formative period of their lives in their biological mothers’ wombs. We now know of many ways in which that experience (even stress experienced by the mother before pregnancy) can affect both mother and child throughout life. That includes, for example, general vitality and reactions to stress, including test-taking. In addition, there is continuing physical resemblance. Much research has shown that appearances, such as facial attractiveness or height, influence how individuals become treated by teachers and other people. The similarity between biological parents and their children in such respects may mean similar effects for their self-esteem, confidence, learning aspirations, and readiness for IQ test-taking. Again, that may lead to the (weak) correlations reported.
One pervasive finding of adoption studies is that adopted children actually end up resembling their adoptive parents in average IQ far more than their biological parents. Big increases of up to 15 IQ points have been reported, compared with children not adopted (and therefore remaining in the original social class background). That is a salutary reminder of the misleading power of the correlation coefficient.
Unfortunately, adoption studies, too, have been plagued by a make-do testing culture, with a disparate assortment of tests and procedures passing for ‘intelligence’ tests. In one recent study, for example, no test results were available for the adopted children at all, so the researchers simply used ‘years of education’ as a surrogate. Again, this is important because it opens up the resulting correlations to all the uncontrolled factors mentioned above. But this is merely a brief summary of the problems with twin and adoption studies. Attempts to use the essentially agricultural model continue to this day, in full knowledge of its false assumptions. And they have reached new heights of genetic illusion with the recent DNA sequencing research.
DNA: The Genie Out of the Bottle?
In twin and adoption studies researchers cannot actually ‘see’ the genes whose effects they claim to be measuring. The research is based on statistical models, not direct reality. That changed when genes became identified with DNA that could be extracted from cells and analysed in the laboratory.
DNA consists of long sequences of molecules called nucleotides. They differ in that each contains one of four chemical components: adenine (A), thymine (T), cytosine (C), or guanine (G). The sequences, or strands, come in pairs (the famous double helix), such that A pairs with T and C pairs with G. Each gene consists of many thousands of these nucleotides, with different genes comprising different sequences. The different sequences are used as templates for the assembly of different strings of amino acids. It’s those strings that make up the different proteins in cells (actually through a ‘handing over’ of the sequence via an intermediate, called messenger RNA).
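The pairing rule itself is mechanical enough to put in a few lines of code – a toy illustration only, not a bioinformatics tool:

```python
# Toy illustration of DNA base pairing: A pairs with T, C pairs with G,
# so one strand fully determines its partner.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand: str) -> str:
    """Return the strand that pairs with the given one."""
    return "".join(PAIR[base] for base in strand)

print(complementary_strand("ATTCGGAC"))  # prints TAAGCCTG
```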
Nearly all – at least 99 per cent – of our genes are identical from individual to individual. But occasionally one of the nucleotides in the sequence has been replaced with another. Instead of the same nucleotide at that location across the population, we get a ‘single nucleotide polymorphism’, or SNP. Different SNPs might mean different proteins, and altered function, in different individuals, although the vast majority of these genetic variants have no known phenotypic consequence; they are ‘neutral’. Nevertheless, it’s those SNP differences, and their possible association with IQ differences, that have been the focus of attention (Figure 2.2). Brilliant advances in molecular biology have made it possible to determine which versions of SNPs different individuals actually have. Procedures have developed into an industrial-scale enterprise, done by machines and computers at high speed and rapidly falling cost. And it only requires a drop of blood or a mouth swab. This is what the Human Genome Project has been about.
It immediately occurred to behaviour geneticists that the limitations of twin and adoption studies could at last be overcome. All we have to do is measure people’s IQs, get a molecular biologist to sequence for SNPs, and so discover any associations between them. The deep faith has been that this will identify ‘genes for intelligence’, and who does or does not have them. Again, reports were soon being laced with terms like ‘exciting’, ‘breath-taking’, and ‘momentous shift’, with Robert Plomin declaring the gene ‘genie’ to be out of the bottle. Over the last two decades dozens of such genome-wide association studies (GWAS) have been conducted, all confident that the crucial associations would be found. The old problem about correlations not being causes lurks in the background, but is forgotten in the mists of excitement. Instead, you will read or hear ‘linked to’ quite a lot – a term both ambiguous and suggestive.
It hasn’t happened, anyway. Statistically reliable correlations have been few and minuscule, and were not replicated in repeat investigations. To date, no gene or SNP has been reliably associated with even the normal range of IQ (overlooking the problem of correlations and causes, or of knowing what IQ really measures). The disappointments were almost palpable, and some researchers talked of abandoning the chase. In his book Blueprint, Robert Plomin says, ‘I pondered retirement … My misery about these false starts had lots of company, because many other GWA studies failed to come up with replicable results. The message slowly sank in that there are hardly any associations of large effect.’
Polygenic Scores
Then another idea emerged. Individual associations may not be evident in the way expected because they’re too weak. But why not just add the strongest of the weak ones together until a statistically significant association with individual differences is obtained? It’s a sort of ‘never mind the quality, feel the width’ solution. Such ‘polygenic scores’ have now taken over the enterprise. And some (albeit still weak) correlations with IQ scores have been reported. Perhaps the most cited study is that of James Lee and colleagues, published in the journal Nature Genetics in 2018. They had to use extraordinary strategies (see below), but reported a polygenic score associated with around 10 per cent of the variation in ‘years of education’, taken as a measure of intelligence. But, it is claimed, the strategy at least showed the ‘importance’ of genetics for IQ. That, after all, has been the aim all along.
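For the curious, the arithmetic of a polygenic score is nothing more than a weighted sum. Here is a minimal sketch; the SNP names, effect sizes, and genotypes are invented for illustration, not taken from any real study:

```python
# Minimal sketch of a polygenic score. For each SNP, count how many copies
# (0, 1, or 2) of the 'effect' allele a person carries, multiply by that
# SNP's estimated effect size from a GWAS, and add it all up.
# All names and numbers below are hypothetical illustrations.

gwas_effect_sizes = {"snp1": 0.02, "snp2": -0.01, "snp3": 0.005}

def polygenic_score(genotype):
    """genotype maps SNP id -> count of effect alleles (0, 1, or 2)."""
    return sum(beta * genotype[snp] for snp, beta in gwas_effect_sizes.items())

person = {"snp1": 2, "snp2": 1, "snp3": 0}
print(polygenic_score(person))  # 2*0.02 + 1*(-0.01) + 0*0.005 = 0.03
```

Real scores sum over hundreds of thousands or millions of SNPs, but the principle – simple addition of estimated ‘effects’ – is exactly the same.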
Again it all seems beguilingly impressive. Practical applications were soon being proposed. Robert Plomin has suggested that polygenic scores should be obtained from all infants at birth, so that they could predict possible problems in school. Never mind the 11-plus: here’s the zero-plus exam. Other authors – in an echo of past eugenics – have proposed their use for embryo selection for intelligence.
Impressed by possibilities, the Joint Research Centre of the European Union has suggested that ‘Genetic data could potentially also be of interest to employers, e.g. to screen potential job candidates and to manage career trajectories.’ As they say, some laboratories already offer genetic tests to companies for such purposes. And clever marketing has seen millions of people scampering to learn their genetic horoscopes in DNA self-testing kits (crossing palms with silver in the process). So mesmerising has the story been that the flaws have been side-stepped. But all the research with GWAS/polygenic scores involves a lot of assumptions.
Or Just Another Damp Squib?
There has been an enormous counterblast against polygenic scores as the latest in a long line of snake-oil breakthroughs. DNA expert Keith Baverstock has described them as ‘fishing expeditions’: looking for correlations between SNPs and complex individual differences, and hoping (without any evidence whatsoever) that they are causes. To understand these issues, just bear a few figures in mind. One of the early surprises of the Human Genome Project was the number of genes that humans possess. Initially it had been assumed that, since proteins do most things necessary in the body, and there are at least 100,000 of them, we should have a similar number of genes. It turns out that we have around 20,000.
Next, consider the number of nucleotides that make up the genes. There are at least three billion of them in the human genome (actually six billion, because we get a copy from each parent). The vast majority of these – about 99 per cent – are identical from person to person (that’s also a measure of how genetically alike we are). On average, an SNP – the nucleotides that may vary, as in Figure 2.2 – occurs only once in every 300 nucleotides. But that means there are many millions of SNPs (estimates vary) in the human genome.
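The rough arithmetic, from the figures just given:

```latex
% One variable site per ~300 nucleotides, across ~3 billion nucleotides:
\frac{3 \times 10^{9}}{300} \approx 10^{7}
% i.e. on the order of ten million SNPs in the human genome.
```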
On the other hand, we also know that nearly all these variations are functionally neutral: it doesn’t matter which version you have, they work equally well (I explain why in Chapter 3, but have a look at the NIH website). Trying to separate the relatively few SNPs that supposedly make a difference from those that do not is difficult enough for a well-defined medical condition. But for traits as scientifically nebulous as IQ, using only statistical correlations as evidence, the enterprise seems highly naive. And there are other problems.
First, as with twin and adoption studies, the hunt for statistical associations assumes that the ‘effects’ of SNPs are independent of one another, so that estimated associations between SNPs and IQ can simply be added together to make up the polygenic scores. As mentioned in the quotes above – and explained further in Chapter 3 – such independence is now known to be highly unlikely. Interactions between the products of genes, and between those and different environments, mean that such correlations can be misleading.
Second, the correlations are still very small, and their long-term consistency unknown. So, in order to ‘force’ statistically significant results, investigators have had to swell study samples to include IQs – or some approximations – from hundreds of thousands of individuals. That means ‘pooling’ data from dozens of smaller, but highly disparate, studies under different, rough-and-ready, testing regimes. I hope you get the picture. In clouds of millions of data points of doubtful accuracy, the scope for spurious associations is enormous. In a paper entitled ‘The deluge of spurious correlations in Big Data’ in 2017, mathematicians Cristian Calude and Giuseppe Longo showed that arbitrary correlations in large databases are inevitable. They appear in randomly generated data due to the size of the database, not because of anything meaningful in them.
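Their point is easy to demonstrate for yourself. The following sketch (a toy simulation, with everything generated at random) shows how, with enough variables, ‘significant’ correlations appear out of pure noise:

```python
# Toy demonstration of the 'deluge of spurious correlations': a purely
# random 'trait' and purely random 'genotypes' still yield hundreds of
# nominally significant associations, just because so many are tested.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_snps = 500, 10_000          # toy scale; real GWAS are far larger
trait = rng.normal(size=n_people)       # random outcome: no causes at all
snps = rng.integers(0, 3, size=(n_snps, n_people))  # random 0/1/2 genotypes

# Correlate every 'SNP' with the trait.
snps_c = snps - snps.mean(axis=1, keepdims=True)
trait_c = trait - trait.mean()
r = (snps_c @ trait_c) / (
    np.sqrt((snps_c**2).sum(axis=1)) * np.sqrt((trait_c**2).sum())
)

# |r| > ~0.088 counts as 'significant' at the conventional 5% level for n = 500.
print((np.abs(r) > 0.088).sum())  # typically around 500 of the 10,000
```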
This point was amusingly illustrated in a 2020 paper by population geneticist Andrew Kern. From governmental statistics he tabulated the gross domestic products of 10 European countries plus the USA. He then computed a polygenic score for a common roadside weed that grows in all of them. He found a significant correlation between the weeds’ polygenic scores and GDP. Yes – apparently the weeds’ genes predict a country’s GDP.
Kern illustrated how such spurious correlations are also due to what is called ‘population stratification’. In common weeds it arises (among other things) from ‘genetic drift’: variations in genes (and SNPs) irrelevant to survival just randomly become more frequent in some (sub)populations than in others. In humans it has become prominent for other reasons. All modern societies have arisen from centuries of immigrant streams. Different streams probably carry different sets of such (benign) SNPs. They have also tended to disperse unevenly across different social classes. But different social classes experience differences in learning opportunities, and much about the design of IQ tests, education, and so on, reflects those differences, irrespective of true abilities. Again, the scope for meaningless correlations is obvious and enormous.
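Again, a toy simulation makes the mechanism concrete. In the sketch below (all numbers invented), two subgroups differ both in the frequency of a functionally neutral SNP and in average test score, for purely environmental reasons; pooling them manufactures a ‘genetic’ correlation out of nothing:

```python
# Toy model of population stratification. Two subgroups differ in the
# frequency of a neutral SNP (through drift) and in mean test score
# (through opportunity). Pooled together, the SNP 'predicts' the score.
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # people per subgroup

def subgroup(allele_freq, mean_score):
    genotypes = rng.binomial(2, allele_freq, size=n)  # 0/1/2 copies of the SNP
    scores = rng.normal(mean_score, 15, size=n)       # independent of the SNP
    return genotypes, scores

g1, s1 = subgroup(allele_freq=0.7, mean_score=105)    # subgroup A
g2, s2 = subgroup(allele_freq=0.3, mean_score=95)     # subgroup B

pooled_r = np.corrcoef(np.concatenate([g1, g2]),
                       np.concatenate([s1, s2]))[0, 1]
within_r = np.corrcoef(g1, s1)[0, 1]

print(round(pooled_r, 3))  # clearly positive, despite no causal link
print(round(within_r, 3))  # near zero within a single subgroup
```

Note how the correlation vanishes within a subgroup – which is just what happens to polygenic score associations when they are examined within families, as discussed below.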
Through cultural affiliations and marriage, such spurious associations can persist across many generations. A parallel is found in human surnames, which are inherited with genes. Studies of English surnames find that social class differences established in Norman times (eleventh and twelfth centuries AD) have persisted over as many as 20–30 generations. Even today, individuals bearing elite surnames from Norman and medieval times remain over-represented in the wealthier and better-educated classes in Britain. That may well mean, by the way, that your surname also predicts your IQ (and much else) without the need for hugely expensive polygenic scores.
It has become clear, in other words, that polygenic scores are prone to such distortions, and attempts to correct for them statistically are inadequate. That is evident from looking at polygenic scores within families, where we would expect the effects of population stratification to be reduced (although not eliminated). When that is done, the small associations between polygenic scores and IQs previously reported disappear. Unsurprisingly, then, it has been found that polygenic scores have so little predictive power for educational ability or IQ as to be highly questionable.
Just like IQ tests, polygenic scores may have convincing ‘face’ validity. But, as epidemiologist Cecile Janssens pointed out in a 2019 critique in Human Molecular Genetics, they have no ‘construct’ validity. So we have the amazing spectacle of scientists using a measure of who knows what to seek (minuscule) correlations with another measure of who knows what, at the cost of millions of pounds of public money.
Precision Science?
Yet again, we need to be aware of the make-do testing culture in this area. Investigators talk about ‘IQ’ and ‘intelligence’ testing. But the reality is a parody of precision science. For example, much of the recent research has used the UK Biobank, a dataset from about half a million participants who volunteered their blood for gene sequencing. ‘Fluid intelligence’ of individuals was assessed with a two-minute, time-limited, 13-item test, administered on a computer in a reception centre or, at a later ‘moment’, at home via a web-based application (see ukbiobank.ac.uk, Category 100027). In other studies the pretence of testing has been abandoned altogether, and ‘years of education’ adopted as a surrogate for intelligence. I suspect that pet owners would not accept a measure of their dogs’ intelligence so easily.
In this chapter I have tried to provide a glimpse of the methodological contortions employed in trying to sustain a particular nature–nurture picture. We have covered a lot of ground, but that reflects the determination of the intelligence gene hunters devoted to the agricultural model. Whatever they are busy with, it seems they are simply describing, in obscure scientific terms, the class structure of modern societies and its long history. But the fundamental problem lies in the nature of both the gene and intelligence. I start to elucidate them both in the next chapter.