Two controversies about the extent and cause of group differences have erupted in the past six months. The first emerged from a Vox article about Charles Murray, race, and IQ. The second, of course, concerns the content of James Damore’s “anti-diversity memo” about gender differences in tech and his subsequent firing from Google. Both led to debates contrasting biological or genetic with cultural or social explanations for statistical differences between groups. My aim here is not to re-litigate either debate but to attempt to demonstrate, from a philosophical perspective, where certain participants’ reasoning has gone wrong. In particular, I will avoid evaluating any empirical data here, and instead simply discuss what sort of evidence various kinds of data would constitute.

Ethics, essences, politics

Unsurprisingly, a major concern in debates about group differences is ethics. Are there some ideas we shouldn’t research, or even entertain, because of how toxic they are? Are there empirical hypotheses repugnant enough that we know a priori they’re false, on ethical grounds? I’ll leave such questions aside and instead ask a different one: Assuming there are ethical issues — e.g., of racism and sexism — at stake here, how do the different sides fare against one another?

I take what is, as far as I can tell, a novel position: The sides fare equally against one another. This is because what’s generally taken to have ethical import is one’s beliefs about the data itself rather than one’s beliefs about which explanation of the data is best. Take, for example, the accusation that Damore’s memo implied that his female colleagues did not deserve their jobs. Such an implication could not be escaped simply by recourse to a cultural explanation. “As a woman, you’re worse at programming because of the patriarchy” — no female engineer would find this satisfying! The patriarchy is what tells women they’re bad at things, not what makes them bad at things. Similarly, the Vox authors who inveighed against Murray agreed that “intelligence,” as measured by IQ tests, “is meaningful,” and that “[p]eople who identify as black or Hispanic in the US and elsewhere on average obtain lower IQ scores than people who identify as white or Asian.” But in the eyes of many a layperson this is already a racist view. “As a black person, you’re less intelligent because of white supremacy” — to some, this will hardly make sense. White supremacy is the view that black people are cognitively inferior, not the cause of an actual inferiority. So moral outrage at causal etiology seems misplaced. It is not the explanans (the phenomenon’s posited cause) but the explanandum (the phenomenon itself) that offends: the proposition that women, or black people, or members of whatever targeted group, are lacking or lesser in some way.

Also misplaced is the charge of “essentialism.” In fact, whether an explanation is essentialist is completely orthogonal to whether it is biological or cultural. An essence is a set of traits that all group members have. Murray and Damore take themselves to offer non-essentialist biological explanations: They are merely discussing statistical trends; all traits show up in all populations, just at varying frequencies. And there exist essentialist non-biological explanations as well. Trans-exclusionary radical feminists often say that male-to-female transitioners lack a uniquely and universally female experience of growing up under oppressive patriarchy. Opponents of Rachel Dolezal-style transracialism argue in similar ways. The Advocate has even suggested that being gay might be a matter of ideology rather than sexual preference. Indeed, the postcolonial theorist Gayatri Spivak coined the term strategic essentialism to describe the ways in which oppressed groups posit essences for themselves in order to mobilize politically. Though Spivak herself has apparently gone sour on the practice, it remains common in “empowerment”-style identity politics.

One objection I can anticipate: It’s not the ethics of these writings per se that’s the problem, nor what they may or may not say about the personal racism or sexism of their authors, but how they’ll be used to advance or hold back one or another political agenda. One important consideration here is that the thinking behind anti-racist and anti-sexist social interventions may depend on causal etiology in a way that personal ethical concerns might not. To believe that it is a biological fact that women are worse coders than men might mean, in part, to believe that programs to promote spatial, mathematical, and technical skills among younger girls are doomed to fail in the long term. And if that’s the case, funding for them should probably be cut.

By contrast, imagine we believe that there is an IQ gap between white people and oppressed minorities, that that gap reflects a real difference in intellectual ability, and that the cause of that gap is a combination of socioeconomic factors like food insecurity, the cognitive stress of constant threat of interpersonal violence, and the self-fulfilling prophecies of white supremacist views of intelligence. If that’s the case, anti-racism would become only a more important political goal, and our view of programs to ameliorate the associated socioeconomic problems would become more and more positive. Thus the explanans — the cause posited by the theory — does seem to matter politically if not ethically.

This is a fair point, but I doubt its relevance. First, few commentators — and certainly neither Damore nor Murray — rule out environmental effects entirely, including those stemming in one way or another from discrimination. Second, even if the cause of a group difference is entirely environmental, potential interventions to improve conditions for the disadvantaged group still require empirical testing to determine their efficacy. Just as the treatment for a disease cannot always be deduced by mere logic from knowledge of its cause, so environmental interventions generally take causal etiology only as a starting point. The influence of chemicals like iodine and lead on IQ, for instance, is not something we can determine a priori from the armchair, and the relation of the unequal distribution of those chemicals to broader socioeconomic forces is complex. In other words, everyone ought to agree that it is both worthwhile and necessary to test a wide variety of intuitively viable interventions, and to implement the ones that stand up to scrutiny, insofar as implementation won’t impinge on other values.

Proof versus explanation; theoretical virtue

Discussions of the actual science of group differences can be very sophisticated, as there is an enormous amount of existing data both about the sorts of traits that might differ and about the sorts of things that might cause those differences. However, it’s important for this sophistication to be grounded in an understanding of the precise nature of the dialectical process being undertaken in the various scientific projects that bear on the question. Otherwise it risks devolving into sophistry.

I can introduce my concerns here by distinguishing deductive inference from ampliative inference. Deductive inferences are logical proofs. Socrates is a man; men are mortal; so Socrates is mortal. Nobody at the mall has a hat; I am at the mall; so I don’t have a hat. In a deductive proof, the truth of the premises is sufficient for the truth of the conclusion; the truth of the conclusion is necessary for the truth of the premises. But in science we never get deductive inferences. It seems it’s always been the case that E = mc², but we cannot prove that it will remain so tomorrow. Bertrand Russell wrote: “The man who has fed the chicken every day throughout its life at last wrings its neck instead, showing that more refined views as to the uniformity of nature would have been useful to the chicken.” This is called the problem of induction and, like many vexing skeptical problems, it was most robustly expressed by David Hume. In ampliative inferences, as opposed to deductive ones, some consideration amplifies the premises to produce the conclusion. Induction and abduction (inference to the best explanation) are the classic examples of ampliative inference (and are probably ultimately the same thing). What’s important for us is to remember that from a deductive perspective, scientific reasoning will always be strictly fallacious. This means that when participants in a scientific debate are accused of committing fallacies, we cannot immediately judge that from a scientific perspective, their reasoning is off. Rendering that judgment requires an understanding of just what sort of argument they are making and, if it is an ampliative one, of what they take to be the theoretical virtues of their explanation.

So take Damore’s memo and the past few decades of controversy over sex differences more generally. It is not enough to say that there exist non-biological explanations for the data Damore lays out. One must show that these explanations are better than the biological ones, or else one must corroborate them experimentally. There is an apocryphal story about Pierre-Simon Laplace, a French mathematician and theoretical physicist: when Napoleon asked him why he had not mentioned God in his masterwork Celestial Mechanics, Laplace responded, “I had no need of that hypothesis.” This demonstrates the key theoretical virtue of parsimony, also called simplicity or elegance: the best theories use only what they need. The alternative explanation, that God is “up there” somehow willing the stars and planets to careen through space, is not disproven, but scientific integrity requires its abandonment, at least until such a time as a case can be made for its virtues.

Damore’s opponents say, for instance, that cultural expectations can make their way into the womb through the actions of the mother and maybe even other caretakers. If this is the case, then studies of neonates don’t actually pit a biological hypothesis against an environmental one. However, the mechanism of this transmission is rarely specified, and clarity of mechanism is another theoretical virtue. How would a fetus — with no concept of gender — develop a robust sense of its gender-based “role”? Small differences in treatment are postulated. But why would they be causally relevant?

Indeed, this problem affects environmental cases more generally. Much of these debates concerns burden-shifting: arguments to the effect that one or another side has the burden of proof when it comes to empirical corroboration and explication of explanatory virtue. The environmental side tends to assume that while hypothetical biological mechanisms need to be spelled out and tested in detail, hypothetical environmental mechanisms have some intuitive purchase. So, for instance, the idea that different tiny “cues” build up over boys’ and girls’ childhoods, and that these cues are sufficient to produce sex and gender differences in interests, in behavior, and maybe even in cognition, is taken to be mechanistically clear, and in fact the default. But as far as I can tell it isn’t clear at all. Feminist philosopher Elizabeth Barnes has likened the process of social construction to (if memory serves) a wheel set against some sort of solid surface, in which the wheel has created grooves; every time the social pattern is adhered to, it’s also reiterated, turning the wheel again and widening the grooves. But this sort of thing is just a metaphor. To be theoretically virtuous, such notions must be operationalized, must yield empirical predictions, and must find resonances with more fundamental underlying explanations.

This brings up a crucially important distinction between the hereditarian case in the race differences debate and the biological side in the sex differences debate. Biological theories of sex differences draw on evolutionary hypotheses which seek to explain patterns seen all throughout the animal kingdom. Queen bees, peacocks’ tails, and rhesus monkeys’ reactions to toys may all fall under the same theory. Extending such a theory to humans is obviously a complex undertaking, but unless we do so, we deprive the theory of the generality that makes it truly powerful. And anyway, it is never said why humans would be an exception to such a theory. But there is no similarly unifying theory in the racial hereditarian case — at least not one that makes robust and applicable predictions.

There is one rather powerful objection the environmental theorists can make here. As noted above, few biological theorists entirely rule out the possibility that group differences stem at least partially from environmental factors. But if that’s the case, biological theories seem to incorporate at least some of the theoretical vices of environmental theories, such as their frequent obscurity with regard to mechanism. On the other hand, by being exclusive of biological factors, environmental theories seem to fare better in terms of parsimony. (Biological theorists often assume the reverse is true! — since biology is more fundamental than culture.) To overcome this, it is necessary for biological theorists of group differences to make clearer which environmental hypotheses their theories integrate.

Default hypotheses and “real science”

Lurking behind environmental hypotheses one can sometimes find a deeper skepticism about the scientific tractability of questions of group differences. The phrase “real science” sometimes comes up in these discussions. The most extended treatment of such skeptical hypotheses can be found in Eric Turkheimer’s work — throughout his career, but most recently in blog posts like “Everything is Heritable” and “Origin of Race Differences in Intelligence is Not a Scientific Question.” Turkheimer, who is a psychologist at the University of Virginia, argues, roughly, that a trait’s heritability doesn’t tell us much about whether that trait is genetic; that the only thing that could do so is focused experimentation, with control groups and so forth, and we haven’t done that (and shouldn’t). These ideas are more serious than his interlocutors (including myself) have sometimes given them credit for, and deserve an extended treatment; but for our purposes here what’s important is that they apply equally to biological and environmental hypotheses. Moreover, Turkheimer seems to think that since it’s not a “scientific” question, it can be settled on an ethical or political basis. But the first section of this article should suggest otherwise.

A different kind of default has been proposed by Richard Haier, a psychologist known for his work on the neural basis of human intelligence, not as a belief but as “a hypothesis to be tested”: “whatever the factors are that influence individual differences in IQ, the same factors would influence average group differences.” In Haier’s eyes, “[s]ince there is overwhelming evidence that genes influence the former, it would not be unreasonable to hypothesize that genes at least partially influence group differences.” But there may be a difference between a reasonable hypothesis and a reasonable belief. What is ridiculous as a belief could be perfectly good as a hypothesis if it exhibits enough of the explanatory virtues: if it’s simple enough, and especially if it’s testable in a straightforward way. Debate, then, should revolve around the question of testability.


Philosophers of science have not done a good job of addressing the real conceptual issues at stake in scientific debates about group differences. Where they have intervened, their contributions have often done more to obscure than to clarify, raising issues of “fallacious” reasoning that are inappropriate within the actual theoretical context of these debates. The foregoing constitutes only a very modest attempt, by a relative layperson, at rectifying that inadequacy. What is lacking among philosophers is, I fear, not the expertise or the cleverness or even the impartiality necessary to dart among scientific theories and to understand what sorts of claims they’re making and what sort of evidence and background assumptions they rest upon, but the courage required to address all sides critically and to entertain — out loud, or in print — the notion that comforting social science frameworks may turn out to be false, regardless of the virtues or vices of the reasoning employed by their biological antagonists. This is an area where philosophers have often seen fit not to enhance and illuminate the scientific discourse, but rather to misrepresent it and to moralize about its potential repercussions. That practice is deeply anti-intellectual, and the academy as a whole can do much better.

  1. I actually disagree here and I think people who are skeptics need to be less disingenuous about these conversations.

    The reason these events get major coverage is that at the core they are not merely about ethics, but also about policy. Murray has never shied away from the fact that he is against affirmative action, since it doesn’t work. That theme makes up most of his work, even in the new book about the decline of White America. Damore also was trying to argue about Google’s policy towards diversity.

    The problem with these arguments, especially when they come from academics or scientists, is that they try to hide behind the glory of science, and act like they are somewhat detached people engaged in the politics of the country, and thus when they do get attacked and smeared, as is typical of politics, they act as though they are shocked at what is happening.

    That said, this has always been the major flaw of all the social sciences: they usually attract people, whether they be on the right or the left, who have technocratic tendencies.

    1. Jevioso-

      I am afraid you are engaged in mindreading. How do you know that Damore and Murray’s policy positions were not *derived from* their research and the data, instead of the other way around?

      And frankly, even *if* someone came to research with an a priori framework that perceived affirmative action to be bad policy, couldn’t that person *also* look at the evidence in a relatively unbiased manner, just as they could engage the evidence in a relatively biased manner?

      Unfortunately, these flawed and uncharitable assumptions are regularly deployed.

  2. The “default hypothesis” was first proposed by the late UC Berkeley Professor Arthur Jensen. It may be more accurate to say that Haier has re-proposed Jensen’s default hypothesis.
