So, what is it that you do? Part One.

It’s dense, y’all.  So here’s the first dose.

It’s about race and health in public health research.

The U.S. is a multi-racial, multi-ethnic society, so we use race as a variable in all of our research.  We do this partly because racial differences persist in virtually every area of health interest, and partly because of convention – we publish statistics stratified by race, we control for race in research models, and we exclude individuals from analysis on the basis of race.  What we (‘we,’ meaning me and my fellow health researchers… if I might take that presumptuous leap of status) don’t do is stop to question whether race is really an appropriate construct – what it means, what it really differentiates, and what it ultimately suggests.

This is really important because the use of race in public health research is very problematic.  The idea is that using race categories controls for some sort of undisclosed differences in population genetics… or in fancier talk, the epidemiologic assumption is that there is a genotypic difference that is being controlled for.  But in reality, researchers aren’t in the practice of, say, taking gene frequency measures in their participants.  And more to the point: they aren’t even in the practice of defining the criteria for assigning a person to one racial category or another.

Well, if you’re still with me, you might be asking about the standard.  Because, surely, our medical researchers have come up with some hard and fast rule about the biologic concept of race in medicine.


And as much as population geneticists will jump up and down screaming about things like ‘continental racial categories’ and the higher incidence of genetically-related disease in certain groups (say, sickle cell) – the bottom line?  All our genome work has us coming back again and again to say that genetically, we’re all pretty much the same.

Richard Cooper (an MD and Epidemiologist at Loyola Med School in Chicago) is sort of the Master and Commander of this discourse, and I’d be foolish to try to restate what he says so darn clearly:

Racial differences reflect different social environments, not different genes, even where two groups live side by side, as do blacks and whites in the United States.  Race does not mark in any important way for genetic traits; rather, it demonstrates beyond question the paramount role of the social causes.  We have much more to learn from that paradigm, rather than the one offered by ethnogenetics.

In short, when we’re studying race, we’re really not studying genotypic differences – we’re studying phenotypic differences (i.e., the differences that result from our environments, not our genetics).

Okay then, but public health uses race all the time and finds all sorts of interesting results.  What does all that mean??

For one, it means that the results might be screwy.  The majority of public health research is statistical: a model full of complex and overwhelming Greek letters spells out a variety of things (the independent variables) that predict what happens to an outcome (the dependent variable).  Race is most often used as a dummy, or binary, variable – meaning that you are either black or white – so the lack of conceptual clarity about what in the world each of those categories means leaves a great deal of room for error… if you aren’t controlling for something very clearly within your model, it means that your variable is open to error.  It could be measuring the effects of other things in your model, including things in the error term.  This means it could be “endogenous,” which, in public health research, is a Really. Bad. Thing.  Suggesting that using race as a binary variable presents a problem of endogeneity to statistical models is sort of like saying that that ‘vegetarian’ gravy your Mom has been feeding you for all your 20 years of vegetarianism is actually made from 6 different animals.  It ruins everything you’ve ever done with it and colors your ability to use it in the future.  It’s better to just not know.  Or to ignore the reality.  Or!  To reinvent it!

Like, for example, saying that race doesn’t really mean what we think it means.  Let’s get real, you say, we know that race is all messy!  So when we’re talking about race disparities in health, we’re actually measuring other things… you know, like socioeconomic status, discrimination, cultural factors, stuff like this that we know have a racial component.

That’s all fine and good, I answer, but public health models shouldn’t rely on a proxy for anything not clearly defined.  That’s not good science.  It’s more logical to argue that if race is a proxy for other factors, then we need to find better ways of measuring those other factors.  If we’re going to intervene effectively, we need to clearly understand what is going on.

Let me give an example.  Let’s say that you are a health researcher and you’re studying prenatal care utilization.  You’ve got a great regression model controlling for a variety of factors and your results show a statistically significant coefficient for the race binary variable (that is, the mean number of visits is higher for whites than for blacks, even when you’re controlling for things like income, age, insurance status, etc.).  You might fall into the trap of reporting (as is embarrassingly common in published research) that “race is a significant determinant of prenatal care utilization.”  Think about that for a minute.  The color of one’s skin has nothing to do with how many times someone sees the doctor.  How the world around someone reacts to them due to the color of their skin (or other individual factors) may very well impact how many times they attend a prenatal visit… but that is not what the model is measuring, nor what the data is suggesting!
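If it helps to see this with numbers, here is a rough sketch of the prenatal care example using entirely made-up data.  Everything in it is an assumption for illustration: the variable names, the effect sizes, and especially `barrier`, a hypothetical stand-in for something the world does to people (say, how a clinic treats them) that the researcher never measured.  By construction, the group variable has no direct effect on visits at all.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination. X is a list of rows."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Clear this column in every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def simulate(n=5000, seed=0):
    """Synthetic people: (group, income, barrier, visits). All values invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = 1 if rng.random() < 0.5 else 0      # the binary "race" dummy
        income = rng.gauss(50, 10)                  # a measured control
        # Hypothetical unmeasured barrier, more common for group 1 by assumption.
        barrier = 1 if rng.random() < (0.7 if group else 0.2) else 0
        # True process: visits depend on income and the barrier, NOT on group.
        visits = 10 + 0.05 * income - 3 * barrier + rng.gauss(0, 1)
        rows.append((group, income, barrier, visits))
    return rows

rows = simulate()
y = [r[3] for r in rows]

# Model 1: the usual setup, with the barrier left in the error term.
naive = ols([[1, r[0], r[1]] for r in rows], y)
# Model 2: same data, but the barrier is actually measured and included.
full = ols([[1, r[0], r[1], r[2]] for r in rows], y)

print("group coefficient, barrier omitted: ", round(naive[1], 2))
print("group coefficient, barrier measured:", round(full[1], 2))
```

In the first model the group coefficient should land somewhere around -1.5, because the dummy is soaking up the barrier's effect (-3 visits times the 0.5 gap in how often each group hits the barrier).  In the second it should sit near zero.  Same people, same visits; the only thing that changed is whether we measured what was actually going on.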

Further, if you go along that route, you may filter that finding down to medical and public health practice.  It may be unintentional or even unrealized, but your intervention could be focused on race, trying to address whatever it is about being black that means you go to the doctor less.  You may not even think to see what is going on with the doctor, or the clinic, or the system because you’re so focused on intervening on that race factor… and you’d be missing the point.

Public health science needs better conceptual precision about the measurement of race, period.  At the very least, the lesson here is that we need to be clear on what we’re measuring and how we’re interpreting it.