Last week, a journalist asked me a question in connection with the recent flurry of stories on changes in childhood obesity percentages in the period from 2001 to 2012. When I looked into it, what struck me was that a category defined as "BMI at or above the 95th percentile" applied to about 15-17% of the population throughout the period discussed.
This sounds like a statistical approximation to Garrison Keillor's joke about his home town, where "all of the children are above average". But the normative percentiles are based on data from an earlier time, and so it's perfectly logical that 17.1% of the age-2-19 sample in the 2003-2004 period should be at or above the 95th percentile for the 1963-1994 period. This is just a symptom, after all, of the famous "obesity epidemic".
Still, I remained curious about just when this large change really took place. (Most of) the raw data is available on line from the CDC, and I decided to spend an hour or so satisfying my curiosity about what is going on here: has there actually been a gradual climb over 50 years, which looks steep when a threshold derived from 1963-1994 is used for data from 2003 to the present? Or was there a steeper climb over a narrower stretch of time?
I found a clear answer to this question. But when I looked into it further, I found some additional information that made me wonder whether there has really been any change over time at all.
Here's the punch line, in graphical form, with the data from each of 11 CDC surveys plotted against the temporal mid-point of the survey years (so that a survey spanning 1988-1994 is counted as taking place in 1991):
On the face of it, the answer seems to be that there was one regime in the 1963-1980 time period, and a different regime in the 1999-2009 time period, with a transitional value around 1991 (representing the 1988-1994 survey). However, we'll see that the two stable regimes correspond to two quite different approaches to sampling the population, with the transitional value coming from a survey with a transitional sampling procedure.
Before we speculate further about this story, let's take the time to see where the numbers came from.
The values in the graph are based on "body mass index" (BMI) measurements in 11 national surveys carried out by the Centers for Disease Control (CDC). BMI is defined as "weight in kilograms divided by height in meters squared". In each case, the value plotted on the vertical axis is the proportion of the sample aged from 24 to 240 months, inclusive, with BMI above a threshold determined by the CDC to represent "obesity" for the individual's sex and age. This threshold was calculated as the 95th percentile (for sex and age) in an elaborately smoothed model of the distribution of BMI values in a set of samples collected between 1963 and 1994.
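As a minimal sketch of the quantity being measured, the BMI definition quoted above can be written out in Python. The function name and the example weight and height are hypothetical, chosen only for illustration:

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


# Hypothetical child (not taken from any survey): 40 kg and 1.40 m tall.
example_bmi = bmi(40.0, 1.40)
print(round(example_bmi, 1))
```

Whether a value like this counts as "obese" depends entirely on the sex- and age-specific threshold it is compared against.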
The first five values in my graph come from the surveys that were used in developing the normative quantiles: the Health Examination Surveys cycles II and III, and the National Health and Nutrition Examination Surveys I through III:
The other six values come from NHANES surveys run on a two-year cycle since 1999-2000. (The publicly-available data from the 2011-2012 and 2013-2014 surveys still mostly lack the relevant data fields, so my graph stops with 2009-2010).
The procedure for deriving the normative percentiles is documented in R.J. Kuczmarski et al., "2000 CDC Growth Charts for the United States: Methods and Development", Vital and health statistics. Series 11 (246), Data from the national health survey, 2002. The numerical outcomes are listed in a .csv file linked here, which yields this table of 95th-percentile BMI-by-age numbers for males and females between 2 years (24.5 months) and 20 years (240.5 months) old. In graphical form:
In order to calculate the plot of changes over the period 1963-2009, I downloaded the raw data files for the eleven surveys, and determined what proportion of the relevant subjects in each survey had a BMI measurement above the 95th-percentile threshold specified by those CDC 2000 norms.
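The calculation just described can be sketched roughly as follows. This is an illustration under stated assumptions, not the script actually used: the threshold values in the table are invented, the names P95, threshold and proportion_above are hypothetical, and mapping a whole-month age m to the table row for m + 0.5 months is my guess at how the half-month rows would be matched.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical excerpt of a 95th-percentile table, keyed by (sex, age in months
# at the half-month point). These BMI values are invented for illustration;
# the real ones come from the CDC 2000 norms.
P95: Dict[Tuple[str, float], float] = {
    ("M", 24.5): 19.0,
    ("M", 25.5): 18.9,
    ("F", 24.5): 18.8,
    ("F", 25.5): 18.7,
}


def threshold(sex: str, age_months: int) -> float:
    """Look up the cutoff, assuming whole-month age m uses the row for m + 0.5."""
    return P95[(sex, age_months + 0.5)]


def proportion_above(subjects: Iterable[Tuple[str, int, float]]) -> float:
    """Fraction of (sex, age_months, bmi) records aged 24-240 months above the cutoff."""
    eligible = [(s, a, b) for (s, a, b) in subjects if 24 <= a <= 240]
    if not eligible:
        raise ValueError("no eligible subjects")
    # Strictly "above" the threshold; use >= for an "at or above" definition.
    above = sum(1 for (s, a, b) in eligible if b > threshold(s, a))
    return above / len(eligible)


# Invented records; the 12-month-old falls outside the age range and is dropped.
sample = [
    ("M", 24, 19.5),
    ("M", 25, 17.0),
    ("F", 24, 18.0),
    ("F", 25, 20.1),
    ("F", 12, 30.0),
]
print(proportion_above(sample))
```

In this toy sample two of the four eligible records exceed their cutoffs, so the proportion is one half.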
The HES2, HES3, NHANES1, NHANES2, and NHANES3 numbers are all available in one big file growthch.xpt, documented here. The numbers for NHANES 1999-2000 through 2009-2010 are in individual files accessible via the links starting from this point. For each of these later six surveys, it's necessary to download and combine two .xpt files: an "examination" file which gives measurements including BMI, and a "demographics" file which gives information including age in months. The two tables are (partly) tied together by a "Respondent Sequence Number" SEQN.
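Combining the two tables amounts to a join on SEQN. The sketch below uses plain Python dictionaries with invented records; in practice the .xpt (SAS transport) files would first have to be read into rows, for instance with something like pandas.read_sas. The field names BMXBMI (BMI), RIDAGEMN (age in months) and RIAGENDR (sex) are my assumption about what the NHANES files call these fields, and the function name is hypothetical.

```python
from typing import Dict, List


def join_on_seqn(exam_rows: List[dict], demo_rows: List[dict]) -> List[dict]:
    """Inner join of examination and demographics records on SEQN.

    Respondents missing from either table, or lacking a BMI value,
    are dropped, since the two tables only partly overlap.
    """
    demo_by_seqn: Dict[int, dict] = {row["SEQN"]: row for row in demo_rows}
    merged = []
    for exam in exam_rows:
        demo = demo_by_seqn.get(exam["SEQN"])
        if demo is None or exam.get("BMXBMI") is None:
            continue
        merged.append({**demo, **exam})
    return merged


# Invented records, for illustration only.
exam = [
    {"SEQN": 1, "BMXBMI": 17.2},
    {"SEQN": 2, "BMXBMI": None},   # examined, but no BMI recorded
    {"SEQN": 4, "BMXBMI": 21.0},   # no matching demographics row
]
demo = [
    {"SEQN": 1, "RIDAGEMN": 30, "RIAGENDR": 1},
    {"SEQN": 2, "RIDAGEMN": 100, "RIAGENDR": 2},
    {"SEQN": 3, "RIDAGEMN": 200, "RIAGENDR": 1},  # interviewed, not examined
]
merged = join_on_seqn(exam, demo)
print(merged)
```

Only respondent 1 survives the join here, carrying both the age and sex fields and the BMI measurement.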
So let's look at the results again:
Either there was a big change in the high end of the BMI distribution over the 1980-2000 period, or there was a big change in the CDC's measurement or sampling procedures.
A big enough change in measurement procedures seems unlikely. At least from the 1980s onward, all (?) measurements were carried out in a "mobile examination center" (MEC), using an elaborately prescribed procedure that includes regular calibration of the apparatus. I believe that the earlier studies were carried out in a similar way, and while the apparatus and procedures evolved, I don't see any reason to think that there could have been a big enough change to triple the proportion of the population above the specified 95th-percentile thresholds.
But the sampling issue is different. In the early 1980s, it was recognized that "comparable data were not available for many of the ethnic groups within the United States", and therefore a "Hispanic HANES" was carried out in 1982-84. Its results confirmed the view that "the health status of minority groups is often different than the health status and characteristics of nonminority groups, so black Americans and Mexican Americans were selected in large proportions for NHANES III. Each group comprised 30 percent of the sample."
In 1999, the CDC began a continuously-running study, with new sampling procedures: in each two-year-cycle, "Approximately 40,000 individuals of all ages in households across the U.S. will be randomly selected to participate in the survey. The study respondents include whites as well as an oversample of blacks and Mexican-Americans. The study design also includes a representative sample of these groups by age, sex, and income level."
So NHANES III began shifting the sampling procedure towards an over-sample of previously under-sampled groups; and the subsequent surveys, begun on a regular basis in 1999-2000, have used a consistent sampling procedure that is different from NHANES III as well as from the earlier NHES and NHANES I-II surveys.
With those facts in mind, let's take a look again at my graph of the results over time, with the transitional NHANES III clearly marked:
So maybe the change over time is really a change over sampling procedures — and the increase from 5% to 17% in the proportion of the population above a normative 95th-percentile BMI-for-age measure is likewise a sampling change, not a population change.
The NHANES surveys include demographic information that can be used to re-weight the results. I didn't do any re-weighting, or pay any attention to the sampling-weight field, but I don't believe that this affects the conclusions in a material way. As a simple check, note that Table 6 from C.L. Ogden et al., "Prevalence of Childhood and Adult Obesity in the United States, 2011-2012", JAMA 2014, gives the 2003-2004 obesity percentage for the age range 2-19 as 17.1%; my calculation from the unweighted raw data, for the age range 2-20, is 16.8% for females and 17.3% for males.
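For readers who want to see what re-weighting would involve, here is a minimal sketch of the two kinds of proportion. The function names, the records and the weights are all invented for illustration; they say nothing about the actual surveys, and real survey estimation involves more than this simple weighted average.

```python
from typing import List, Tuple

# Each record is (is_above_threshold, sampling_weight).
Record = Tuple[bool, float]


def unweighted_proportion(records: List[Record]) -> float:
    """Share of records above the threshold, counting each record once."""
    return sum(1 for above, _ in records if above) / len(records)


def weighted_proportion(records: List[Record]) -> float:
    """Share of total sampling weight carried by records above the threshold."""
    total = sum(weight for _, weight in records)
    return sum(weight for above, weight in records if above) / total


# Invented data: records with small weights stand in for an oversampled group.
records = [(True, 1.0), (True, 1.0), (False, 2.0), (False, 4.0)]
print(unweighted_proportion(records))
print(weighted_proportion(records))
```

With these made-up numbers the unweighted proportion is 0.5 and the weighted one is 0.25. How far apart the two figures are for a real survey depends on its actual weights.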
None of this changes the fact that the population would be healthier if people had lower BMIs. And it's plausible that there have been population-level changes in obesity rates, though maybe on a longer time scale. But apparently what's going on in the CDC data, at least among people aged 2-20, is less of an "obesity epidemic" than an "inclusion epidemic".
My quotes about the sampling procedures come from The National Health and Nutrition Examination Survey's "Physician Examination Procedures Manual" from 2003 — documenting what was released as the 1999-2000 and 2001-2002 datasets — which tells the overall history this way (emphasis added):
This NHANES is the eighth in a series of national examination studies conducted in the United States since 1960.
The National Health Survey Act, passed in 1956, gave the legislative authorization for a continuing survey to provide current statistical data on the amount, distribution, and effects of illness and disability in the United States. In order to fulfill the purposes of this act, it was recognized that data collection would involve at least three sources: (1) the people themselves by direct interview; (2) clinical tests, measurements, and physical examinations on sample persons; and (3) places where persons received medical care such as hospitals, clinics, and doctors’ offices.
To comply with the 1956 act, between 1960 and 1984, the National Center for Health Statistics (NCHS), a branch of the U.S. Public Health Service in the U.S. Department of Health and Human Services, has conducted seven separate examination surveys to collect interview and physical examination data.
The first three national health examination surveys were conducted in the 1960s:
1. 1960-62 – National Health Examination Survey I (NHES I)
2. 1963-65 – National Health Examination Survey II (NHES II)
3. 1966-70 – National Health Examination Survey III (NHES III)
NHES I focused on selected chronic disease of adults aged 18-79. NHES II and NHES III focused on the growth and development of children. The NHES II sample included children aged 6-11, while NHES III focused on youths aged 12-17. All three surveys had an approximate sample size of 7,500 individuals.
Beginning in 1970 a new emphasis was introduced. The study of nutrition and its relationship to health status had become increasingly important as researchers began to discover links between dietary habits and disease. In response to this concern, under a directive from the Secretary of the Department of Health, Education and Welfare, the National Nutrition Surveillance System was instituted by NCHS. The purpose of this system was to measure the nutritional status of the U.S. population and monitor nutritional changes over time. A special task force recommended that a continuing surveillance system include clinical observation and professional assessment as well as the recording of dietary intake patterns. Thus, the National Nutrition Surveillance System was combined with the National Health Examination Survey to form the National Health and Nutrition Examination Survey (NHANES). Four surveys of this type have been conducted since 1970:
1. 1971-75 – National Health and Nutrition Examination Survey I (NHANES I)
2. 1976-80 – National Health and Nutrition Examination Survey II (NHANES II)
3. 1982-84 – Hispanic Health and Nutrition Examination Survey (HHANES)
4. 1988-94 – National Health and Nutrition Examination Survey (NHANES III)
NHANES I, the first cycle of the NHANES studies, was conducted between 1971 and 1975. This survey was based on a national sample of about 28,000 persons between the ages of 1-74. Extensive data on health and nutrition were collected by interview, physical examination, and a battery of clinical measurements and tests from all members of the sample.
NHANES II began in 1976 with the goal of interviewing and examining 28,000 persons between the ages of 6 months to 74 years. This survey was completed in 1980. To establish a baseline for assessing changes over time, data collection for NHANES II was made comparable to NHANES I. This means that in both surveys many of the same measurements were taken in the same way, on the same age segment of the U.S. population.
While the NHANES I and NHANES II studies provided extensive information about the health and nutritional status of the general U.S. population, comparable data were not available for many of the ethnic groups within the United States. Hispanic HANES (HHANES), conducted from 1982 to 1984, produced estimates of health and nutritional status for the three largest Hispanic subgroups in the United States—Mexican Americans, Cuban Americans, and Puerto Ricans—that were comparable to the estimates available for the general population. HHANES was similar in design to the previous HANES studies, interviewing and examining about 16,000 people in various regions across the country with large Hispanic populations.
NHANES III, conducted between 1988 and 1994, included about 40,000 people selected from households in 81 counties across the United States. As previously mentioned, the health status of minority groups is often different than the health status and characteristics of nonminority groups, so black Americans and Mexican Americans were selected in large proportions for NHANES III. Each group comprised 30 percent of the sample. NHANES III was the first survey to include infants as young as 2 months of age and to include adults with no upper age limit. To obtain generalizeable estimates, infants and young children (1-5 years) and older persons (60+ years) were sampled at a higher rate than previously. NHANES III also placed an additional emphasis on the effects of the environment upon health. Data were gathered to measure levels of pesticide exposure, presence of certain trace elements in the blood, and amounts of carbon monoxide present in the blood. A home examination was incorporated for those persons who were unable or unwilling to come to the exam center but would agree to an abbreviated examination in their homes.
This NHANES follows in the tradition of past NHANES surveys, continuing to be a keystone in providing critical information on the health and nutritional status of the U.S. population.
The major difference between the current NHANES and previous surveys is that the current NHANES is conducted as a continuous, annual survey. Each single year and any combination of consecutive years of data collection comprises a nationally representative sample of the U.S. population. This new design allows annual statistical estimates for broad groups and specific race-ethnicity groups as well as flexibility in the content of the questionnaires and exam components. New technologic innovations in computer-assisted interviewing and data processing result in rapid and accurate data collection, data processing, and publication of results.
The number of people examined in a 12-month period will be about the same as in previous NHANES, about 5,000 a year from 15 different locations across the nation. The data from the NHANES are used by government agencies, state and community organizations, private researchers, consumer groups, companies, and health care providers.
Data collected on the current NHANES survey began early in 1999 and will continue for approximately 6 years at 88 locations (stands) across the United States. The survey was preceded by a pretest in the spring of 1998 and a dress rehearsal was conducted in early 1999. Approximately 40,000 individuals of all ages in households across the U.S. will be randomly selected to participate in the survey. The study respondents include whites as well as an oversample of blacks and Mexican-Americans. The study design also includes a representative sample of these groups by age, sex, and income level. Adolescents, older people, and pregnant women are also oversampled in the current NHANES.