
Methodological Challenges in Estimating the Lifetime Medical Care Cost Externality of Obesity

Published online by Cambridge University Press: 27 July 2021

Robert C. Schell*
Affiliation:
School of Public Health, University of California at Berkeley, 2121 Berkeley Way 5302, Berkeley, CA 94720, USA
David R. Just
Affiliation:
Charles H. Dyson School of Applied Economics and Management, Cornell University, 137 Reservoir Ave, Ithaca, NY 14850, USA
David A. Levitsky
Affiliation:
College of Human Ecology, Cornell University, Martha Van Rensselaer Hall, Ithaca, NY 14850, USA

Abstract

There is a great deal of variability in estimates of the lifetime medical care cost externality of obesity, partly due to a lack of transparency in the methodology behind these cost models. Several important factors must be considered in producing the best possible estimate, including age-related weight gain, differential life expectancy, identifiability, and cost model selection. In particular, age-related weight gain represents an important new component of recent cost estimates. Without accounting for age-related weight gain, a study relies on the untenable assumption that people remain the same weight throughout their lives, leading to a fundamental misunderstanding of the evolution and development of the obesity crisis. This study seeks to inform future researchers on the best methods and data available both to estimate age-related weight gain and to accurately and consistently estimate obesity’s lifetime external medical care costs. This should help both to create a more standardized approach to cost estimation and to encourage more transparency among all parties interested in the question of obesity’s lifetime cost and, ultimately, in evaluating the benefits and costs of interventions targeting obesity at various points in the life course.

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of the Society for Benefit-Cost Analysis

1. Introduction

As the cost and prevalence of obesity continue to soar, it has become more important than ever to produce accurate estimates of the lifetime medical care cost externality. While an abundance of estimates can be found in the literature, the vast majority assume a person remains at the same body mass index (BMI) for their entire life (Thompson et al., 1999; Yang & Hall, 2007; Finkelstein et al., 2008). This assumption is contrary to fact. Recent work has found that age-related weight gain, the tendency for a person’s BMI to increase significantly as they age, accounts for the majority of costs associated with obesity (Fallah-Fini et al., 2017; Schell et al., 2020). Numerous other factors explain the divergence of cost estimates between studies, including the selection of an appropriate cost model and accounting for differential life expectancy, and addressing them is essential to producing accurate and consistent estimates. Additionally, while all of the current lifetime cost studies are associational, recent developments in the use of instrumental variables in obesity research could allow future models to produce credibly identified causal estimates.

For the purpose of cost-benefit analysis, it is important to consider the factors necessary for policymakers to judge the merits of any anti-obesity intervention and to incorporate the best available information. Specifically, a policy-relevant estimate should (i) cover obesity’s costs over the life course, (ii) focus on third-party costs (the externality imposed on others), and (iii) account for changes in BMI over time. Each of these points is developed in the following section.

While most studies on the costs associated with obesity are cross-sectional, policymakers deciding whether a specific obesity intervention provides sufficient benefit at an acceptable cost need to understand the full scope of obesity’s costs over the course of a person’s life. Many of obesity’s sequelae are latent, with higher medical care costs sometimes not appearing for decades (Must & Strauss, 1999). For this reason, an estimate of obesity’s costs at one particular time point provides little insight regarding its true burden, which evolves over a lifetime. Often these costs are lumpy and concentrated around the end of life. Despite the data limitations discussed later in this paper, researchers should focus on producing cost estimates over the longest time possible in order to quantify the complete monetary benefit of interventions.

People with obesity bear only 15% of their own medical costs, with public and private insurers covering the rest (Wang et al., 2015). Whether this represents a genuine externality on private insurance is debated, with some researchers arguing that people with obesity pay the price for their condition in the form of higher premiums and lower wages (Bhattacharya & Sood, 2011). However, the impact of obesity on public insurance, which does not allow for differences in premiums based on medical risk and is not tied to employment, remains a substantial externality. Obesity has an immense impact on public insurance, with the costs associated with obesity representing at least 9.5% of total Medicare expenditures (Finkelstein et al., 2003; Bhattacharya & Sood, 2011). Because of the substantial burden of obesity’s medical costs on society through public and private insurance, policymakers and researchers should emphasize the third-party, or external, cost of obesity when weighing the costs and benefits of anti-obesity interventions.

Age-related weight gain is a persistent phenomenon in the USA, where people tend to gain weight with a concomitant rise in adiposity as they age (Williams & Wood, 2006). Despite this fact, relatively few lifetime cost estimates of obesity factor age-related weight gain into their models (Finkelstein et al., 2008; Fallah-Fini et al., 2017; Schell et al., 2020). To understand the sheer complexity of modeling BMI development over time, it is instructive to think about what causes a person to develop obesity in their lifetime.

Obesity occurs as the result of a gradual process during which daily caloric surpluses compound over time. The static model of weight gain, which assumes 3500 surplus calories result in a pound of weight gain, fails to reflect metabolic changes from weight loss or weight gain that create a nonlinear relationship between caloric surplus and weight gain over time (Hall, 2007). Feeding such minute and dynamic information into a model to predict whether an adolescent will become an adult with obesity is clearly infeasible. This issue is exacerbated by the fact that no dataset in the US covers a representative sample of Americans’ weight gain trajectories from birth to death, so we must use multiple datasets to monitor BMI state transitions. As a result, researchers must rely on simplifying assumptions to monitor BMI growth curve progression, and most have decided to use a Markov model (Tucker et al., 2006; Ma & Frick, 2011; Sonntag et al., 2015; Fallah-Fini et al., 2017). While Markov models have been widely used in the estimation of BMI growth curves, we found that most articles explain relatively little of their methodology or why the Markov model provides an ideal fit for the problem. This manuscript seeks to demystify the recent literature on the lifetime social costs of obesity by detailing the advantages and pitfalls of applying a Markov model to measure age-related weight gain, outlining possibilities for causal inference in future models, discussing methodological considerations in adopting the appropriate cost model, and finally demonstrating how to account for differences in life expectancy between people with obesity and people of normal weight. We conclude by discussing data requirements, how age-related weight gain affects cost estimates, and limits to existing estimates and data availability.

2. Literature review

Tucker et al. (2006) first accounted for age-related weight gain, relying on data from Burton et al. (1998) to project the cost from age 20 to death. This work used a multitude of sources to derive cost data, which forces heavy reliance on the background assumptions of these prior cost estimates. Nonetheless, Tucker et al. (2006) pioneered the use of a semi-Markov state-transition model by using simulated cohorts with a BMI range of 24–45 and ages 20–65, all while accounting for life expectancy discrepancies. They relied on Heo et al.’s (2003) estimates of age-related weight gain to control for its effect on cost. This curve was estimated using a hierarchical linear model to piece together a variety of older data sources to predict BMI by age and sex, creating a generic, though perhaps outdated, age-related weight gain curve (Heo et al., 2003).

Wang et al. (2010) selected a cohort aged 16–17 for whom to estimate lifetime cost. They relied on the life expectancy estimates of Finkelstein et al. (2008) and applied a two-part model (2PM), similar to a double hurdle model, using a logit model to describe the probability of falling into a BMI range and a generalized linear model (GLM) with a log link to estimate medical costs. They used only the 2000 Medical Expenditure Panel Survey (MEPS) data for costs after age 40 in the second-stage estimation. In estimating the weight gain curve, they employed the 1979 National Longitudinal Survey of Youth (NLSY79), and specifically focused on the older cohort in this survey. Thus, given the timing, it is questionable whether the weight gain trajectory faced by children today is represented well within their data. The methodology was simple, with only two age points where BMI was observed, and a basic linear regression was used to derive coefficient estimates. The most apparent limitations of this study are the age of the data, the use of only two age points to track BMI trajectories, and the restriction to costs after age 40 based on the assumption of constant BMI after age 40. Wang et al. (2010) provided the first age-related weight gain curve produced from a study’s own data and assumptions.

Fallah-Fini et al. (2017) produced a cost estimate complete with an age-related weight gain curve covering a simulated cohort from young adulthood to death. Their Markov model found that lifetime third-party costs for a person with obesity occur mostly later in life. Their age-related weight gain estimate relied on the Coronary Artery Risk Development in Young Adults (CARDIA) study to estimate weight gain below age 45, and the Atherosclerosis Risk in Communities (ARIC) study for weight gain above age 45. They applied a Markov model to measure both comorbidity state transition and BMI state transition, with 15 states covering each of the most popular BMI classifications and predominant comorbidities of obesity. Costs were derived both from the distribution of costs by type in MEPS and from published cost data. Unfortunately, the study only considers the comorbidity with the highest cost even if the subject has multiple comorbidities, which likely substantially understates true costs. Additionally, the study only accounts for four obesity-related outcomes and relies heavily on an assumption of disease independence. Using the most recent data available, they demonstrated that state transitions most significantly burdened third-party payers, perhaps providing a welfare economic justification for regulating externalities.

3. Theoretical exposition

3.1 Estimating the age-related weight gain curve

A dependent variable conditioned only on its own previous state exhibits the “Markov Property,” which makes observations before the one period prior irrelevant to estimation. A Markov process is “memoryless” in the sense that the only data used to create an estimate come from the previous state, which means the process can be represented by the first-order difference equation below (Hamilton, 1994; Lay et al., 2016):

(1) $$ {\varepsilon}_{t+1}={P}_{t+1}^{\varepsilon_t}{\varepsilon}_t+{v}_{t+1}, $$

where in the case of BMI, $ {\varepsilon}_{t+1} $ is BMI at age $ t+1 $, $ {\varepsilon}_t $ is BMI at age $ t $, and $ {v}_{t+1} $ is a random component, which should have a mean of zero and some finite variance (Hamilton, 1994). Here, $ {P}_{t+1}^{\varepsilon_t} $ is a coefficient to be estimated representing systematic weight gain between age $ t $ and $ t+1 $ of an individual who is in BMI class $ {\varepsilon}_t $ at age $ t $.

This modeling technique has a history in the study of BMI transition estimation (Tucker et al., 2006; Ma & Frick, 2011; Sonntag et al., 2015; Fallah-Fini et al., 2017). However, these studies generally do not discuss the Markov model’s functioning at length. Before exploring the model’s use over a lifetime, we provide a working example to show both why the Markov model suits the estimation of age-related weight gain and how it can be implemented (Fosler-Lussier, 1998). We will rely on five broadly agreed-upon BMI categories for transition: underweight (under 18 kg/m²), normal weight (from 18 to 25 kg/m²), overweight (from 25 to 30 kg/m²), obese (from 30 to 35 kg/m²), and morbidly obese (greater than or equal to 35 kg/m²) (Bhaskaran et al., 2014). While these categories simply reflect clinical standards, they are informative: clinical interventions generally rely on them as treatment thresholds, and previous research has demonstrated discontinuities in the cost of weight gain by category (Cawley & Meyerhoefer, 2011; Heymsfield et al., 2018). However, we also note the shortcomings of BMI and these categories more generally as a measure of adiposity, given that they lead to misclassification of people with high lean body mass or high adiposity but lower weights (Prentice & Jebb, 2001). The lack of available alternatives in national surveys means that these imperfect proxies are perhaps the best option. Five different iterations of the model (one for each BMI category one chooses) would exist in total. The effect of previous states on the current state without applying the Markov property can be represented by the conditional probability $ P\left({State}_n|{State}_{n-1},\dots, {State}_1\right) $.

Even over a relatively short period of time, this method of using all the data available quickly becomes intractable. For instance, if we considered each year of life separately, then after only 5 years we would need $ {5}^5 $ or 3125 past histories to compute current BMI using the past data available. Clearly, over a lifetime a more parsimonious method is required. Applying the Markov property, for which one needs only the most recent state to predict the current state, we could treat the 5 years as an interval, and this results in needing information for only $ {5}^2 $ or 25 past histories regardless of the length of time measured. This more manageable model is interested only in $ P\left({State}_n|{State}_{n-1}\right) $.
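The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the five categories are the ones listed in the text.

```python
# Count the conditioning histories needed with and without the Markov property.
N_CATEGORIES = 5  # underweight, normal weight, overweight, obese, morbidly obese


def full_history_count(n_categories: int, n_periods: int) -> int:
    """Distinct BMI-category sequences over n_periods when every past state matters."""
    return n_categories ** n_periods


def markov_pair_count(n_categories: int) -> int:
    """(previous state, current state) pairs needed under the Markov property."""
    return n_categories ** 2


for years in (5, 10, 20):
    print(years, full_history_count(N_CATEGORIES, years), markov_pair_count(N_CATEGORIES))
```

The full-history count grows exponentially with the number of periods, while the Markov count stays fixed at 25.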

We now consider whether it is reasonable to group BMI transitions over the course of a few years. One of the most remarkable facts of the current obesity epidemic is its insidious and persistent nature: over a few years, people generally do not gain much weight. But weight builds over long periods of time through a process called “age-related weight gain” (Burke et al., 1996; Gokee LaRose et al., 2010). Because of the relative intractability and consistency of age-related weight gain, a subject’s weight in the last time period likely correlates very strongly with their current weight (Must & Strauss, 1999). An adult with obesity or overweight rarely goes back down to a lower BMI category, and almost never sustains this weight loss, which makes their history of weight prior to the latest period largely irrelevant (Daviglus et al., 2004).

It is desirable to have the time interval between observations as short as possible to avoid multiple long-term weight transitions taking place between observations. Anywhere from 1 to 3 years should yield a sufficiently short time period to avoid multiple state transitions at once (Burke et al., 1996). This recommended range is based upon the assumption that permanent BMI changes only gradually from the previous time period, so multiple state transitions in such a short period of time seem unlikely. These state transitions allow the researcher to form a stochastic state transition probability matrix, from which the probability of shifting from one BMI category to another based on current BMI and age can be elucidated over the course of every subject’s life. As a result, using intervals of only a few years, one should be able to capture the vast majority of BMI state transitions, while also easing data requirements considerably.
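As a minimal sketch of how such a matrix could be formed for a single age group, the Python below row-normalizes counts of observed (current category, next category) pairs. The category labels, the function name, and the toy observations are assumptions made for illustration; they are not taken from any of the surveys discussed here.

```python
from collections import Counter

CATEGORIES = ["underweight", "normal", "overweight", "obese", "morbidly_obese"]


def transition_matrix(pairs, categories=CATEGORIES):
    """Row-normalized counts: entry [i][j] estimates P(next = j | current = i)."""
    counts = Counter(pairs)
    matrix = []
    for origin in categories:
        row_total = sum(counts[(origin, dest)] for dest in categories)
        if row_total == 0:
            # Nobody was observed starting in this category, so the row is
            # left as zeros rather than guessed.
            matrix.append([0.0] * len(categories))
        else:
            matrix.append([counts[(origin, dest)] / row_total for dest in categories])
    return matrix


# Made-up (current, next) observations for one age group, for illustration only.
toy_pairs = (
    [("normal", "normal")] * 8 + [("normal", "overweight")] * 2
    + [("overweight", "overweight")] * 7 + [("overweight", "obese")] * 3
    + [("obese", "obese")] * 9 + [("obese", "morbidly_obese")] * 1
)
P = transition_matrix(toy_pairs)
for label, row in zip(CATEGORIES, P):
    print(label, [round(p, 2) for p in row])
```

Each populated row sums to one. In practice, survey weights and sparse cells would need attention, which this sketch ignores.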

In contrast to much of the published literature, we recommend taking a nonparametric approach by allowing the data to dictate the formation of the state transition probability matrices for each age group. Alternative approaches must rely on previous estimates or a rigid functional form for a BMI-age curve. Using a rigid functional form for BMI-age can introduce misspecification bias. Using others’ estimates for this curve puts one at the mercy of bias from the imperfect study designs of others and unverifiable assumptions that often make less sense today than when those studies were current (Fernandes, 2010; Sonntag et al., 2015; Fallah-Fini et al., 2017). Some have argued the country’s recent secular trend in weight gain, wherein people of every age weigh more than they did previously, could introduce bias when employing a Markov model, as estimates may pick up aggregate rather than individual effects (Massachusetts Medical Society et al., 2017). However, because these curves will be constructed from observations of real people over time and, hence, will control for individual-invariant fixed effects, such estimates serve the purpose of accurately modeling real-world weight gain trajectories.
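Once a matrix has been estimated for each age group, chaining the matrices traces how a cohort’s BMI distribution evolves with age. The sketch below assumes three states and two made-up matrices purely for illustration; the function names are hypothetical.

```python
def step(distribution, matrix):
    """One Markov step: new[j] = sum over i of distribution[i] * matrix[i][j]."""
    n = len(distribution)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def project(distribution, matrices):
    """Apply a sequence of age-group-specific matrices; return every distribution."""
    path = [distribution]
    for matrix in matrices:
        distribution = step(distribution, matrix)
        path.append(distribution)
    return path


# Hypothetical states: normal, overweight, obese. Both matrices are made up.
P_young = [[0.90, 0.10, 0.00],
           [0.05, 0.85, 0.10],
           [0.00, 0.05, 0.95]]
P_middle = [[0.85, 0.15, 0.00],
            [0.02, 0.83, 0.15],
            [0.00, 0.03, 0.97]]

start = [1.0, 0.0, 0.0]  # a cohort that begins entirely at normal weight
path = project(start, [P_young, P_young, P_middle])
for period, dist in enumerate(path):
    print(period, [round(p, 4) for p in dist])
```

Because each matrix depends only on the age group and the current state, the projection needs no information about earlier weight history, which is the Markov property at work.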

3.2 Limitations of the Markov model in estimating age-related weight gain

There are several shortcomings of the Markov model that are important to consider. The use of the model is motivated by the relative lack of datasets combining lifetime cost and BMI information available in the US. If a dataset were to exist that allowed for an estimate of age-related weight gain over the entire lifespan of Americans as well as the associated medical costs, a curve created from the entire history of their weight gain trajectories would be more appropriate. The necessity for the Markov property to hold is the primary limitation of the model, as an individual’s past beyond the last few years could provide important information regarding their propensity to gain or lose weight over time. Additionally, the current iteration of the Markov model employed in the literature relies on data from a different nationally representative sample than the one from which the cost of obesity is estimated, which could result in a misstatement of the relationship between medical cost and weight gain over time.

3.3 Identifiability in estimating obesity’s lifetime costs

One of the main limitations of the current literature on obesity’s lifetime costs is that all of the studies rely on observational data, finding associations rather than causal estimates. Cawley and Meyerhoefer (2011) implement an instrumental variable approach to isolate exogenous variation in BMI in a cross-sectional estimate, an approach that could also be used on longitudinal data. Causal approaches to estimating obesity’s costs will likely prove vital given the tendency for poorer people and minorities to utilize healthcare at lower rates. Both of these groups tend to have a higher than average risk of obesity. This means that endogeneity in the estimation of obesity’s costs likely causes a severe underestimate of the true value. Therefore, Cawley and Meyerhoefer (2011) propose the use of an instrument, the weight of an adult’s oldest biological child, to isolate random variation in weight. While an instrumental variable approach estimates only a Local Average Treatment Effect (LATE) and so can lack generalizability to a larger population, it could prove a vital innovation for future causal estimates of obesity’s lifetime costs. The burgeoning literature on Mendelian randomization, where germline genetic variation acts as an instrumental variable, could also provide a valuable new tool through which to produce models that account for confounding and selection bias (Dixon et al., 2020; Kurz & Laxy, 2020).

3.4 Estimating differential life expectancies

Naturally, a lifetime cost estimate must account for the possibility that people with obesity do not live as long as their normal weight counterparts (Flegal et al., 2005; Finkelstein et al., 2010; Abdelaal et al., 2017). Time censoring and a skewed distribution make survival analysis, and therefore quantifying life expectancy, a difficult statistical issue that requires specific techniques (Clark et al., 2003). Unfortunately, the existing literature on the life expectancy penalty resulting from obesity relies largely on older data and has provided equivocal results. In fact, some studies even found an “obesity paradox” among Black males, wherein people with obesity outlive those of normal weight (Tucker et al., 2006). This likely reflects some form of endogeneity affecting both BMI status and life expectancy. It would be difficult to fully account for such an issue in any study. However, to provide an authoritative and more recent view of the subject, we propose using the most recent life expectancies available with proportional hazards generated from National Health Interview Survey (NHIS) data and official life tables.

While other common approaches to survival analysis exist (such as Kaplan-Meier curves), we recommend the Cox proportional hazards model both because of its ubiquity in medical research and because of its ability to account for other covariates, in particular smoking and age at entry into the dataset (Clark et al., 2003). Unlike other proportional hazards approaches, this model is estimated nonparametrically at baseline, which reduces the probability of misspecification. The only major assumption, proportional hazards, under which hazard rates between groups do not cross conditional on covariates, is verifiable. Even if this assumption is not met, one can interact the offending covariate with age to allow for time-dependent effects that may have caused nonproportionality (Bradburn et al., 2003).

Nevertheless, the proportional hazards approach has come under recent scrutiny for its trouble estimating small risks accurately and the improbability of its assumptions holding in sufficiently large samples (Moolgavkar et al., 2018; Stensrud & Hernán, 2020). We argue that in this context such concerns remain relatively unimportant because the relative risk of obesity at younger ages is considerable. Specifically, the relative risk of mortality for a person with obesity compared to a person of normal weight at younger ages tends to exceed 2, which is well beyond the “small risks” discussed by Moolgavkar et al. (2018).

The official U.S. life tables published by the National Center for Health Statistics (NCHS) provide age-specific death probabilities by gender (Arias & Xu, 2015). Unfortunately, these data do not account for specific BMI categories or smoking status, the largest health-behavior-based potential confounder in life expectancy. As a result, one can calculate hazards from NHIS data matched to the corresponding linked mortality files (LMFs), which draw on the National Death Index. The Cox proportional hazards model can be written as

(2) $$ h(t)={h}_0(t){e}^{\left({X}^{\prime}\beta \right)}, $$

where $ h(t) $ is the hazard of death at time $ t $, $ {h}_0\left(\cdot \right) $ is the baseline hazard, which is scaled by the exponential factor, and $ {X}^{\prime}\beta $ is a set of covariate controls. This presumes hazards at any particular age are proportional given a specific set of covariates. The output produced by a Cox proportional hazards model, a hazard ratio, illustrates the changed hazard of an outcome occurring from a change in characteristics. For instance, a hazard ratio of three for a person with obesity would suggest that they face triple the hazard of dying at any given time compared to a reference normal weight person with otherwise identical attributes. As a result, applying the Cox proportional hazards model to a life table allows a researcher both to control for potential confounders and to address directly the impact of obesity on life expectancy. Thus, in order to determine BMI’s effects on life expectancy by age, one must apply the hazard ratios of each BMI and smoking category (and any other important confounder) to the unadjusted probability of death at any age given by the life tables.
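The step of applying a hazard ratio to life-table death probabilities can be sketched as follows. Several things here are assumptions for illustration only: the death probabilities and the hazard ratio are made up, the function names are hypothetical, and the life-table probability is treated as the reference-group probability, which a careful application would need to justify. Under proportional hazards the adjusted survival is the baseline survival raised to the power of the hazard ratio, so the adjusted death probability is $ 1-{\left(1-q\right)}^{HR} $; for small $ q $ this is close to the simple product $ HR\times q $.

```python
def adjust_death_prob(q, hazard_ratio):
    """Under proportional hazards S_adj = S ** HR, so q_adj = 1 - (1 - q) ** HR."""
    return 1.0 - (1.0 - q) ** hazard_ratio


def expected_years_lived(death_probs):
    """Expected completed years lived over the window covered by death_probs."""
    alive = 1.0
    expected_years = 0.0
    for q in death_probs:
        alive *= (1.0 - q)       # probability of surviving through this year
        expected_years += alive  # each survived year adds one year, weighted
    return expected_years


# Made-up annual death probabilities for five consecutive ages
# (not taken from any published life table).
baseline_q = [0.010, 0.011, 0.012, 0.013, 0.014]
HAZARD_RATIO_OBESE = 2.0  # hypothetical value, for illustration only

obese_q = [adjust_death_prob(q, HAZARD_RATIO_OBESE) for q in baseline_q]
years_baseline = expected_years_lived(baseline_q)
years_obese = expected_years_lived(obese_q)
print(round(years_baseline, 4), round(years_obese, 4))
```

Extending the list of death probabilities to the end of the life table turns the windowed figure into a remaining life expectancy, and age-specific hazard ratios can be substituted for the single constant used here.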

Depending on the data source, one could find points at which the proportional hazards assumption would likely be violated (Fontaine et al., 2003; Finkelstein et al., 2008). There are two common methods for handling disproportionality: an interaction between the violating variable and time, and stratification by the violating variable (Grambsch & Therneau, 1994). Because stratification does not allow for estimation of a parameter value (and we need such a value to accurately assess life expectancy effects) and is less efficient given the artificial constraint on the information available to the researcher, we propose adding interaction terms between age and each violating variable. This adjustment also makes intuitive sense, as BMI’s impact differs markedly based on one’s age and time spent in each state, so age interactions should provide a more precise estimation of its impact on survival probability over time. To confirm this intuition, we propose conducting likelihood ratio tests for the interaction model and Wald tests for the joint significance of the interaction variables, and comparing the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) to assess goodness of fit. After fitting this model, one averages the association of obesity and mortality across smoking statuses, after which estimating a median life expectancy requires multiplying the probability of mortality in each year by the relative risk for people with obesity.
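As an illustration of the interaction approach, and not the exact specification used in any of the cited studies, a model with a single obesity indicator interacted with age could be written as

$$ h\left(t|X\right)={h}_0(t)\exp \left({\beta}_1\mathrm{Obese}+{\beta}_2\left(\mathrm{Obese}\times \mathrm{Age}\right)+{Z}^{\prime}\gamma \right), $$

where $ Z $ collects the remaining covariates (for example, smoking status) and the hazard ratio for obesity at a given age is $ \exp \left({\beta}_1+{\beta}_2\mathrm{Age}\right) $. The fit comparisons then take their standard forms,

$$ LR=2\left({\mathrm{\ell}}_1-{\mathrm{\ell}}_0\right),\quad AIC=2k-2\mathrm{\ell},\quad BIC=k\ln (n)-2\mathrm{\ell}, $$

where $ {\mathrm{\ell}}_0 $ and $ {\mathrm{\ell}}_1 $ are the maximized log partial likelihoods without and with the interaction terms, $ k $ is the number of estimated parameters, and $ n $ is the sample size. The $ LR $ statistic is compared to a chi-squared distribution with degrees of freedom equal to the number of added interaction terms, and lower AIC and BIC values indicate the preferred model.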

3.5 Limitations of the Cox proportional hazards model for life expectancy

There are several important limitations to applying the Cox proportional hazards model. Firstly, the Cox proportional hazards model makes an assumption, proportional hazards, where the hazard ratio between individuals remains constant over time, that is likely to fail in practice. While there are steps to remedy this, the assumption represents an imperfect parameterization of an individual’s survival and can result in statistical bias in the model. New techniques employing nonparametric methods for survival analysis could be explored by future researchers to relax this simplifying assumption. Perhaps the most important problem that plagues any survival model is the persistence of endogeneity, for which no simple solution exists. The wide variety of estimates of survival differences between people with obesity and normal weight people could be driven in part by differences between these individuals not attributable to obesity. Lastly, the impact of obesity on survival could be contingent on the existence of other medical conditions, like pre-existing heart issues, which means the true effect could exhibit significant heterogeneity.

3.6 Estimating obesity-related costs

One of the persistent concerns when modeling healthcare data is the large number of subjects who face no medical expenditures in a given year (Buntin & Zaslavsky, 2004). Additional methodological issues include the data’s strict non-negativity and, due to the presence of some patients with exceptionally high medical costs, the highly skewed nature of the data. The linear conditional expectation and normality assumptions typically specified in ordinary least squares regressions are severely violated, and the literature suggests a wide variety of potential solutions to these issues. These include log models, 2PM, and GLM. We propose the use of a 2PM that estimates the probability of any medical expenditures and then the total medical expenditures conditional on having any. We propose this 2PM both because of its reliance on more reasonable assumptions and because of its history of use in the existing literature, which allows for comparability (Thorpe et al., 2004; Wee et al., 2005; Yang & Hall, 2007; Bell et al., 2011; Cawley & Meyerhoefer, 2011; Trogdon et al., 2012). Without separately accounting for the possibility of no medical expenditure, the large number of subjects with no medical expenditures in any given year could severely bias the results.
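The way the two parts combine into a single expected cost can be sketched in a few lines. The sketch assumes a logit first part and a log-link second part, and every coefficient, covariate, and function name is hypothetical; in practice the coefficients would come from models fitted to expenditure data.

```python
import math


def sigmoid(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(coefs, x):
    """Linear predictor: sum of coefficient times covariate."""
    return sum(c * v for c, v in zip(coefs, x))


def expected_cost(x, logit_coefs, glm_coefs):
    """E[cost | x] = P(cost > 0 | x) * E[cost | cost > 0, x]."""
    prob_any = sigmoid(dot(logit_coefs, x))         # part 1: any expenditure
    mean_if_positive = math.exp(dot(glm_coefs, x))  # part 2: log-link mean
    return prob_any * mean_if_positive


# Hypothetical coefficients for [intercept, obesity indicator, age in decades].
LOGIT_COEFS = [0.5, 0.4, 0.2]
GLM_COEFS = [6.5, 0.3, 0.15]

x_obese = [1.0, 1.0, 5.0]
x_normal = [1.0, 0.0, 5.0]
incremental = (expected_cost(x_obese, LOGIT_COEFS, GLM_COEFS)
               - expected_cost(x_normal, LOGIT_COEFS, GLM_COEFS))
print(round(incremental, 2))
```

The incremental cost of obesity is the difference between the two predicted means, so it reflects both a higher chance of any spending and higher spending among those who spend.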

For the discrete part of the model, the dependent variable, whether a subject reports any medical expenditure, will have a binomial distribution (Nelson et al., 2008). In order to regress a binary variable, one can use a generalized linear model, in which a link function relates the expected value of the outcome (here, the probability of any expenditure) to a linear combination of the covariates. The two most popular versions of this model are the logit and probit models. There are relatively few practical differences between these models: the logistic distribution has a higher kurtosis (heavier tails) than the normal distribution, and the interpretations of the coefficients differ. There are also slight differences in model fit. Because of their similarity, we propose the use of a logit model due to its marginally superior fit and comparative ease of interpretation.

Determining an appropriate functional form for the second part of the model proves more nuanced. The two most commonly used modeling approaches are a GLM with a gamma distribution and log link and a logged ordinary least squares (OLS) regression. The discussion regarding this model choice can often feel murky. A helpful analogy for the distinction between a GLM and an OLS model is the difference between a rectangle and a square. An OLS model is a GLM that requires a normal (or, in this case, lognormal) distribution, just as a square is a rectangle that requires sides of equal length. Just as a rectangle does not require a square’s assumptions, a GLM in the general case can fit both normal and non-normal distributions. Thus, the GLM requires no retransformation from the log scale and has more relaxed assumptions than an OLS model, but it provides less statistical efficiency as a result of making fewer assumptions regarding functional form (Manning & Mullahy, 2001; Buntin & Zaslavsky, 2004).

The gamma distribution allows for the modeling of non-negative data without the need for a smearing retransformation. A smearing retransformation is needed because logging a variable shifts the distribution in a way that must be accounted for when returning to the original scale. Meanwhile, a logged OLS model pulls in the upper tail of the distribution, can account to some extent for the extreme range of healthcare data, and focuses solely on positive values. However, the logged OLS also requires a smearing retransformation, which could cause bias in the presence of heteroskedasticity (Buntin & Zaslavsky, 2004). We recommend that researchers inspect a histogram of the expenditure data and run Park tests to determine which distribution is best suited to the data, as this choice has varied across the literature and by cost source.
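As a rough illustration of the Park test mentioned above, the sketch below implements the modified version described by Manning and Mullahy (2001): the log of the squared raw-scale residual is regressed on the log of the predicted mean, and the slope indicates how the variance scales with the mean (approximately 0 for Gaussian, 1 for Poisson-like, 2 for gamma, and 3 for inverse Gaussian). The function names and toy data are hypothetical, and plain least squares is used for the slope only to keep the example dependency-free.

```python
import math

def park_slope(observed, predicted):
    """Slope from regressing log((y - yhat)^2) on log(yhat) by least squares."""
    xs = [math.log(p) for p in predicted]
    ys = [math.log((y - p) ** 2) for y, p in zip(observed, predicted)]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def suggested_family(slope):
    """Map the rounded slope to the conventional GLM variance family."""
    families = {0: "gaussian", 1: "poisson", 2: "gamma", 3: "inverse gaussian"}
    return families.get(round(slope), "no standard family")

# Toy data (hypothetical): residuals proportional to the predicted mean,
# so the variance grows with the square of the mean.
predicted = [100.0, 200.0, 400.0, 800.0, 1600.0]
observed = [1.5 * p for p in predicted]

slope = park_slope(observed, predicted)
print(round(slope, 6), suggested_family(slope))
```

With real expenditure data, the predicted means would come from a preliminary model fit, and the slope would rarely land exactly on an integer, so the result is best read alongside the histogram rather than as a mechanical rule.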

The choice of variables to control for depends on whether one applies the instrumental variables or the typical associative approach, but it typically includes education, smoking status, marital status, geographical region, insurance status, and age, which is often modeled nonlinearly. All costs should be inflated to the most recent year of the medical care component of the Consumer Price Index (CPI) and discounted to cohere with previous estimates.

3.7 Limitations of the 2PM for healthcare costs

While it remains the most widely used method to account for healthcare costs, the 2PM has several methodological shortcomings. First, as noted by Deb and Trivedi (2002), the 2PM’s division between non-users and users of the healthcare system poorly reflects the actual functioning of healthcare usage over time. Instead, a clearer division could be based on “frequent” and “infrequent” users of healthcare. This suggests that the 2PM’s estimates are contingent partly on the length of time of a given episode of disease that the dataset covers. Additionally, while the 2PM will produce consistent estimates due to being fit to the empirical distribution, its reliance on researchers to specify a parametric probability distribution could lead to misspecification (Mullahy, 1998).

4. Data requirements

4.1 Criteria for datasets in the age-related weight gain curve

Unfortunately, the lack of a recent, robust estimate of age-related weight gain stems primarily from the limitations of available longitudinal studies in America. Ideally, this longitudinal dataset would be nationally representative, recent, and cover subjects from early adolescence until their deaths. Because such a dataset simply does not exist in this country, we created a list of criteria to determine whether a dataset deserves inclusion in an age-related weight gain curve despite its shortcomings. Most importantly, the dataset should cover over 10 years of subjects’ lives, be relatively recent, have sufficient follow-up and low attrition rates, have a short time between observations, objectively measure height and weight, and be nationally representative. There are a variety of potential datasets suited to the task, including the Framingham Heart Study (FHS), CARDIA, the Health and Retirement Study (HRS), the Medicare Current Beneficiary Survey (MCBS), and the ARIC.

4.2 The Framingham Heart Study

Perhaps the most famous of the five major longitudinal population health studies in the USA, the Framingham Heart Study began with a predominantly white cohort in 1948 in Framingham, Massachusetts, and continues to this day (Splansky et al., 2007). Despite its biennial observations, FHS has several drawbacks, including covering too few minority subjects, representing only one city, and having recent observations only for middle-aged to older subjects. Despite these limitations, FHS provides a robust dataset, with over 5000 subjects in the initial cohort alone, that is well suited for modeling BMI transitions across all age groups (Oster et al., 1999).

4.3 Coronary Artery Risk Development in Young Adults (CARDIA)

In an attempt to create a more representative sample, CARDIA, which began in 1985 and ended in 2005 with its fifth and final examination, observes the progression of coronary artery disease in four population centers: Birmingham, Chicago, Minneapolis, and Oakland (Friedman et al., 1988). The study enrolled over 5000 Black and white men and women from a variety of regional and sociodemographic situations, with 72 % of the group remaining in the study until 2005. Because CARDIA is one of the few nationally representative datasets available, its inclusion in an age-related weight gain curve estimation is practically obligatory. CARDIA focuses predominantly on the period from 18 years of age until middle age.

4.4 Health and Retirement Study (HRS)

The HRS is a longitudinal panel study of over 37,000 Americans aged 50 and older from 23,000 households. Conducted biennially since 1992, it contains a range of information on health insurance, health, employment, genetic data, and Medicare cost files (Sonnega et al., 2014). It provides objectively measured height and weight, as well as costs for Medicare recipients, which makes it particularly useful for modeling the BMI trajectory and costs beyond age 65.

4.5 Medicare Current Beneficiary Survey (MCBS)

The MCBS has been conducted for over 25 years and contains longitudinal BMI data for Medicare recipients. It can be linked to Medicare Fee for Service Beneficiary Claims Data to provide detailed information on medical costs and the specifics of an individual’s healthcare utilization. Much like the HRS, the MCBS provides an opportunity to apply cost and BMI data from one dataset, but it includes only Americans aged 65 and older (Adler, 1994).

4.6 Atherosclerosis Risk in Communities Study (ARIC)

Similar to CARDIA, ARIC takes subjects from four population centers: Minneapolis, Minnesota; Hagerstown, Maryland; Forsyth County, North Carolina; and Jackson, Mississippi (Chambless et al., 1997). One of the largest longitudinal population health datasets in American history, ARIC, which began in 1987 and has conducted five examinations to date, boasts over 15,000 subjects pulled relatively evenly from each of these centers. Because of its size, recency, and national representation, ARIC should be included in the estimation of the curve as well. The dataset focuses primarily on subjects aged 45–64. Unfortunately, ARIC switched to phone interviews in 1998, at which point weight and height became self-reported and no longer fit the criteria for inclusion outlined above. Still, because ARIC provides a unique age range and over a decade of objectively measured BMI, and consists of a more nationally representative sample than other datasets, it warrants inclusion until the 1998 survey.

4.7 Differential life expectancies dataset

An ideal dataset to measure life expectancy differences between people with obesity and their normal weight peers would be nationally representative, contain the covariates outlined in Equation (6), rely on objectively measured height and weight, and would account for time spent with obesity. No dataset in the USA comes close to meeting all these parameters. However, the NCHS has created linked files between the National Death Index and NHIS interview files detailing all the covariates of interest. This dataset is nationally representative and contains the variables necessary for estimation; however, height and weight are self-reported, and time spent with obesity is not factored into its impact on life expectancy.

4.8 Public use NHIS and corresponding LMFs

Commissioned by the U.S. Census Bureau, the NHIS studies a range of health behaviors and characteristics. Recently, the NCHS has made public use LMFs available through the year 2014 that utilize data from the NHIS to link to files from the comprehensive National Death Index (Lochner et al., 2008; Center for Health Statistics, 2015). As a result, one can use data from the years 1997 to 2014 in an effort to update recent work on life expectancy that relied primarily on data from the 1990s. Although public use data are subject to data perturbation for anomalous causes of death and to location censoring, a predominant focus on only vital status renders these limitations bearable. A researcher’s primary concern in using these data is self-reported height and weight, but this limitation remains a stumbling block for any study of obesity’s effect on life expectancy.

4.9 National lifetables

To provide a basis from which to create BMI-specific life expectancies, one should use the official 2015 U.S. Lifetables (the most recent available as of this writing), which are provided by the NCHS and separated by gender. Applying the hazard ratios associated with different levels of BMI to these estimates then allows one to discern the impact of BMI on life expectancy separately by gender (Arias & Xu, 2015).

The complex survey design and clustering of the NHIS dataset necessitate the use of complex survey design commands, which are intuitive to use in Stata (or similar software packages). However, because the sample design changed in 2006, one must also alter the strata and primary sampling units to maintain statistical independence between these differing sampling plans. Additionally, in order to pool nationally representative data, one must divide the weighting variable by the number of years in the pool. The result is a survey of 500,121 respondents, of whom 61,552 died by the year 2015.

5. Obesity cost model dataset

Because of obesity’s chronic and latent nature, the dataset used for expenditure would ideally include time spent with the disease and corresponding costs (Fallah-Fini et al., 2017). It should also differentiate between third-party payer and out-of-pocket costs because obesity’s cost to society is reflected most accurately by how it strains outside payers in the healthcare system instead of an individual’s budget. The dataset should also be nationally representative and have objectively measured weight and height. While the MEPS uses self-reported height and weight and only follows subjects for 2 years, it comes closest to these ideals through its separation of costs by payer, national representation, large number of subjects, and some of the most detailed cost data in the country. We therefore recommend the use of this dataset for cost estimates over a lifetime. However, for studies focusing exclusively on older Americans and for researchers focused on Medicare expenses, there are better options, including the HRS and MCBS, that actually track BMI and costs over time.

Conducted since 1996, MEPS is the most detailed analysis of healthcare cost and utilization among noninstitutionalized Americans presently available. Far and away the most commonly used dataset for U.S. medical expenditure studies, MEPS consists of a 2-year panel design in which subjects report on diseases, health care costs, payment methods, and hundreds of other items. In order to account for the self-reporting of height and weight, we suggest eliminating biologically implausible BMIs, which the WHO defines as z-BMIs in excess of positive or negative four (World Health Organization, 1995).

Researchers should use the most recent data available and inflate all costs to the most recent year of the Medical Component of the CPI, which presently is 2016 dollars (Consumer Price Index, 2019). Because external medical care costs focus on third-party payers, one must next remove out-of-pocket costs from total expenditures. MEPS makes use of a stratified multi-stage probability design to ensure subjects receive weights that make them nationally representative (Lichtenberg, 2001). This cluster design, in which like subjects are grouped into strata, violates the assumption of independent and identically distributed observations that is fundamental to the traditional calculation of standard errors and, if ignored, results in biased statistical inference (Dohoo et al., 2003).
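The two cost adjustments described above, removing out-of-pocket payments and inflating to a common dollar year, amount to simple arithmetic. The sketch below is illustrative only: the function name is hypothetical and the index values are placeholders, not actual CPI figures.

```python
def external_cost_in_target_dollars(total, out_of_pocket, cpi_source, cpi_target):
    """Third-party cost in target-year dollars.

    total and out_of_pocket are in source-year dollars; cpi_source and
    cpi_target are medical care CPI index levels for the two years.
    """
    if out_of_pocket > total:
        raise ValueError("out-of-pocket cost cannot exceed total expenditure")
    third_party = total - out_of_pocket
    return third_party * cpi_target / cpi_source

# Placeholder values (hypothetical): 1,200 in total spending, 200 paid
# out of pocket, and index levels of 400 (source year) and 440 (target year).
adjusted = external_cost_in_target_dollars(1200.0, 200.0, 400.0, 440.0)
print(adjusted)
```

In practice the index levels would be taken from the Bureau of Labor Statistics series for the medical care component, matched to each survey year.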

To account for this cluster design, we can use Stata’s complex survey design tools, which correct for the unorthodox sampling plan. We suggest centering singleton Primary Sampling Units, which arise from subsetting the data, at the overall sample mean to allow for variance estimation (NHIS – Singleton PSU Reference Information, 2019). Additionally, because data is pooled from 2014 and 2016, one can apply a standard correction, recommended both by the CDC and in William G. Cochran’s seminal book Sampling Techniques, of dividing the weights by the number of years pooled (Cochran, 1977). This approach also makes intuitive sense: each survey represents the entire nation, so without proper adjustment any additional year added would cause the weighted observations to sum to a multiple of the country’s population.
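The weight correction for pooled survey years can be sketched as follows. The data structure, function name, and weights are hypothetical; the only substantive step is dividing each person-weight by the number of survey years pooled so that the pooled weights again sum to roughly one national population.

```python
def pool_weights(weights_by_year):
    """Divide every person-weight by the number of survey years pooled."""
    n_years = len(weights_by_year)
    return {
        year: [w / n_years for w in weights]
        for year, weights in weights_by_year.items()
    }

# Toy weights (hypothetical): each survey year on its own sums to a
# "population" of 4,000.
weights = {
    2014: [1000.0, 3000.0],
    2016: [2000.0, 2000.0],
}

pooled = pool_weights(weights)
pooled_total = sum(sum(ws) for ws in pooled.values())
print(pooled_total)
```

Without the division, the two toy years would sum to 8,000, twice the population each survey is meant to represent.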

6. An empirical example

We provide an example of the powerful effect of accounting for age-related weight gain on a lifetime estimate of the third-party medical care cost of being a person with obesity as an adolescent. Specifically, we estimate the external cost over the life course of being a person with obesity at age 20 relative to being normal weight at age 20 accounting for age-related weight gain and differential life expectancies. This requires specifying an age-related weight gain, cost, and life expectancy model, which are elucidated below. The full cost of early life obesity depends in part on its effect on future BMI trajectory and mortality and the full scope of external costs over a lifetime is most relevant to policymakers. This approach provides a realistic and practical way to produce an actionable obesity cost estimate.

We apply the CARDIA, ARIC, and FHS datasets to estimate age-related weight gain from ages 20 to 75. It is important to note that, due to data sparsity issues, we were unable to separate the obese category from the morbidly obese category. This is a significant limitation given the far higher costs borne by the morbidly obese in the literature, and researchers should explore ways to rectify this shortcoming. We thus estimate the Markov process represented by Equation (1).
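To make the mechanics of a first-order Markov model of BMI categories concrete, the sketch below projects a cohort’s distribution across weight categories forward in time. The transition probabilities are invented purely for illustration and are not estimates from CARDIA, ARIC, or FHS; the four states follow the text in combining the obese and morbidly obese categories.

```python
STATES = ["underweight", "normal", "overweight", "obese"]

# Hypothetical one-period transition probabilities; row i gives the
# probability of moving from state i to each state, and each row sums to 1.
TRANSITIONS = [
    [0.80, 0.20, 0.00, 0.00],
    [0.02, 0.83, 0.15, 0.00],
    [0.00, 0.05, 0.80, 0.15],
    [0.00, 0.00, 0.05, 0.95],
]

def step(dist, matrix):
    """Advance the state distribution by one period."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def project(dist, matrix, periods):
    """Apply the same transition matrix for a number of periods."""
    for _ in range(periods):
        dist = step(dist, matrix)
    return dist

start = [0.0, 1.0, 0.0, 0.0]  # a cohort that is entirely normal weight at age 20
after_one = step(start, TRANSITIONS)
after_ten = project(start, TRANSITIONS, 10)
print(dict(zip(STATES, after_ten)))
```

A single time-invariant matrix is the simplest case; age-specific matrices could be swapped in for each period without changing the structure of the calculation.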

In order to estimate the external medical care cost of obesity, we use the MEPS dataset. Because we rely on publicly available MEPS data for cost estimation, which provides neither sibling nor genetic information, the instrumental variable approaches discussed earlier are not possible. As a result, this estimate faces endogeneity concerns and should be considered only the best possible associational estimate. The first part of the 2PM used to estimate costs is given by

(3) $$ P\left({Y}_i>0\right)=\frac{e^{X^{\prime}\beta }}{1+{e}^{X^{\prime}\beta }}, $$

where $ {Y}_i $ is external medical care costs and $ {X}^{\prime}\beta ={\beta}_0+{\sum}_{j=1}^3{\beta}_j{BMICategories}_i+{\beta}_4{Education}_i+{\beta}_5{Rural}_i+{\beta}_6{Smoker}_i+{\beta}_7{InsurStat}_i+{\sum}_{j=8}^{10}{\beta}_j{MaritalStat}_i+{\beta}_{11}{Region}_i+{\beta}_{12}{Age}_i+{\sum}_{j=13}^{15}{\beta}_j{BMICat}_i\ast {Age}_i+{\beta}_{16}{Age}_i^2+{\varepsilon}_i $. BMI categories consist of underweight, normal weight, and overweight (obesity is the excluded category), and marital status consists of single, married, divorced, and widowed. This is a logit model predicting the probability of any medical care expenditure. The second part of the model is a logged OLS specification with a smearing retransformation, used to estimate medical care costs conditional on costs being greater than zero in (3), and can be represented as

(4) $$ \log \left({Y}_i\mid {Y}_i>0\right)={X}^{\prime}\beta $$

Both models rely on the same set of covariates described above, which is standard practice for 2PMs.
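The way the two parts combine into a single predicted cost can be sketched as follows. Under a 2PM with a logged OLS second part, the expected cost is the probability of any expenditure from Equation (3) multiplied by the retransformed prediction from Equation (4). The retransformation here uses Duan’s smearing factor (the mean of the exponentiated log-scale residuals), which assumes homoskedastic errors. All function names and input values are hypothetical; the two linear indices are passed separately because each part has its own coefficient estimates.

```python
import math

def prob_any_cost(xb_logit):
    """Equation (3): logistic probability of any expenditure.

    1 / (1 + exp(-xb)) is algebraically the same as exp(xb) / (1 + exp(xb)).
    """
    return 1.0 / (1.0 + math.exp(-xb_logit))

def smearing_factor(log_residuals):
    """Duan's smearing estimate: mean of exponentiated log-scale residuals."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)

def expected_cost(xb_logit, xb_log, smear):
    """Unconditional expected cost: P(Y > 0) * E[Y | Y > 0]."""
    return prob_any_cost(xb_logit) * math.exp(xb_log) * smear

# Hypothetical inputs: a linear index of 0 in the logit part, a log-scale
# prediction of log(2000) in the second part, and three toy residuals.
smear = smearing_factor([-0.5, 0.0, 0.5])
cost = expected_cost(0.0, math.log(2000.0), smear)
print(round(smear, 4), round(cost, 2))
```

The smearing factor exceeds one whenever the residuals vary, which is why simply exponentiating the log-scale prediction understates mean costs.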

Lastly, to account for the potential of differential life expectancies by BMI status we use the LMFs provided by the NCHS and the most recent U.S. lifetable. We apply a Cox proportional hazards model run separately by gender and stratified by smoking status,

(5) $$ h(t)={h}_0(t){e}^{\left({Z}^{\prime}\beta \right)} $$

where $ {Z}^{\prime}\beta $ includes age, BMI category, and smoking status as covariates.
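One way to apply a hazard ratio from Equation (5) to a life table is sketched below. Under proportional hazards, survival for a group with hazard ratio HR equals baseline survival raised to the power HR, so each annual survival probability in the table can be adjusted in the same way. Treating the national life table as the baseline, the constant toy death probabilities, and the function names are all assumptions made for illustration.

```python
def adjust_death_probs(qx, hazard_ratio):
    """Adjust annual death probabilities: p_adj = p ** HR, with p = 1 - q."""
    return [1.0 - (1.0 - q) ** hazard_ratio for q in qx]

def median_survival_age(start_age, qx):
    """First age at which cumulative survival from start_age is 0.5 or less."""
    survival = 1.0
    for offset, q in enumerate(qx):
        survival *= 1.0 - q
        if survival <= 0.5:
            return start_age + offset + 1
    return None  # survival never falls to 0.5 within the table

# Toy life table (hypothetical): a constant 2 % annual death probability
# from age 20 onward, and a hypothetical hazard ratio of 1.5.
baseline_qx = [0.02] * 80
adjusted_qx = adjust_death_probs(baseline_qx, 1.5)

baseline_median = median_survival_age(20, baseline_qx)
adjusted_median = median_survival_age(20, adjusted_qx)
print(baseline_median, adjusted_median)
```

With an actual life table, the death probabilities rise with age and the hazard ratios would be taken from the fitted Cox model for each gender and BMI category.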

It is important to note that this empirical example demonstrates best practices for creating a lifetime associative estimate of obesity’s cost. If one favored better identifiability, desired to focus on Medicare enrollees exclusively, or only emphasized costs at a particular point in time, there are other techniques and datasets available that would provide a more robust estimate unaffected by the Markov assumption or persistent endogeneity.

As Table 1 demonstrates, the external costs from the 2PM are discounted at a 3 % rate and summed up to the point at which the subgroup reaches its median life expectancy. After accounting for age-related weight gain and differential life expectancy, a male with obesity at 20 years of age faces $16,091.25 in excess lifetime external medical care costs relative to being normal weight, ceteris paribus, with a 95 % confidence interval derived from nonparametric bootstrapping of $13,987.09 to $18,195.37. An average female with obesity at age 20 faces excess lifetime external costs of $27,181.24 ($22,357.71–$31,801.53) relative to her normal weight peers. Clearly, a reduction in obesity in early life, as well as the maintenance of normal weight status thereafter, could produce substantial cost savings (Tables 2–4).
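The discounting step can be illustrated with a short sketch. It takes a stream of annual excess external costs (the predicted cost for the group starting with obesity minus that for the group starting at normal weight) and sums them at a 3 % discount rate. The function name and cost stream are hypothetical, and treating the first year as undiscounted is an assumption of this sketch.

```python
def discounted_excess_cost(excess_by_year, rate=0.03):
    """Present value of annual excess costs; year 0 is left undiscounted."""
    return sum(
        cost / (1.0 + rate) ** year
        for year, cost in enumerate(excess_by_year)
    )

# Toy stream (hypothetical): excess costs that grow at exactly the discount
# rate, so each year contributes about 100 in present value.
excess = [100.0, 103.0, 106.09]
present_value = discounted_excess_cost(excess)
print(round(present_value, 2))
```

In a full calculation the stream would run from age 20 to the subgroup’s median life expectancy, with each year’s excess cost weighted by the projected distribution across BMI categories.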

Table 1 Derivation of total costs after discounting. Note: 95 % confidence intervals in parentheses. Abbreviation: BMI, body mass index.

Table 2 Regression results for males. Standard errors in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.

Table 3 Predicted costs at different ages for normal weight males (before discounting). ***p < 0.01, **p < 0.05, *p < 0.1.

Table 4 Predicted costs at different ages for males with obesity (before discounting). ***p < 0.01, **p < 0.05, *p < 0.1.

Disease prevention measures, even when cost-effective, only rarely result in actual cost savings (Cohen et al., 2008). For instance, the CDC explains that a “cost-effective” intervention is generally regarded as any intervention that costs less than $50,000–$100,000 per Quality Adjusted Life Year (QALY) saved, where the QALY is a measure used to quantify health benefits that factors in life expectancy and subjective life quality. A cost-saving intervention, on the other hand, is one that actually costs less than the current status quo. Because obesity results in significant costs and disability, interventions targeting it have the opportunity to be either cost-effective or cost-saving.

By assigning a dollar figure to the excess external medical care costs from obesity over a person’s entire life, not just a single-year snapshot, and accounting for the biological realities of weight gain and premature death from obesity, we can discern when interventions are cost-saving, cost-effective, or budget neutral. Potential interventions that prove efficacious and provide cost savings to the US’s already overburdened healthcare system should be given priority, particularly given growing dissatisfaction with the increasing costs of healthcare (Hempstead 2012). However, there are numerous “indirect social costs” of obesity not discussed in this analysis. Several studies have attempted to quantify these, with little agreement on methods or results.

There exist a variety of social costs from obesity beyond medical care, including increased disability, excess mortality, and absenteeism and presenteeism at work that decrease productivity. Studies on these “indirect” costs of obesity generally exhibit even greater heterogeneity than those in the medical care cost literature, with relatively few longitudinal or well-identified studies on the subject and an enormous range of estimates (Tremmel et al., 2017). These shortcomings stem in part from data limitations; however, there are also numerous methodological differences between the studies, which use anything from a population attributable fraction approach to instrumental variables to microsimulation models (Goettler et al., 2017). Additionally, only six studies in the current literature cover more than 1 year of time, which means the long-term indirect costs of obesity remain largely unexplored. The range of costs in the literature is so large as to be uninformative: absenteeism, for example, has estimated costs ranging from $8 to $1,586 annually, and estimates of excess mortality and its associated costs also diverge significantly (Neovius et al., 2012; Goettler et al., 2017). While these costs are not borne directly by the healthcare system, they are important in understanding the full extent of externalities created by obesity.

7. Conclusion

As more comprehensive and recent datasets become available for the estimation of age-related weight gain and obesity’s lifetime costs, researchers should be able to use this article as a guide on how to create these models and what data limitations to consider. While currently only a few articles on obesity’s costs make any attempt to account for age-related weight gain, we hope that this explication of the first-order Markov model will inspire further work in a field in desperate need of definitive answers.

As pre-specification and transparency become more common in the selection and implementation of econometric models, economists will be better able to replicate and assess the robustness of estimates found in other work, in a process that should increasingly mirror the one that has existed for decades in the physical sciences. As a result, methodological articles such as this one will provide a crucial step in the creation of complete transparency between reviewers, researchers, and stakeholders in the medical field. Because of the record high prevalence of obesity and overweight in the USA, as well as sharply increasing trends in many lower- and middle-income countries, understanding weight gain’s evolution and the underpinnings of the obesity epidemic remains as important as ever. Estimating obesity’s impact on costs, and stating the assumptions from which this estimate is derived, is a crucial first step in understanding how obesity burdens the healthcare system and in the design of efficacious, and cost-saving, solutions.

Financial Support

Robert Schell received NIH funding from the NIA as a T32 training grant: T32-AG000246.

Disclosure

The authors declare no conflict of interest.

References

Abdelaal, Mahmoud, Roux, Carel W le, and Docherty, Neil G. 2017. “Morbidity and Mortality Associated with Obesity.” Annals of Translational Medicine, 5(7): 161. https://doi.org/10.21037/atm.2017.03.107.CrossRefGoogle ScholarPubMed
Adler, G. S. 1994. “A Profile of the Medicare Current Beneficiary Survey.” Health Care Financing Review, 15(4): 153163.Google ScholarPubMed
Arias, Elizabeth, and Xu, Jiaquan. 2015. “NVSR67 No7 United States Life Tables, 2015.”Google Scholar
Bell, Janice F., Zimmerman, Frederick J., Arterburn, David E., and Maciejewski, Matthew L.. 2011. “Health-Care Expenditures of Overweight and Obese Males and Females in the Medical Expenditures Panel Survey by Age Cohort.” Obesity, 19(1): 228232. https://doi.org/10.1038/oby.2010.104.CrossRefGoogle ScholarPubMed
Bhaskaran, Krishnan, Douglas, Ian, Forbes, Harriet, dos-Santos-Silva, Isabel, Leon, David A, and Smeeth, Liam. 2014. “Body-Mass Index and Risk of 22 Specific Cancers: A Population-Based Cohort Study of 5·24. Million UK Adults.” The Lancet, 384(9945): 755765. https://doi.org/10.1016/S0140-6736(14)60892-8.CrossRefGoogle ScholarPubMed
Bhattacharya, Jay, and Sood, Neeraj. 2011. “Who Pays for Obesity?Journal of Economic Perspectives, 25(1): 139158. https://doi.org/10.1257/jep.25.1.139.CrossRefGoogle ScholarPubMed
Bradburn, M. J., Clark, T. G., Love, S. B., and Altman, D. G.. 2003. “Survival Analysis Part III: Multivariate Data Analysis – Choosing a Model and Assessing Its Adequacy and Fit.” British Journal of Cancer, 89: 605611. https://doi.org/10.1038/sj.bjc.6601120.CrossRefGoogle Scholar
Buntin, Melinda Beeuwkes, and Zaslavsky, Alan M.. 2004. “Too Much Ado about Two-Part Models and Transformation?: Comparing Methods of Modeling Medicare Expenditures.” Journal of Health Economics, 23(3): 525542. https://doi.org/10.1016/J.JHEALECO.2003.10.005.CrossRefGoogle ScholarPubMed
Burke, Gregory L., Bild, Diane E., Hilner, Joan E., Folsom, Aaron R., Wagenknecht, Lynne E., and Sidney, Stephen. 1996. “Differences in Weight Gain in Relation to Race, Gender, Age and Education in Young Adults: The CARDIA Study.” Ethnicity & Health, 1(4): 327335. https://doi.org/10.1080/13557858.1996.9961802.CrossRefGoogle ScholarPubMed
Burton, Wayne N., Chen, Chin-Yu, Schultz, Alyssa B., and Edington, Dee W.. The economic costsassociated with body mass index in a workplace. Journal of Occupational & Environmental Medicine, 40(9): 786–92.CrossRefGoogle Scholar
Cawley, John, and Meyerhoefer, Chad. 2011. “The Medical Care Costs of Obesity: An Instrumental Variables Approach.” Journal of Health Economics, 31: 219230. https://doi.org/10.1016/j.jhealeco.2011.10.003.CrossRefGoogle ScholarPubMed
Center for Health Statistics, National. 2015. “Public-Use 2015 Linked Mortality File Description.”Google Scholar
Chambless, L. E., Heiss, G., Folsom, A. R., Rosamond, W., Szklo, M., Sharrett, A. R., and Clegg, L. X.. 1997. “Association of Coronary Heart Disease Incidence with Carotid Arterial Wall Thickness and Major Risk Factors: The Atherosclerosis Risk in Communities (ARIC) Study, 1987–1993.” American Journal of Epidemiology, 146(6): 483494. https://doi.org/10.1093/oxfordjournals.aje.a009302.CrossRefGoogle ScholarPubMed
Clark, T. G., Bradburn, M. J., Love, S. B., and Altman, D. G.. 2003. “Survival Analysis Part I: Basic Concepts and First Analyses.” British Journal of Cancer, 89: 232238. https://doi.org/10.1038/sj.bjc.6601118.CrossRefGoogle ScholarPubMed
Cochran, William G. 1977. Sampling Techniques. 3rd ed. New York: Wiley.Google Scholar
Consumer Price Index. 2019. “CPI Databases : U.S. Bureau of Labor Statistics.”Google Scholar
Daviglus, Martha L., Liu, Kiang, Yan, Lijing L., Pirzada, Amber, Manheim, Larry, Manning, Willard, Garside, Daniel B., et al. 2004. “Relation of Body Mass Index in Young Adulthood and Middle Age to Medicare Expenditures in Older Age.” JAMA, 292(22): 2743. https://doi.org/10.1001/jama.292.22.2743.CrossRefGoogle ScholarPubMed
Deb, Partha, and Trivedi, Pravin K.. 2002. “The Structure of Demand for Health Care: Latent Class versus Two-Part Models.” Journal of Health Economics, 21(4): 601625. https://doi.org/10.1016/S0167-6296(02)00008-5.CrossRefGoogle ScholarPubMed
Dixon, Padraig, Hollingworth, William, Harrison, Sean, Davies, Neil M., and Smith, George Davey. 2020. “Mendelian Randomization Analysis of the Causal Effect of Adiposity on Hospital Costs.” Journal of Health Economics, 70: 102300. https://doi.org/10.1016/j.jhealeco.2020.102300.CrossRefGoogle ScholarPubMed
Dohoo, Ian R., Martin, Wayne S., and Stryhn, Henrik. 2003. Veterinary Epidemiologic Research. 1st ed. Charlottetown: AVC Inc.Google Scholar
Fallah-Fini, Saeideh, Adam, Atif, Cheskin, Lawrence J., Bartsch, Sarah M., and Lee, Bruce Y.. 2017. “The Additional Costs and Health Effects of a Patient Having Overweight or Obesity: A Computational Model.” Obesity, 25(10): 18091815. https://doi.org/10.1002/oby.21965.CrossRefGoogle ScholarPubMed
Fernandes, Meenakshi Maria. 2010. “Evaluating the Impacts of School Nutrition and Physical Activity Policies on Child Health.”Google Scholar
Finkelstein, Eric A., Brown, Derek S., Wrage, Lisa A., Allaire, Benjamin T., and Hoerger, Thomas J.. 2010. “Individual and Aggregate Years-of-Life-Lost Associated With Overweight and Obesity.” Obesity, 18 (2): 333339. https://doi.org/10.1038/oby.2009.253.CrossRefGoogle ScholarPubMed
Finkelstein, Eric A., Fiebelkorn, Ian C., and Wang, Guijing. 2003. “National Medical Spending Attributable to Overweight and Obesity: How Much, and Who’s Paying?Health Affairs (Project Hope), Suppl Web Exclusives (December). https://doi.org/10.1377/hlthaff.w3.219.CrossRefGoogle ScholarPubMed
Finkelstein, Eric A., Trogdon, Justin G., Brown, Derek S., Allaire, Benjamin T., Dellea, Pam S., and Kamal-Bahl, Sachin J.. 2008. “The Lifetime Medical Cost Burden of Overweight and Obesity: Implications for Obesity Prevention.” Obesity, 16(8): 18431848. https://doi.org/10.1038/oby.2008.290.CrossRefGoogle ScholarPubMed
Flegal, Katherine M., Graubard, Barry I., Williamson, David F., and Gail, Mitchell H. 2005. “Excess Deaths Associated With Underweight, Overweight, and Obesity.” JAMA, 293(15): 1861. https://doi.org/10.1001/jama.293.15.1861.
Fontaine, Kevin R., Redden, David T., Wang, Chenxi, Westfall, Andrew O., and Allison, David B. 2003. “Years of Life Lost Due to Obesity.” JAMA, 289(2): 187. https://doi.org/10.1001/jama.289.2.187.
Fosler-Lussier, Eric. 1998. “Markov Models and Hidden Markov Models: A Brief Tutorial.”
Friedman, Gary D., Cutter, Gary R., Donahue, Richard P., Hughes, Glenn H., Hulley, Stephen B., Jacobs, David R., Liu, Kiang, and Savage, Peter J. 1988. “Cardia: Study Design, Recruitment, and Some Characteristics of the Examined Subjects.” Journal of Clinical Epidemiology, 41(11): 1105–1116. https://doi.org/10.1016/0895-4356(88)90080-7.
Goettler, Andrea, Grosse, Anna, and Sonntag, Diana. 2017. “Productivity Loss Due to Overweight and Obesity: A Systematic Review of Indirect Costs.” BMJ Open, 7(10): e014632. https://doi.org/10.1136/bmjopen-2016-014632.
Gokee LaRose, J., Tate, D. F., Gorin, A. A., and Wing, R. R. 2010. “Preventing Weight Gain in Young Adults: A Randomized Controlled Pilot Study.” American Journal of Preventive Medicine, 39(1): 63–68. https://doi.org/10.1016/j.amepre.2010.03.011.
Grambsch, Patricia M., and Therneau, Terry M. 1994. “Proportional Hazards Tests and Diagnostics Based on Weighted Residuals.” Biometrika, 81(3): 515–526. https://doi.org/10.1093/biomet/81.3.515.
Hall, Kevin D. 2007. “Body Fat and Fat-Free Mass Inter-Relationships: Forbes’s Theory Revisited.” British Journal of Nutrition, 97(6): 1059–1063. https://doi.org/10.1017/S0007114507691946.
Hamilton, J. D. 1994. Time Series Analysis. Princeton, NJ: Princeton University Press.
Heo, Moonseong, Faith, Myles S., Mott, John W., Gorman, Bernard S., Redden, David T., and Allison, David B. 2003. “Hierarchical Linear Models for the Development of Growth Curves: An Example with Body Mass Index in Overweight/Obese Adults.” Statistics in Medicine, 22(11): 1911–1942. https://doi.org/10.1002/sim.1218.
Heymsfield, Steven, Aronne, Louis J., Eneli, Ihuoma, Kumar, Rekha, Michalsky, Marc, Walker, Elizaveta, Wolfe, Bruce M., Woolford, Susan J., and Yanovski, Susan. 2018. “Clinical Perspectives on Obesity Treatment: Challenges, Gaps, and Promising Opportunities.” NAM Perspectives, 8(9): 480–486. https://doi.org/10.31478/201809b.
Thorpe, Kenneth E., Florence, Curtis S., Howard, David H., and Joski, Peter. 2004. “Trends: The Impact Of Obesity On Rising Medical Spending.” Health Affairs, 543–551. https://doi.org/10.1377/hlthaff.w4.480.
Kurz, Christoph F., and Laxy, Michael. 2020. “Application of Mendelian Randomization to Investigate the Association of Body Mass Index with Health Care Costs.” Medical Decision Making, 40(2): 156–169. https://doi.org/10.1177/0272989X20905809.
Lay, David C., Lay, Steven R., and McDonald, Judith. 2016. Linear Algebra and Its Applications. 5th ed. Boston: Pearson.
Lichtenberg, Frank R. 2001. “Are The Benefits Of Newer Drugs Worth Their Cost? Evidence From The 1996 MEPS.” Health Affairs, 20(5): 241–251. https://doi.org/10.1377/hlthaff.20.5.241.
Lochner, Kimberly, Hummer, Robert A., Bartee, Stephanie, Wheatcroft, Gloria, and Cox, Christine. 2008. “The Public-Use National Health Interview Survey Linked Mortality Files: Methods of Reidentification Risk Avoidance and Comparative Analysis.” American Journal of Epidemiology, 168(3): 336–344. https://doi.org/10.1093/aje/kwn123.
Ma, Sai, and Frick, Kevin D. 2011. “A Simulation of Affordability and Effectiveness of Childhood Obesity Interventions.” Academic Pediatrics, 11(4): 342–350. https://doi.org/10.1016/J.ACAP.2011.04.005.
Manning, Willard G., and Mullahy, John. 2001. “Estimating Log Models: To Transform or Not to Transform?” Journal of Health Economics, 20(4): 461–494. https://doi.org/10.1016/S0167-6296(01)00086-8.
Massachusetts Medical Society, Majeed, F. A., and The GBD 2015 Obesity Collaborators. 2017. “Health Effects of Overweight and Obesity in 195 Countries over 25 Years.” The New England Journal of Medicine, 377(1): 13–27.
Moolgavkar, Suresh H., Chang, Ellen T., Watson, Heather N., and Lau, Edmund C. 2018. “An Assessment of the Cox Proportional Hazards Regression Model for Epidemiologic Studies.” Risk Analysis, 38(4): 777–794. https://doi.org/10.1111/risa.12865.
Mullahy, John. 1998. “Much Ado about Two: Reconsidering Retransformation and the Two-Part Model in Health Econometrics.” Journal of Health Economics, 17(3): 247–281. https://doi.org/10.1016/S0167-6296(98)00030-7.
Must, Aviva, and Strauss, R. S. 1999. “Risks and Consequences of Childhood and Adolescent Obesity.” International Journal of Obesity, 23(S2): S2–11. https://doi.org/10.1038/sj.ijo.0800852.
Nelson, Melissa C., Story, Mary, Larson, Nicole I., Neumark-Sztainer, Dianne, and Lytle, Leslie A. 2008. “Emerging Adulthood and College-Aged Youth: An Overlooked Age for Weight-Related Behavior Change.” Obesity, 16(10): 2205–2211. https://doi.org/10.1038/oby.2008.365.
Neovius, Kristian, Rehnberg, Clas, Rasmussen, Finn, and Neovius, Martin. 2012. “Lifetime Productivity Losses Associated with Obesity Status in Early Adulthood.” Applied Health Economics and Health Policy, 10(5): 309–317. https://doi.org/10.1007/bf03261865.
“NHIS – Singleton PSU Reference Information.” 2019.
Oster, G., Thompson, D., Edelsberg, J., Bird, A. P., and Colditz, G. A. 1999. “Lifetime Health and Economic Benefits of Weight Loss among Obese Persons.” American Journal of Public Health, 89(10): 1536–1542. https://doi.org/10.2105/AJPH.89.10.1536.
World Health Organization. 1995. Physical Status: The Use and Interpretation of Anthropometry: Report of a WHO Expert Committee. Geneva: World Health Organization.
Prentice, A. M., and Jebb, S. A. 2001. “Beyond Body Mass Index.” Obesity Reviews, 2(3): 141–147. https://doi.org/10.1046/j.1467-789x.2001.00031.x.
Schell, Robert C., Just, David R., and Levitsky, David A. 2020. “Predicted Lifetime Third‐Party Costs of Obesity for Black and White Adolescents with Race‐Specific Age‐Related Weight Gain.” Obesity, 28(2): 397–403. https://doi.org/10.1002/oby.22690.
Sonnega, Amanda, Faul, Jessica D., Ofstedal, Mary Beth, Langa, Kenneth M., Phillips, John W. R., and Weir, David R. 2014. “Cohort Profile: The Health and Retirement Study (HRS).” International Journal of Epidemiology, 43(2): 576–585. https://doi.org/10.1093/ije/dyu067.
Sonntag, D., Ali, S., Lehnert, T., Konnopka, A., Riedel-Heller, S., and König, H.-H. 2015. “Estimating the Lifetime Cost of Childhood Obesity in Germany: Results of a Markov Model.” Pediatric Obesity, 10(6): 416–422. https://doi.org/10.1111/ijpo.278.
Splansky, Greta Lee, Corey, Diane, Yang, Qiong, Atwood, Larry D., Cupples, L. Adrienne, Benjamin, Emelia J., D’Agostino, Ralph B., et al. 2007. “The Third Generation Cohort of the National Heart, Lung, and Blood Institute’s Framingham Heart Study: Design, Recruitment, and Initial Examination.” American Journal of Epidemiology, 165(11): 1328–1335. https://doi.org/10.1093/aje/kwm021.
Stensrud, Mats J., and Hernán, Miguel A. 2020. “Why Test for Proportional Hazards?” JAMA – Journal of the American Medical Association, 323(14): 1401–1402. https://doi.org/10.1001/jama.2020.1267.
Thompson, David, Edelsberg, John, Colditz, Graham A., Bird, Amy P., and Oster, Gerry. 1999. “Lifetime Health and Economic Consequences of Obesity.” Archives of Internal Medicine, 159(18): 2177. https://doi.org/10.1001/archinte.159.18.2177.
Tremmel, Maximilian, Gerdtham, Ulf-G., Nilsson, Peter, and Saha, Sanjib. 2017. “Economic Burden of Obesity: A Systematic Literature Review.” International Journal of Environmental Research and Public Health, 14(4): 435. https://doi.org/10.3390/ijerph14040435.
Trogdon, Justin G., Finkelstein, Eric A., Feagan, Charles W., and Cohen, Joel W. 2012. “State- and Payer-Specific Estimates of Annual Medical Expenditures Attributable to Obesity.” Obesity, 20(1): 214–220. https://doi.org/10.1038/oby.2011.169.
Tucker, Daniel M. D., Palmer, Andrew J., Valentine, William J., Roze, Stéphane, and Ray, Joshua A. 2006. “Counting the Costs of Overweight and Obesity: Modeling Clinical and Cost Outcomes.” Current Medical Research and Opinion, 22(3): 575–586. https://doi.org/10.1185/030079906X96227.
Wang, Li Y., Denniston, Maxine, Lee, Sarah, Galuska, Deborah, and Lowry, Richard. 2010. “Long-Term Health and Economic Impact of Preventing and Reducing Overweight and Obesity in Adolescence.” Journal of Adolescent Health, 46(5): 467–473. https://doi.org/10.1016/J.JADOHEALTH.2009.11.204.
Wang, Y. Claire, Pamplin, John, Long, Michael W., Ward, Zachary J., Gortmaker, Steven L., and Andreyeva, Tatiana. 2015. “Severe Obesity In Adults Cost State Medicaid Programs Nearly $8 Billion In 2013.” Health Affairs, 34(11): 1923–1931. https://doi.org/10.1377/hlthaff.2015.0633.
Wee, Christina C., Phillips, Russell S., Legedza, Anna T. R., Davis, Roger B., Soukup, Jane R., Colditz, Graham A., and Hamel, Mary Beth. 2005. “Health Care Expenditures Associated with Overweight and Obesity among US Adults: Importance of Age and Race.” American Journal of Public Health, 95(1): 159–165. https://doi.org/10.2105/AJPH.2003.027946.
Williams, P. T., and Wood, P. D. 2006. “The Effects of Changing Exercise Levels on Weight and Age-Related Weight Gain.” International Journal of Obesity, 30(3): 543–551. https://doi.org/10.1038/sj.ijo.0803172.
Yang, Zhou, and Hall, Allyson G. 2007. “The Financial Burden of Overweight and Obesity among Elderly Americans: The Dynamics of Weight, Longevity, and Health Care Cost.” Health Services Research, 43(3): 849–868. https://doi.org/10.1111/j.1475-6773.2007.00801.x.
Table 1 Derivation of total costs after discounting.

Table 2 Regression results for males.

Table 3 Predicted costs at different ages for normal weight males (before discounting).

Table 4 Predicted costs at different ages for males with obesity (before discounting).