1. Introduction
Why do economies transition from millennia of near-zero income growth to modern income growth rates? Leading theories of long-run growth attempt to understand development through one of two mechanisms. Literature following Becker et al. (1990) and Galor and Weil (2000) theorizes that the central mechanism is the substitution from child quantity to child quality, which jointly explains the growth transition and the demographic transition. In parallel, literature following Hansen and Prescott (2002) and Lucas (2004) theorizes that the central mechanism is structural transformation, which jointly explains the growth transition and urbanization.
But these mechanisms are not substitutes. The incentives for quantity-quality substitution differ between urban and rural areas, and structural transformation alone cannot explain the rapid acceleration of economic growth. I propose a unifying theory which features both mechanisms and endogenously reproduces the timing and magnitude of the three transitions: growth, urbanization, and demographics. Only by considering these transitions jointly can this theory predict the following observations: a declining urban-rural wage gap, a declining rural-urban family size ratio, and that early urbanization slows development. The third prediction, that urbanization is not a panacea for growth, is a result of high preindustrial urban child mortality and is novel in this literature.
The association of early urbanization with lower modern incomes is novel to applied theories of very long-run growth and demographics,Footnote 1 and is closely related to the empirical literature's concept of the “reversal of fortune”, whereby high preindustrial income is associated with slower growth. Indeed, Acemoglu et al. (2002) specifically use urbanization as a proxy for early income and show it is negatively correlated with modern income levels for former colonies. Their explanation for this relationship is the colonial transmission of institutions. This contrasts with the present paper in two ways. First, I am concerned with the relationship between growth and urbanization directly, not just as a proxy for income. Second, the explanation in this paper is that high child mortality in early urban centers disincentivizes human capital investment. This is an independently important effect, which I demonstrate in section 2.2 by showing that early urbanization is associated with delayed growth, even when controlling for the alternative explanations of colonial history and geographic factors [e.g., Diamond (1998), Acemoglu et al. (2005), or Nunn and Qian (2011)]. In the long run, urbanization is beneficial for growth; a large literature supports this, and it is true in this paper's model as well. However, the factors that cause preindustrial civilizations to be more urbanized are associated with delayed transitions to modern growth.
The model economy has two sectors.Footnote 2 Human capital growth drives production to shift out of the rural sector, which has diminishing returns to scale.Footnote 3 The higher returns to scale of the urban sector increase the income growth associated with any rate of human capital growth.
Households choose how much time to work in the market, how much time to spend raising children, and how much time to spend investing in their children's human capital, in the spirit of Becker (1960). As the child mortality rate improves, the household can afford a higher quantity and quality of children. Increasing the number of children increases the cost of investing a unit of human capital in each child [as in Becker and Lewis (1973)], so parents reduce fertility and spend more time on human capital investment. At high mortality levels, households have more net children as they become less costly. But as child mortality falls further, the income effect dominates the substitution effect, so households shift from child quantity to child quality.Footnote 4 As families choose fewer children and more investment per child, per capita human capital grows faster and faster. Per capita income growth rises from near-stagnation to modern levels.
Urban households suffer higher child mortality than rural households, so the relative wage in urban areas is high because households must be compensated for moving to the deadly city.Footnote 5 As human capital grows, increased knowledge reduces mortality. Declines in the difference between urban and rural mortality reduce the wage premium needed to induce households to live in an urban area, enabling further urbanization.
A large branch of the unified growth literature considers the quantity-quality trade-off to be the central mechanism behind the growth transition. The motivation for this hypothesis is generally the correlation between the growth transition and the demographic transition. Becker et al. (1990) first analyze the quantity-quality trade-off in the context of an endogenous growth model; Lucas (2002) introduces land as a fixed factor, allowing for either a Malthusian or modern growth outcome. Galor and Weil (2000) model fertility as increasing when workers escape their subsistence consumption constraint and work fewer hours; households then substitute toward quality as returns to education rise. Galor and Moav (2004) introduce physical capital to the framework and study inequality during the transition. Doepke (2004) considers a two-sector model with a child quantity-quality decision, where education subsidies and especially child labor regulation can influence a country's transition timing. Empirical evidence supports the quantity-quality substitution during industrialization, for example in Prussia [Becker et al. (2010)], in the American South [Bleakley and Lange (2009)], and across the developing world in the 20th century [Chatterjee and Vogl (2018)].
The quantity-quality decision is governed by the return to human capital, which changes over the transition period. Some authors hypothesize that this return changes due to level effects in technology or growth. For example, Galor and Moav (2002) assume a complementarity between education and the technological growth rate, while Doepke (2004) assumes that an increase in the level of skill-intensive technology increases the return. Other hypotheses include capital-skill complementarity; Fernandez-Villaverde (2001) finds that capital-specific technological change can explain more than 50% of England's growth and demographic transitions.
I assume a different channel: declining child mortality increases the return to human capital investment, driving the quantity-quality substitution. This joins a growing literature arguing that child mortality improvements are central to the transition to modern growth. The exact mechanism, child mortality's effect on the return to human capital, differs from other papers in this literature. For example, Kalemli-Ozcan (2002, 2008) shows that reductions in child mortality induce substitution from child quantity to quality by reducing the precautionary motive to have many children. Bhattacharya and Chakraborty (2017) find that mortality improvements can speed the adoption of modern contraception, which is complementary to substituting towards child quality. Ehrlich and Lui (1991) argue that reductions in mortality raise the incentive for parents to invest in their children's human capital, as longevity improvements raise the value of future old-age support from their offspring. Other papers suggesting that child mortality improvements drive fertility declines include Eckstein et al. (1999), Lagerlof (2003), Hazan and Zoabi (2006), and Bhattacharya and Chakraborty (2012).Footnote 6
The hypothesis that child mortality is fundamental to the growth transition is not without controversy. Galor (2011, Chapter 4) rejects this channel on theoretical grounds. Using a static model of consumption and fertility choice, he shows that if the household has balanced growth compatible preferences, declines in child mortality rates should not affect fertility and will only increase the number of surviving children. Doepke (2005) and Strulik (2017) reach a similar conclusion. However, the model described in section 3 departs from this conclusion: when preferences are dynastic and households invest in each child's human capital, child mortality matters even with balanced growth compatible preferences. Galor also rejects the child mortality channel on empirical grounds, given that mortality in England declined significantly during the 18th century, over a hundred years prior to the demographic transition, without an associated decline in fertility. This is true of the crude death rate, but the relevant measure is the child mortality rate, which Wrigley and Schofield (1983) document as not declining significantly over the same period (Figure 1).
The second set of theories focuses on structural transformation as the cause of the growth transition. The motivation for this hypothesis is generally the correlation between the growth transition and urbanization. Hansen and Prescott (2002) consider an economy where only one sector uses land as an input and its output is perfectly substitutable with that of a constant returns sector. Given exogenous population and technological growth, the economy transitions from a Malthusian regime where only the land-intensive sector operates to a modern regime where both operate. Lucas (2004) examines an endogenous growth model in which urban locations have increasing returns to scale in human capital as workers exchange ideas and learn from each other. Growth drives structural transformation out of agriculture due to the presence of a fixed factor, land.Footnote 7 Agriculture makes up the majority of employment in preindustrial Europe [Allen (2000)], so structural transformation out of agriculture leads to urbanization if agriculture is not entirely replaced by rural non-agricultural industries. Economic growth can lead to either technological or preference-driven structural transformation, but the formal model in this paper considers technological structural transformation, motivated by evidence from Kuznets (1966), Maddison (1980), and Baumol et al. (1985), among many others.Footnote 8
The intersection of these two broad growth literatures (quantity-quality substitution and structural transformation) is limited. The present paper argues that the intersection is important for understanding long-run transitions and that the interaction between the two forces generates effects that cannot be observed when they are considered independently. Few papers populate this intersection, but in an important related paper, Baudin and Stetler (2018) also consider a growth model with urban and rural differences in demographic decisions; they use the framework to show that migration costs can slow an economy's transition and increase urban-rural inequality.
The remainder of this paper is organized as follows: section 2 describes the empirical patterns, section 3 describes the model environment, section 4 defines equilibrium and characterizes several properties, section 5 outlines the calibration procedure and simulation results, section 6 considers the model under alternative calibrations and examines the empirical implications, and section 7 concludes.
2. Empirical patterns
Figure 1 plots the three transitions in England from 1295 CE. Before the industrial revolution, real income growth is consistently <1%. The urban share of people is <10%. Fertility and mortality rates are high. Then, since 1800, all of these series transition to modern values. This joint transition is an empirical regularity: among large countries with a thousand years of urbanization and income estimates, there is no evidence of a sustained transition for income growth, urbanization, fertility, or mortality before 1800.Footnote 9
Moreover, these transitions occur around the same time within a country. This is well-known, but to illustrate, I calculate the first year that each country surpasses a benchmark level for each series: (a) 1% annual income growth, (b) 50% urban, (c) total fertility rate below 3, and (d) under-five child mortality below 5%. Table 1 reports the correlation table for these transition years.Footnote 10 Countries that experience an early growth transition also tend to urbanize early and have fertility and mortality fall early. This correlation is also observable in the current cross-section. Table 2 reports the percentage of countries surpassing the urbanization and demographic benchmarks for two income groups. Countries with 2012 GDP per capita of at least $10,000 are broadly urban with low fertility and low mortality. Countries with GDP per capita below $1,000 tend to be rural with high fertility and high mortality.
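The benchmark calculation described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `first_crossing_year` and the series below are hypothetical and made up for the example, and treating "surpasses" as a strict inequality is an assumption.

```python
from typing import Optional, Sequence


def first_crossing_year(
    years: Sequence[int],
    values: Sequence[float],
    threshold: float,
    below: bool = False,
) -> Optional[int]:
    """Return the first year in which the series passes the benchmark.

    With below=False the series must rise above the threshold (income
    growth, urban share); with below=True it must fall below it
    (fertility, child mortality). Returns None if it never does.
    Strict inequalities are an assumption of this sketch.
    """
    for year, value in zip(years, values):
        crossed = value < threshold if below else value > threshold
        if crossed:
            return year
    return None


# Made-up series for illustration only (not data from the paper).
years = [1800, 1850, 1900, 1950, 2000]
urban_share = [0.20, 0.38, 0.55, 0.70, 0.80]
total_fertility = [5.5, 4.9, 3.6, 2.4, 1.7]

urban_year = first_crossing_year(years, urban_share, 0.50)
fertility_year = first_crossing_year(years, total_fertility, 3.0, below=True)
```

Applying the same function to each country and each series gives one transition year per benchmark, which is the input a correlation table of transition years would need.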
2.1 Urban-rural differences
The model also produces two other facts observed in the English transition: a declining urban-rural wage premium, and a declining rural-urban family size ratio. I focus on England, because of the quality of its long-run macroeconomic time series, and availability of historical urban and rural data on fertility, mortality, and wages. The model is calibrated to English data in section 5.1.
The urban-rural wage gap declines over time.Footnote 11 Williamson (1987) calculates a nominal wage gap in the 1830s for unskilled workers of 73% and a real wage gap of 46%; he estimates that the majority of the gap was compensating for high urban mortality. In contrast, D'Costa and Overman (2013) estimate an unconditional wage gap of 14% in Britain from 1998 to 2008. Conditioning on observables such as occupation and skill further reduces the gap to 2%, in line with estimates for other countries.Footnote 12
The rural-urban family size ratio declines over time. Clark (2009) estimates gross fertilities for the 15th–18th centuries that are 27% higher on farms than in London and 12% higher in other non-farm households than in London. Mortality differences led farm-dwelling fathers to have over twice as many surviving children as a Londoner, and other non-farm fathers had 70% more surviving children than a Londoner. By the turn of the 20th century, Szreter and Hardy (2001, Table 20.6) estimate that rural fertilities were only 3%–5% higher than in urban areas. In modern European countries with available data, rural crude birth rates average 98% of urban rates [United Nations Statistics Division (2012), Table 9]. This pattern is documented in many countries.Footnote 13
2.2 Early urbanization predicts later transition
This prediction is unique in distinguishing this theory from other models of urbanization and long-run growth. Theories such as Hansen and Prescott (2002) or Lucas (2004) feature an urban sector with strictly greater returns to scale than the rural sector. In such a model, an economy that is parameterized to choose a higher urbanization level for a given income level will grow faster. The model presented in section 3 also has higher urban returns to scale, but an additional factor in the urban sector inhibits growth: high child mortality. If an early economy is relatively urbanized, all else equal, its high child mortality reduces the household budget set, decreasing the return to human capital investment, which delays the income growth transition. Then, over the following transition, growth and urbanization are highly correlated.
In this section, I estimate the relationship between early urbanization and transition timing using cross-country data to document how country characteristics including the preindustrial urbanization rate affect the timing of a country's transition to modern growth.
I construct growth transition years for 43 countriesFootnote 14 for which I have the relevant data in the year 1500. The transition years use the same definition as in Table 1: the first year that a 50-year moving average of income growth exceeds 1%. Then I regress the transition years $T_j$ on country characteristics in the year 1500:
where $s_{U,0,j}$ is country $j$'s initial urbanization rate, $\Delta y_{0,j}$ is its initial per capita real income growth, $n_{0,j}$ is its initial population growth, $D_j$ is a vector of country characteristics for some regression specifications, and $\varepsilon_j$ is the error term. One conclusion of the sensitivity analysis in section 6.1 is that initial income and population growth rates must be controlled for in these regressions, for they are associated with other factors that affect the transition timing, such as the productivity of human capital investment and preference for children.
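A minimal sketch of the regression that these definitions describe, assuming a linear specification. The coefficient labels $\beta_0, \ldots, \beta_3$ and $\gamma$ are introduced here for illustration and are not taken from the paper:

$$
% Hypothetical coefficient labels; variables as defined in the text.
T_j = \beta_0 + \beta_1 s_{U,0,j} + \beta_2 \Delta y_{0,j} + \beta_3 n_{0,j} + D_j^{\prime}\gamma + \varepsilon_j
$$

In this notation, the hypothesis that early urbanization delays the transition corresponds to $\beta_1 > 0$, since a larger $T_j$ means a later transition year.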
Income and population data are from The Maddison Project (2013).Footnote 15 For comparability, England's data is also from this source, instead of the superior data used in section 5.1's model calibration. Before 1820, income and population data are centennial, so in a given year (e.g., 1500) growth is the annualized rate over the preceding century. After 1820, income and population data are annual for most countries. Finally, urbanization data is from Bairoch et al. (1988) and The Clio Infra Project (2016), interpolated over gap years.Footnote 16
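The annualization of centennial observations can be sketched as below. This assumes compound (geometric) growth between the two observations, which is an assumption of this sketch rather than something the text states; the function name and the income levels are hypothetical.

```python
def annualized_growth(level_start: float, level_end: float, years: int = 100) -> float:
    """Compound annual growth rate implied by two levels `years` apart.

    Assumes geometric growth between observations (an assumption of this
    sketch). Levels must be positive.
    """
    if level_start <= 0 or level_end <= 0 or years <= 0:
        raise ValueError("levels and the year gap must be positive")
    return (level_end / level_start) ** (1.0 / years) - 1.0


# Made-up income levels for 1400 and 1500 (illustration only).
g_1500 = annualized_growth(1000.0, 1100.0)
```

Compounding `g_1500` for 100 years recovers the 10% centennial increase, so the rate is comparable to the annual growth rates available after 1820.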
Table 3 reports the baseline results in Column (1). As predicted by the model, initial urbanization predicts a later transition, while higher income and population growth predict an earlier transition. The coefficient on initial urbanization implies that an additional 10 percentage points of urbanization should delay the growth transition by 25 years, all else equal. Both the urbanization and income growth rate coefficients are significant at the 5% level or lower, but population growth is not, which is the case for almost every specification of these regressions.
t statistics in parentheses.
*p < 0.1, **p < 0.05, ***p < 0.01.
Table 3 also reports the results of several robustness checks. Column (2) reports the results with no controls, which gives a weaker relationship. Column (3) uses population density as a proxy for urbanization, in case mismeasurement of the historical urbanization rates is correlated with transition timing. But population density also predicts a later transition and the effect is significant at the 1% level. Column (4) includes a vector of geographic controlsFootnote 17 considered by Ashraf and Galor (2011). The effect of urbanization is strengthened in this regression and is significant at the 1% level. Column (5) includes continent fixed effects, which weakens the relationship, although this may be because continents are correlated with colonial status.
To demonstrate that the effect of urbanization on transition timing is independent of the colonial institution channel documented by Acemoglu et al. (2002), I next run regressions with dummies for colonial history. Specifically, I include a dummy to indicate whether countries were colonized, as well as a dummy to indicate whether countries were colonizers before the industrial revolution (Tables 4 and 5).Footnote 18 The regression in Column (6) of Table 3 includes these colonial fixed effects and estimates a larger effect than in the baseline that is significant at the 5% level. Column (7) includes both colonial fixed effects and geographic controls, demonstrating that the urbanization channel appears robust, even when controlling for both colonial and geographic explanations of reversals of fortune. Finally, Column (8) includes colonial fixed effects as well as continent fixed effects and gives a statistically significant coefficient, unlike in Column (5), where colonial status was not accounted for.
All of these regressions include dummies for countries that were colonized but vary who receives a dummy for being a colonizer. The “Expanded Colonizers” add Turkey, Germany, Russia, and the United States to the baseline set. This classification adds no additional colonies in the set of observations over the inclusion of just Turkey, so regressions with no colonizer dummy for this classification are omitted.
The year 1500 CE is used to initialize the baseline calibration in section 5.1 because it is the earliest period for which the rich Clio Infra dataset gives urbanization estimates. But the empirical effects of urbanization and income growth on transition timing can be examined for other years. Table 6 reports the baseline regression for many initial years. Urbanization slows the transition for all years, although it is not always significant, particularly in 1800 CE, as countries are approaching their transition date, or in 1000 CE when the sample is small and the data especially poor.
I also consider alternative measurements of the growth transition timing. The baseline is the first year that the annual growth trend exceeds 1%. Table 7 reports the baseline regression for other thresholds in Columns (2)–(4). Above 1.5%, the relationship is not robust, suggesting that when countries become sufficiently developed, modern factors may overwhelm the early effects. Column (5) reports the regression where transition timing is defined as the year a country passes an income threshold, rather than a growth threshold: US$5,000 in 2008. This effect is also statistically significant at the 5% level. Finally, Columns (6)–(7) report the effects of early urbanization on the demographic transitions: fertility and child mortality. As predicted, early urbanization delays the demographic transitions just as it delays the growth transition, although the effect on total fertility is only significant at the 10% level. The regressions in Columns (5)–(7) have fewer observations because several countries in the baseline sample have not yet met the relevant thresholds and so are excluded from the regressions.
3. Model
The model economy contains two production sectors: an urban sector where the only input is human capital and a rural sector with human capital and land inputs. The land is in fixed supply, but human capital grows endogenously and is the only source of growth in the model. Households have overlapping generations and parents decide the quantity and quality of their children.
3.1 Production
The rural production sector, denoted with the subscript R, combines human capital and land to produce output. Its production function is:
The rural firms are land-intensive, such as a farm, a mine, or a logger. An individual rural firm chooses human capital $\tilde{h}$ and land $\tilde{l}$.
The urban production sector, denoted with the subscript U, uses only human capital to linearly produce output. Its production function is
Urban firms are relatively less land-intensive than farms, which characterizes most of the nonagricultural sector of the economy. An urban firm might be a factory, a craftsman, or a service firm. An urban firm chooses only human capital $\tilde{h}$.
The unique final good is produced competitively by combining the output of the urban and rural sectors, with the elasticity of substitution ɛ and weighting parameter ζ
Final goods firms choose rural goods $\tilde{x}_R$ and urban goods $\tilde{x}_U$ as inputs.
Firms in all sectors are small and competitive, so they take prices as given. All sectors feature free entry of firms. Let $p_R$ denote the intermediate rural good's price and $p_U$ the intermediate urban good's price. Normalize the price of the final output good to one. Let $r$ denote the rental rate of land, $w_R$ the rural wage rate per unit of human capital, and $w_U$ the urban wage rate per unit of human capital. Then, a rural firm solves:
An urban firm solves:
A final goods firm solves:
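The three profit-maximization problems can be sketched in generic form as follows. The production function names $F_R$, $F_U$, and $F$ and the urban productivity symbol $A_U$ are placeholders introduced here for illustration; the specific functional forms are not reproduced in this text, so this is a sketch under those assumptions only.

$$
% Placeholder production functions F_R, F_U, F (hypothetical names).
\begin{aligned}
\text{Rural firm:} \quad & \max_{\tilde{h},\, \tilde{l}} \; p_R F_R(\tilde{h}, \tilde{l}) - w_R \tilde{h} - r \tilde{l} \\
\text{Urban firm:} \quad & \max_{\tilde{h}} \; p_U F_U(\tilde{h}) - w_U \tilde{h} \\
\text{Final goods firm:} \quad & \max_{\tilde{x}_R,\, \tilde{x}_U} \; F(\tilde{x}_R, \tilde{x}_U) - p_R \tilde{x}_R - p_U \tilde{x}_U
\end{aligned}
$$

Because urban production is linear in human capital, writing $F_U(\tilde{h}) = A_U \tilde{h}$ for some productivity $A_U$ implies that an interior solution with free entry requires $p_U A_U = w_U$, so the urban wage is pinned down by the urban good's price.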
3.2 Households
Individuals live for two periods: in their first period of life, they are children and in the second period they are parents.Footnote 19 Generations overlap within a household: each household consists of one parent and a number of children. The parent makes all of the household's choices, choosing consumption, the number of children, and education spending. The parent must also choose whether to live in an urban or rural area and how much time to dedicate to market work. Households do not own land; rather, I suppose that an infinitesimally small fraction of the population holds all the land and has a negligible impact on aggregate human capital and demographics. This is a useful simplification to avoid keeping track of the distribution of land ownership in addition to the other state variables of the model.Footnote 20
Utility is dynastic, formulated as in Razin and Ben-Zion (1975). Parents enjoy present consumption $c$ and their number of surviving children $n$, which for tractability is not restricted to integers. Parents also care about their dynasty's future wellbeing, represented by discounting the average utility of each child. A parent discounting by $\beta$ has utility:
where $u(c_t, n_t)$ is the period utility function, $V_t$ is the parent's dynastic utility and $\varsigma _{t + 1}^U V_{U, t + 1} + ( {1-\varsigma_{t + 1}^U } ) V_{R, t + 1}$ is the average dynastic utility of the next generation. In this expression, $\varsigma _{t + 1}^U$ is the share of the household's children who will choose to live in the urban location, $V_{U,t+1}$ is the dynastic utility of children who will choose the urban location, $1-\varsigma _{t + 1}^U$ is the share who will choose to live in the rural location and $V_{R,t+1}$ is the dynastic utility of children who will choose the rural location. Parents' preference for quantity of children is driven by their period utility, $u(c_t, n_t)$, because $V_{t+1}$ is the average utility of the children, not the total utility of the next generation.Footnote 21
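Putting these pieces together, the dynastic utility described here can be written as the following recursion. This is a reconstruction from the verbal definitions above (period utility plus the discounted average utility of the children), offered as a sketch rather than a quotation of the paper's equation:

$$
% Reconstructed from the surrounding definitions.
V_t = u(c_t, n_t) + \beta \left[ \varsigma_{t+1}^U V_{U,t+1} + \left( 1 - \varsigma_{t+1}^U \right) V_{R,t+1} \right]
$$

Note that $n_t$ enters only through $u(c_t, n_t)$: the bracketed term is an average over children, so having more children does not scale up the continuation value.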
The period utility function $u(c, n)$ increases in both arguments and must be balanced growth compatible so that as the time cost of raising children rises, it is offset by an income effect. When necessary, I assume the functional form from Barro and Sala-i-Martin (2004):
where ϕ > 0, σ < 1 and ϕσ < 1. ϕ controls the preference for consumption relative to children, while σ controls substitutability across generations: 1/(1 − σ) is the elasticity of intergenerational substitution.
Parents choose how to allocate their time to three activities: market work ($\tau_c$), producing children ($\tau_n$), and educating children ($\tau_h$). They have one unit of time to allocate to these activities:
Households in sector $j \in \{U, R\}$ earn wage $w_j$ per unit of human capital, per unit of time worked. Income is spent on consumption, so a parent with human capital $h$ working time $\tau_c$ consumes:
A household choosing time $\tau_n$ produces $n$ surviving children according to:
where parameter $\alpha$ is the productivity for producing children. $S_j$ is the fraction of newborns that survive to adulthood in sector $j$. $S_j$ is exogenous from the perspective of the household, but will depend on aggregate human capital, so it may vary over time. Child production is time-intensive, so productivity is not improved by parental human capital.
Parents produce education to increase the human capital of their children. The education produced per child headed to location $k$ is denoted by $d_k$. Households may choose to endow children going to different locations with different education levels (although they will not in equilibrium). Therefore $d_U\varsigma ^Un$ is the total education for children headed to urban locations, while $d_R( {1-\varsigma^U} ) n$ is the total education for children headed to rural locations. All child mortality resolves before parents start to invest in their children's human capital, so it is the number of surviving children $n$, rather than the gross number of children, that affects the allocation of education.Footnote 22 Total education produced is linear in the time spent educating $\tau_h$, and the productivity of parental time in producing education is proportional to parental human capital $h$. With the productivity parameter $\xi$, education is given by
A child's future human capital is increasing in the education it receives. Children are also endowed with their parents' human capital during the childrearing process. This captures the distinction that only some human capital accumulation is an economic decision (education) while other accumulation occurs naturally. Human capital accumulation is assumed to be linear, so for a child who will choose location k, its future human capital $h_k^{\prime}$ is given by
The endowment of parental human capital ensures that human capital growth is non-negative, even if households are constrained, in which case there is zero education and thus zero human capital growth. Without this lower bound on human capital accumulation, there is a potential for inescapable poverty traps and other equilibria.
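For reference, the household constraints described in this subsection can be collected as follows. These are reconstructions from the verbal descriptions (one unit of time, linear technologies, parental human capital as an endowment), so they should be read as a sketch consistent with the text rather than as the paper's exact equations; in particular, the accumulation rule $h_k^{\prime} = h + d_k$ is an assumption chosen so that zero education gives zero human capital growth.

$$
% Reconstructed constraints; the form of the last line is an assumption.
\begin{aligned}
\tau_c + \tau_n + \tau_h &= 1 && \text{(time)} \\
c &= w_j h \tau_c && \text{(consumption)} \\
n &= \alpha S_j \tau_n && \text{(surviving children)} \\
d_U \varsigma^U n + d_R \left( 1 - \varsigma^U \right) n &= \xi h \tau_h && \text{(education)} \\
h_k^{\prime} &= h + d_k && \text{(human capital accumulation)}
\end{aligned}
$$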
Combining equations (10), (11), (12), and (13) yields the combined budget constraint:
The household's time is used for consumption, education, or producing children. The total value in the numeraire of the household's time is $w_j h$. The value of time spent in the market is what they earn and spend on consumption $c$. The total cost of time spent investing in human capital is $( ( w_jn) /\xi ) ( {d_U\varsigma^U + d_R( {1-\varsigma^U} ) } )$, and the total cost of time spent producing $n$ children is $w_j h n/( \alpha S_j)$. Production of these goods is linear, so the marginal cost of producing an additional unit of human capital per child is $w_j n/\xi$ while the marginal cost of producing an additional child is $( w_j/\xi ) ( {d_U\varsigma^U + d_R( {1-\varsigma^U} ) } ) + ( w_jh/\alpha S_j)$.Footnote 23 Crucially, child mortality and education affect the marginal cost of additional children, but not the marginal cost of human capital. As child mortality improves and education rises relative to the human capital stock, the household is incentivized to substitute from child quantity towards child quality.
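The accounting in this paragraph can be checked numerically with a short sketch. It assumes the time technologies implied by the cost terms above ($\tau_n = n/(\alpha S_j)$ and $\tau_h = n(d_U\varsigma^U + d_R(1-\varsigma^U))/(\xi h)$); the function name and all parameter values are hypothetical and chosen only for illustration.

```python
def household_budget(w_j, h, n, d_U, d_R, share_U, alpha, S_j, xi):
    """Split the value of the parent's unit of time, w_j * h, into its uses.

    Assumes tau_n = n / (alpha * S_j) and
    tau_h = n * avg_d / (xi * h), as implied by the cost terms in the text.
    """
    avg_d = d_U * share_U + d_R * (1.0 - share_U)  # education per child, averaged
    tau_n = n / (alpha * S_j)                      # time producing children
    tau_h = n * avg_d / (xi * h)                   # time educating children
    tau_c = 1.0 - tau_n - tau_h                    # remaining time is market work
    if tau_c < 0.0:
        raise ValueError("choices exceed the one-unit time endowment")
    return {
        "consumption": w_j * h * tau_c,
        "education_cost": (w_j * n / xi) * avg_d,
        "child_cost": w_j * h * n / (alpha * S_j),
        "mc_quality": w_j * n / xi,                # per unit of human capital per child
        "mc_child": (w_j / xi) * avg_d + w_j * h / (alpha * S_j),
    }


# Made-up parameter values (illustration only).
params = dict(w_j=2.0, h=1.5, n=1.2, d_U=0.3, d_R=0.3, share_U=0.4, alpha=4.0, xi=3.0)
low_survival = household_budget(S_j=0.6, **params)
high_survival = household_budget(S_j=0.8, **params)
```

Comparing `low_survival` with `high_survival` shows the asymmetry the paragraph emphasizes: a higher survival rate lowers `mc_child` but leaves `mc_quality` unchanged.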
3.2.1 The households' problem
The household's problem differs depending on whether the parent was born in a rural or urban location. The rural-born households' problem is to choose location j, consumption c, children n, and the education d k and future human capital $h_k^{\prime}$ of their children who will choose location k, to maximize dynastic utility. Let Λ denote the aggregate state of the economy, and $\varsigma ^U$ the share of children choosing the urban location; then the household's Bellman equation is
subject to the human capital accumulation equations (14), budget constraint (15), location choice set $j \in \{U, R\}$, and non-negativity constraints:
Urban-born households face a simpler problem: they do not choose location. I assume there is no reverse migration from urban to rural locations in order to pin down the distribution of households.Footnote 24 The Bellman equation of an urban-born household is
subject to constraints (14), (15), and (17). In this case, the urban share of the households' children is necessarily $\varsigma ^U = 1$, given the assumption of no reverse migration.
Solving the households' problems yields the first-order conditions:
and envelope condition:
The first-order conditions (20) and (21) hold with equality when the household is unconstrained in its choice of d U and d R, respectively.
When the preferences in (9) are applied to first-order condition (19), consumption is a constant share of income:
This also implies that τ c = (1/(1 + ϕ)) is constant for all households. This result is due to the marginal cost of children being proportional to total income and the homotheticity of preferences, which is required for balanced growth compatibility. As total income w jh rises, the income effect exactly offsets the substitution effect, and households spend the same amount of time τ n + τ h on children, although they may reallocate their time between child quantity and human capital investment.
The household has a Euler equation for children choosing each location. Combining the first-order conditions (20) and (21) with the envelope condition (22) gives the Euler equation for children choosing location k:
This Euler equation reveals how child mortality affects the incentive to invest in child quality. When the future survival rate $S_k^{\prime}$ is higher, it increases the budget set in the next period by making children less costly, so that households can consume more with the same level of human capital. Thus, an increase to $S_k^{\prime}$ increases the return on education, which appears on the right-hand side of the Euler equation.
Denote human capital growth by $1 + g_k\equiv ( h_k^{\prime} /h)$. Then the Euler equation can be rewritten using the budget constraint and consumption share in terms of fertilities, human capital growth, and wages:
The left-hand side of equation (25) is marginal utility growth across generations. On the right-hand side, $\tau _c + n_k^{\prime} ( ( 1 + g_k^{\prime} ) /\xi )$ is the return to human capital investment, and ξ/n is the productivity of parental time at producing human capital for each child. The Euler equation holds with equality when households are unconstrained. The right-hand side is the marginal benefit of investing more parental time into education. This benefit decreases in n because when a household has more children, it requires more time to invest each child with a unit of education.
3.3 Aggregates and the distribution of human capital
The state of the economy is determined by the function λ(h), which denotes the measure of households with human capital h. Households are not ex ante heterogeneous; all heterogeneity in this model is captured by the distribution λ(h).
Human capital is distributed heterogeneously because dynasties live different amounts of time in different locations and households in different locations make different choices about child quantity versus quality. This heterogeneity in investment rates across locations is the central mechanism through which urbanization and income growth interact. As is often the case when savings rates are correlated with wealth, households in this model do not aggregate to a representative household, because urban and rural households invest in human capital at different rates and their levels of human capital are potentially correlated with their location. This nontrivial heterogeneity is necessary to study the interaction between the different transitions but adds complexity over other theories of long-run growth with urban and rural locations that admit representative households, such as Laitner (Reference Laitner2000), Hansen and Prescott (Reference Hansen and Prescott2002), or Tamura (Reference Tamura2002).
In particular, the distribution of human capital λ(h) is necessary to track in order to characterize the distribution of people across locations. Market clearing and optimization pin down the allocations of aggregate human capital to urban and rural sectors, but in order to map aggregate human capital stocks into shares of the population living in each location, λ(h) must be known.
The total population in the economy N is:
The measure of households with h in sector j is denoted by λ(h, j), and this is an equilibrium object because sector j is a choice. All households work τ c units of time, so aggregate human capital inputs in the economy are:
and aggregate land is L, a fixed value. Given factor prices w U, w R, r, total income in the economy is:
Let n j denote the fertility choice of a household in sector j. Let h(h′, j) denote the human capital of a household in sector j that would choose h′ for their children. The distribution of households evolves by:
which simply says that the number of households with h′ equals the number of households that chose h′ for their children, times the number of surviving children per household n j.
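The law of motion described above can be sketched on a discrete grid. This is only an illustration under stated assumptions: the function name `evolve_distribution`, the dictionary layout, and all numbers are hypothetical, and each (h, j) type is assumed to choose a single h′ and a single number of surviving children.

```python
from collections import defaultdict

def evolve_distribution(lam, policy):
    """One-generation update of the household distribution.

    lam:    dict mapping (h, j) -> measure of households with human capital h in sector j
    policy: dict mapping (h, j) -> (h_next, n), where h_next is the human capital chosen
            for the children and n is surviving children per household (illustrative inputs)
    Returns a dict mapping h_next -> measure of next-generation households.
    """
    lam_next = defaultdict(float)
    for (h, j), measure in lam.items():
        h_next, n = policy[(h, j)]
        # Each household is replaced by n surviving children, all endowed with h_next.
        lam_next[h_next] += n * measure
    return dict(lam_next)

# Hypothetical numbers, not the paper's calibration.
lam0 = {(1.0, "R"): 0.9, (1.0, "U"): 0.1}
policy0 = {(1.0, "R"): (1.02, 1.1), (1.0, "U"): (1.0, 0.85)}
lam1 = evolve_distribution(lam0, policy0)
total_population = sum(lam1.values())
```

The sketch collapses the next generation onto h′ only; assigning those children to sectors would require the equilibrium location choice, which is not modeled here.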
Child survival $S_j( {\bar{h}} )$ is a function of location j and average human capital, $\bar{h}$:
The dependence on location captures differences in child mortality across urban and rural areas. The dependence on average human capital captures the impact of the technology level on child mortality. This may come in the form of beneficial technological improvements such as clean water, food safety, and medicine.Footnote 25
Assume the function $S_j( {\bar{h}} )$ is increasing in $\bar{h}$ and has common limit for all j:
It must also be that $S_j( {\bar{h}} ) \in [ {0, \;1} ]$ for all $\bar{h} > 0$. A particular form will be estimated in section 5.
4. Equilibrium
4.1 Definition
A competitive equilibrium in this economy consists of sequences for t ≥ 0 of prices, p R, p U, w R, w U, r; aggregate allocations, Y, x U, x R, H U, H R, Z; distribution of household human capital λ(h, j); and household allocations, c(h, j), n(h, j); given an initial distribution of human capital $\lambda_0( h )$ and the aggregate quantity of land L, such that:
4.2 Equilibrium prices
The firms' profit maximization [equations (5), (6), and (7)] implies that equilibrium prices must relate to equilibrium factors by:
The urban sector has linear production, so the free entry condition ensures that w U = p U holds in equilibrium.
4.3 Equilibrium location choice
Rural-born households choose the location that gives them the highest utility. As usual, the household's value function is the maximum of the value of choosing each location. In most models, this upper envelope is not differentiable at the point of indifference. But in this model, the value function is differentiable for indifferent households.
Proposition 1: If households are indifferent between urban and rural locations in equilibrium, then their marginal value of human capital is equal in both locations.
Proposition 1 is proved in Appendix A.1. Marginal value equalization implies a convenient equilibrium condition for the wage premium. Setting the envelope condition (22) equal in both locations, and substituting for consumption by equation (23) yields:
The wage premium is a compensating differential for mortality differences. If urban child survival S U is lower than rural survival, then all else equal equation (34) will imply w U > w R. But in equilibrium, all else is not equal and urban households will change their child-rearing decision n U to partially compensate for a lower survival rate.
An implication of Proposition 1 is that all children in the same rural household receive the same education, i.e., d U = d R. This can be seen from the education first-order conditions (20) and (21); the Proposition implies that if one holds with equality, then the other must hold with equality given that children are indifferent between urban and rural locations. Therefore, for a given household, $h_U^{\prime} = h_R^{\prime}$. This does not imply that urban and rural households make the same choices; it merely implies that a rural parent allocates the same education to each of its children.
It is necessary to check that the premise that households are indifferent between locations holds in equilibrium. Urban-born households are not allowed to choose their location in order to pin down the joint distribution of population and human capital in equilibrium. If equilibrium features migration from rural to urban areas, then both types of households are indifferent between locations, and Proposition 1 holds. When households are indifferent between locations, labor demand determines the allocation of human capital across sectors. This is how dynasties with the same initial human capital may end up in different locations.
4.4 Equilibrium in the limit
In this section, I derive the asymptotic behavior of the economy. I show that the urban share approaches one, and the urban-rural wage, growth, and fertility gaps disappear.
The following propositions are proved in Appendix A.
Proposition 2: If $\lim _{t\to \infty }\bar{h} = \infty$, then the limiting urban-rural wage premium is $\displaystyle{{w_U} \over {w_R}}\to 1$.
Proposition 2 implies that the urban/rural wage gap disappears in the limit. This is because the wage gap is a compensating differential for child mortality differences, which also disappear. This does not imply the urban and rural incomes are equalized in the long run; these wages are paid per unit of human capital, not per worker. Rather, if the urban sector has more human capital per worker, then urban incomes will be higher.
Proposition 3: If $\lim _{t\to \infty }\bar{h} = \infty$, $\lim _{t\to \infty }n \ge 1$, and $\epsilon > 1$, then the long-run urban share converges to 1.
Proposition 3 implies that if the urban and rural sectors produce substitutes (i.e., ɛ > 1), then the share of the population employed in the rural sector goes to zero in the long run. This is a standard result as in Ngai and Pissarides (Reference Ngai and Pissarides2007) or Acemoglu and Guerrieri (Reference Acemoglu and Guerrieri2008). In the knife-edge case, if the final good were aggregated with a Cobb-Douglas production function (ɛ = 1), then both sectors could have non-zero shares in the long run.
Proposition 4: If $\lim _{t\to \infty }\bar{h} = \infty$, $\lim _{t\to \infty }n \ge 1$, and $\epsilon > 1$, then the limit of both urban and rural wages is $\bar{w\,}\equiv A\zeta ^{\epsilon ( \epsilon -1) }$.
Proposition 4 implies that wages, which are paid per unit of human capital, are not growing or falling in the limit. Therefore, long-run human capital growth $\bar{g}$ and children $\bar{n}$ are determined in the limit by the long-run budget constraint and the long-run steady-state Euler equation:
5. Quantitative analysis
Parameter values are chosen to match key features of the data, an initial condition is chosen to look like England in the year 1500 C.E., and the economy's equilibrium transition path is calculated.
5.1 Calibration
Ten parameters must be calibrated: production parameters A, θ, ζ, and ɛ; preference parameters ϕ, σ, and β; and household parameters α and ξ. Initial conditions must be chosen: land L and population N 0 are normalized to one. All households are initialized with h = 1. The two technology functions $S_U( {\bar{h}} )$ and $S_R( {\bar{h}} )$ must also be characterized. Finally, assume one model period is 25 years. Calibrated values appear in Table 8, chosen to resemble England in 1500 C.E. England is the calibrated country because England has historical data on urban and rural differences for fertility and mortality.
The rural production parameter θ is set to 0.74 so that the land share of farm income is 26%, the value for England in 1500 C.E. estimated by Clark (Reference Clark2010).
To calibrate the parameters (A, ζ, ɛ, α, ξ, ϕ, σ, β), I target several empirical moments. First, the initial urban share is targeted to 0.064, estimated by Bairoch et al. (Reference Bairoch, Batou and Chevre1988) for England in 1500. Initial human capital growth is targeted to 1.3%, the smoothed 25-year income growth at 1500 CE, in the Broadberry et al. (Reference Broadberry, Campbell, Klein, Overton and Leeuwen2010) data. Long run human capital growth $\bar{g}$ is targeted to 52%, England's 25-year real income growth rate since 1950.
Initial fertility and mortality rates are targeted to estimates from Clark (Reference Clark2009) for England in 1500–1800. Initial urban and rural probabilities of surviving to age 25 are S U,0 = 0.59 and S R,0 = 0.68. The initial ratio of urban to rural surviving children per adult n U,0/n R,0 is targeted to 0.77, the ratio estimated by Clark (Reference Clark2009). This pins down the initial ratio,Footnote 26 while the levels of n R,0 and n U,0 are chosen to target an initial population growth rate of 8.5% per 25 years, which matches the growth rate for England from 1400 to 1600 estimated by Broadberry et al. (Reference Broadberry, Campbell, Klein, Overton and Leeuwen2010).Footnote 27 The long-run population growth is targeted to 0%, implying $\bar{n} = 1$.
Five preference and household parameters (ϕ, σ, β, ξ, α) can be solved for jointly, given targets for human capital growth, fertility and mortality, and a target long-run 5% annual rate of return on human capital investment. The five parameters are identified by five equations: the long-run and initial rural budget constraints, long-run and initial steady-state Euler equations, and the return to human capital investment. The initial rural Euler equation is not identical to the steady-state Euler equation because there are small movements in wages and net fertilities initially, so the equilibrium value of n R,0 and g R,0 will not exactly match the targets.
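Because the targets above mix rates per 25-year model period with one annual rate, it can help to put them on a common footing. The following is a small sketch of that conversion under standard geometric compounding; the helper names `annualize` and `compound` are hypothetical and not part of the paper.

```python
PERIOD_YEARS = 25  # one model period, as stated in the calibration

def annualize(period_rate, years=PERIOD_YEARS):
    """Convert a growth rate per model period into the equivalent annual rate."""
    return (1.0 + period_rate) ** (1.0 / years) - 1.0

def compound(annual_rate, years=PERIOD_YEARS):
    """Convert an annual rate into the equivalent rate per model period."""
    return (1.0 + annual_rate) ** years - 1.0

# Calibration targets quoted in the text, all expressed per 25-year period.
targets_per_period = {
    "initial human capital growth": 0.013,
    "long-run human capital growth": 0.52,
    "initial population growth": 0.085,
}
annual_equivalents = {k: annualize(v) for k, v in targets_per_period.items()}
return_per_period = compound(0.05)  # the 5% annual return target

for name, rate in annual_equivalents.items():
    print(f"{name}: {100 * rate:.2f}% per year")
print(f"5% annual return compounds to {100 * return_per_period:.0f}% per period")
```

Under this conversion the 52% long-run target corresponds to roughly 1.7% annual growth, and the 8.5% population target to about 0.3% per year.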
The initial urban-rural wage premium is implied by the indifference equation (34). Chosen empirical targets imply an initial premium of (w U,0/w R,0) = 1.23. The initial urban share, normalization of h = 1, and market time of τ c = (1/(1 + ϕ)) imply initial supplies of human capital H R,0 and H U,0. Setting the ratio of marginal products equal to the initial wage premium identifies the weighting parameter ζ in the production function, conditional on a choice of the elasticity of substitution ɛ. Targeting long-run wage $\bar{w} = 1$ then implies a value for TFP A.
The child survival function $S_j( {\bar{h}} )$ requires a functional form. This function should have four properties: $S( {\bar{h}} ) \in ( {0, \;1} )$ for all $\bar{h} \ge 0$, $S_j( {{\bar{h}}_0} )$ matches the target for S j,0, ${S}^{\prime}( {\bar{h}} ) > 0$ for all $\bar{h} \ge 0$, and $S_j( \infty ) = \bar{S}$ so that in the long run, survival approaches a chosen limit. A form satisfying these properties is:
This is a transformed logistic CDF, which is chosen for parsimony as it is governed by only one free parameter υ, and also for having a positive limit as $\bar{h}\to 0$. It satisfies the other desired conditions: when $\bar{h} = \bar{h}_0$, then $S_j( {\bar{h}} ) = S_{j, 0}$; $S_j^{\prime} ( {\bar{h}} ) > 0$; and in the limit as $\bar{h}\to \infty$, then $S_j( {\bar{h}} ) \to \bar{S}$.
The function is estimated on England's child mortality time series, given the targets for S j,0 and $\bar{S}$. Appendix B describes this estimation.
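To make the four listed properties concrete, here is one logistic-type curve that satisfies them. It is an illustrative assumption, not the paper's estimated equation: the functional form, the long-run limit `S_BAR`, and the curvature `UPSILON` are hypothetical, and only the initial survival targets 0.59 and 0.68 come from the text.

```python
import math

def survival(h_bar, s0, s_bar, upsilon, h_bar0=1.0):
    """Illustrative logistic-type child survival function (not the paper's exact form).

    Scales a logistic curve so that survival(h_bar0) == s0 and survival -> s_bar
    as h_bar grows. Requires 0 < s0 < s_bar <= 1 and upsilon > 0.
    """
    b = s_bar / s0 - 1.0  # pins down the initial condition at h_bar0
    return s_bar / (1.0 + b * math.exp(-upsilon * (h_bar - h_bar0)))

S_BAR = 0.99             # assumed long-run survival limit (hypothetical value)
UPSILON = 0.5            # assumed curvature parameter (hypothetical value)
S_U0, S_R0 = 0.59, 0.68  # initial survival targets quoted in the text

grid = [0.0, 1.0, 2.0, 5.0, 20.0]
urban = [survival(h, S_U0, S_BAR, UPSILON) for h in grid]
rural = [survival(h, S_R0, S_BAR, UPSILON) for h in grid]
```

With these choices both curves are strictly increasing, stay inside (0, 1), remain positive at zero average human capital, and share the same limit, so the urban-rural survival gap closes as average human capital grows.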
The final parameter to calibrate is the elasticity of substitution ɛ. The elasticity of substitution controls the speed of urbanization as aggregate human capital grows. Figure 2 plots the transition year for urbanization and for income growth as a function of ɛ. A higher value of ɛ speeds the urbanization transition by making urban and rural sectors more substitutable: given a decline in the wage premium, more human capital will shift into the urban sector. But a higher value of ɛ also decreases growth: there are more urban households, which face lower child survival rates and spend less time investing in human capital for their children (see section 5.2).Footnote 28 The dashed lines are the empirical transition years. The elasticity of substitution is selected to minimize the mean squared error between the model and empirical transition years.
5.2 Results
The economy is initialized in 1500 and is run 21 periods to 2000. The economy begins with most of the population in the rural sector. As the population grows and human capital accumulates, households move to the urban sector (Figure 3). The simulated urban share surpasses 50% in the year 1846, versus the empirical urban share which reached 50% around 1863. In the long run, the population fully urbanizes.
As mortality falls, surviving children become cheaper. But increasing the number of children increases the cost of investing a unit of human capital in each child. So, parents reduce fertility and spend more time on human capital investment. Quantitatively, fertility falls more than one for one with the decrease in cost for unconstrained households, so surviving children fall and households substitute from quantity to quality. The income per household grows slowly at first, but eventually rises, asymptoting to the long-run value (Figure 4).
To understand the dynamics of the two sectors, Figure 5 plots the Euler equation in (25) in a steady state:
For the steady-state Euler equation, children choose the same location as their parent. τ c + n ss((1 + g ss)/ξ) is the return on human capital and ξ/n ss is the productivity of parental time in producing a unit of human capital for each child. With calibrated parameter values, the steady-state Euler equation implies that g ss decreases in n ss for $g\in ( {0, \;\bar{g}} ]$.Footnote 29 In this region, households will always trade-off child quantity for quality, and never increase both. Thus, an expansion in the household's budget set caused by declines in child mortality will induce substitution from quantity to quality even though quantity has become cheaper: child quantity is a Giffen good.
To understand this effect, Figure 5 also plots the normalized budget constraint, which divides the budget constraint (15) by total income:
This budget constraint is plotted for three different survival levels: $\bar{S}$, S R,0, and S U,0. The steady-state Euler equation differs slightly from the equilibrium Euler equation for initial urban or rural households, but this figure is a useful approximation for understanding the dynamics. The Euler Equation represents the set of points for which a household's indifference curve over child quantity and quality would be tangent to a budget constraint. So as the rural survival rate improves, the rural budget constraint shifts towards the long-run budget constraint, and the rural allocation moves down the Euler equation, shifting from quantity towards quality. In Figure 5, this corresponds to a movement from point B to point C.
The initial urban budget constraint does not intersect the Euler equation: urban households are constrained at g = 0 (point A) so that the non-negativity constraint (17) is satisfied. As the urban survival rate improves, the urban budget constraint shifts towards the rural budget constraint (point B), and children increase. When the survival rate has improved sufficiently to unconstrain urban households, they follow the rural households and substitute from quantity to quality (point C). Urban households follow the quantity-quality substitution pattern of rural households with a lag because their budget sets for child quantity and quality are strictly smaller. A consequence of this delay is that urban households invest in less human capital than rural households. This pattern contrasts with empirical evidence that urban workers in preindustrial England tended to be more literate than rural workers.Footnote 30
Figure 6 plots the ratio of urban to rural values for three quantities: wages, children, and survival, exhibiting the predictions from section 2.1. As human capital grows, urban and rural survival rates both grow towards the same limit, so the ratio rises to one. The urban-rural wage ratio is the compensating differential for mortality differences. Williamson (Reference Williamson1987) estimates this ratio as 1.46 in the early 1800s, versus 1.05 in the model in 1800 and 1.22 in 1500. Survival is initially lower in urban areas, so a high wage premium is necessary to make households indifferent between locations. As the survival ratio rises to one, the wage ratio falls to one, and the compensating differential disappears in the limit. Urban households initially choose fewer children than rural households because they are constrained at g = 0 and urban children are very expensive due to their low survival rate. As the survival rate improves, the urban-rural child ratio rises as urban households have more children and rural households substitute from quantity to quality. Eventually, the urban households become unconstrained and also substitute towards quality. The ratio approaches one in the long run, as the survival differential disappears.
While the urban-rural family size ratio increases from the initial period to the long run, fitting the empirical pattern in section 2.1, it is not monotonic over the whole sample, which may not be true in the data. This non-monotonicity is because urban households choose higher fertilities than rural households, to compensate for high child mortalities. This fertility difference is true empirically in the modern day, but not during the 19th or early 20th centuries. To explain the fertility ratio over this period, the theory needs further urban-rural differences, such as the cost of raising children in the city, or higher urban returns to human capital [Becker (Reference Becker1981), Chapter 5].
In the aggregate, fertility and mortality fall as the economy urbanizes and transitions to modern growth. Figure 7 plots births, deaths, and the net growth of each cohort. Births are calculated before accounting for the fraction 1 − S j that does not survive to adulthood. Total births slowly start to decline with child mortality, as fewer newborns are necessary to produce a given surviving child. In the long run, the birth rate falls to the limiting population growth rate, because child mortality disappears. Similarly, the death rate falls to one in the long run: all adults die every period, and all children live. The difference of these series is the population growth rate which falls to zero in the long run, just as the net number of children produced by each household falls to one.
The child mortality rate is plotted versus the smoothed mortality rate in Figure 8. Mortality falls, albeit not as abruptly as in the data. In contrast, the model's time path for fertility does not match the empirical path nearly as well. This is because, in the model, fertility and child survival determine the population growth rate and the model is calibrated to match the population growth. But in reality, there are other factors (e.g., adult mortality, migration, gender balance) that cause the total fertility rate to move independently of child survival and population growth. As a result, empirical fertility is above 4 in 1500 CE (Figure 1) while initial fertility in the model is just above 3 (Figure 7; the model's total fertility rate is double the birth rate).
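The model's initial fertility level can be approximated from the calibration targets alone. The sketch below is a back-of-the-envelope check, not the equilibrium computation: it assumes that average surviving children per adult equal one plus the population growth target and that births per adult equal surviving children divided by the survival probability. Equilibrium values will differ slightly, since the model does not hit the targets exactly.

```python
urban_share = 0.064      # initial urban share target
ratio_n = 0.77           # target n_U / n_R (surviving children per adult)
pop_growth = 0.085       # target population growth per 25-year period
S_U, S_R = 0.59, 0.68    # initial survival probabilities to age 25

# Assumption: average surviving children per adult equals 1 + population growth.
n_R = (1.0 + pop_growth) / ((1.0 - urban_share) + urban_share * ratio_n)
n_U = ratio_n * n_R

# Births per adult are surviving children divided by the survival probability.
births_R = n_R / S_R
births_U = n_U / S_U
births_avg = (1.0 - urban_share) * births_R + urban_share * births_U

# The text states that the total fertility rate is double the birth rate.
tfr = 2.0 * births_avg
print(f"n_R={n_R:.3f}, n_U={n_U:.3f}, births per adult={births_avg:.3f}, TFR={tfr:.2f}")
```

Under these assumptions the implied total fertility rate is a little above 3, in line with the model value reported for Figure 7 and below the empirical level of more than 4.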
6. Model sensitivity
What impacts do the initial conditions have on the equilibrium dynamics? Subsection 6.1 considers the effect of changing initial calibration targets on the transition timing. Subsection 6.2 examines the relationship between urbanization and income over the transition.
6.1 Sensitivity to initial characteristics
The transition timing is sensitive to the initial calibration targets. In particular, three targets have large effects: the initial urban share, the initial human capital growth rate, and the initial population growth rate.
First, I vary the initial urban share target while holding constant the other targets. Varying the initial urban share chiefly operates through production parameters. In general, a change to a calibration target will not have an effect on just a subset of parameters. But the urban share's effects on calibration are relatively straightforward. Raising the initial urban share requires increasing ζ, the weight on urban goods in the final production sector, and decreasing TFP A, to keep the long-run marginal productivity of human capital constant. The elasticity of substitution ɛ is kept constant, for this parameter is identified off of the transition timing. There are small changes to household parameters, which must be adjusted to keep initial population growth at the target level, but these changes are small because s U,0 is small.
Figure 9 plots the year that the model economy surpasses 1% income growth against the initial urban share. All other calibration targets are baseline values. As the initial urban share increases, the growth transition is delayed. Because the economy is more urban, and urban parents choose lower human capital growth for their children, the economy grows more slowly for many centuries. In the long run, the economy catches up to the baseline long-run growth target as urban mortality improves.
Next, I vary only the initial human capital growth target, which primarily affects household parameters. Increasing the initial growth target increases the necessary household productivity of human capital investment ξ, and decreases the productivity of child-rearing α. Intuitively, increasing ξ makes the household richer, but decreasing α raises the relative price of child quantity versus quality. Thus, the initial period household chooses the same initial number of children, but a higher rate of human capital growth. Of course, other parameters must have small adjustments to maintain the long-run calibration targets.
Figure 10 plots the year that the model economy surpasses 1% income growth against the initial income growth rate. Other calibration targets are unchanged from the baseline. The transition timing is very sensitive to the initial growth rate. An economy with low initial growth has poor productivity of human capital investment. This decreases the growth rate along the transition and the economy takes longer to converge to the long-run limit. Lower human capital investment has some secondary effects: urbanization is slowed, which increases income growth by shifting the population composition towards the lower mortality rural sector, but the mortality transition is also slowed for both sectors, reducing income growth.
Lastly, increasing the initial population growth target speeds the economy's transition. Higher population growth is mainly achieved by increasing the productivity of childrearing α, but with a decrease in child preference ϕ to maintain the long-run population growth. Because the initial urban households are constrained at g = 0 due to the high child mortality, they spend all of their non-market income producing children. So, an increase in α disproportionately increases initial urban children relative to rural children. It takes less time for urban households to become unconstrained, and to start substituting from child quantity to quality. The income growth transition year is plotted against the initial population growth rate, all else equal, in Figure 11. A higher population growth rate with the same household human capital growth rate speeds the income growth transition as households substitute to child quality earlier.
6.2 Urbanization and income levels
The analysis in section 6.1 suggests that all else equal, a country will have a faster growth transition if it has: 1. a lower initial urban share, 2. a higher initial income growth rate, or 3. a higher initial population growth rate. The size of these effects is estimated in section 2.2. Tests of the effect of initial conditions on transition timing support the model's predictions, yet these tests are limited by the small sample of countries with historical data before 1800 CE for all necessary variables, and by the accuracy of these historical estimates. To take advantage of more data, I next conduct a more powerful test of the relationship between early urbanization and transition timing.
In the context of the model, high initial urban shares are interpreted as reflecting high urban productivity relative to rural productivity.Footnote 31 In equilibrium, this results in a higher level of urbanization at every income level, although it may not be higher at every point in time. To illustrate, Figure 12 plots urbanization and income level for the baseline calibration and for an alternative with China's initial urban share of 0.12. At every level of income, the alternative has higher urbanization. Why? The urban-rural wage premium is the compensating differential for the urban-rural mortality ratio. And the mortality ratio falls as the country's human capital rises. Because the urban sector is more productive relative to the rural sector in the alternative calibration, more households must be urban for a given wage premium.
I use a two-stage regression approach to test whether countries with high rates of urbanization relative to income have later growth transitions, as predicted by the model. This method uses two stages because it must first construct a measurement of the long-run relationship between a country's income and urbanization, before testing the effect on transition timing. The method is not traditional two-stage least squares and does not construct an instrument.
First, I run the following panel regression, for country j in year t:
This is a regression of urban share on log income with country fixed effects. Following the same approach as in section 2, income data are from The Maddison Project (2013) while urbanization rates are constructed from The Clio Infra Project (2016).
Next, I regress the transition year T j on the estimated fixed effects:
Table 9 summarizes the 1st stage estimated country fixed effects. There are 76 countriesFootnote 32 with urbanization data before their income growth transition and 7,795 total year-country observations. The regressor in the second stage is an estimate and analytical standard errors will be incorrect, so standard errors are calculated by bootstrapping.
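The two-stage procedure with bootstrapped standard errors can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's code or data: the function names are hypothetical, every number in the data-generating block is made up, and resampling individual first-stage observations is an assumption about how the bootstrap is carried out.

```python
import numpy as np

def first_stage(country, log_income, urban_share, n_countries):
    """Within (fixed-effects) regression of urban share on log income.

    Returns the common slope and one estimated fixed effect per country.
    """
    counts = np.bincount(country, minlength=n_countries)
    mean_x = np.bincount(country, weights=log_income, minlength=n_countries) / counts
    mean_y = np.bincount(country, weights=urban_share, minlength=n_countries) / counts
    x_dm = log_income - mean_x[country]
    y_dm = urban_share - mean_y[country]
    slope = (x_dm @ y_dm) / (x_dm @ x_dm)
    return slope, mean_y - slope * mean_x

def second_stage(fixed_effects, transition_year):
    """OLS of transition year on estimated fixed effects; returns (intercept, psi)."""
    X = np.column_stack([np.ones_like(fixed_effects), fixed_effects])
    coef, *_ = np.linalg.lstsq(X, transition_year, rcond=None)
    return coef

def bootstrap_psi(country, log_income, urban_share, transition_year, reps, rng):
    """Resample first-stage observations with replacement and re-run both stages."""
    n_obs, n_countries = country.size, transition_year.size
    draws = []
    for _ in range(reps):
        idx = rng.integers(0, n_obs, size=n_obs)
        if np.bincount(country[idx], minlength=n_countries).min() < 2:
            continue  # skip draws that leave a country with fewer than two observations
        _, fe = first_stage(country[idx], log_income[idx], urban_share[idx], n_countries)
        draws.append(second_stage(fe, transition_year)[1])
    return np.std(draws, ddof=1)

# Synthetic data only: every number below is made up for illustration.
rng = np.random.default_rng(0)
n_countries, n_years, true_psi = 30, 40, 500.0
true_fe = rng.normal(0.0, 0.05, n_countries)
country = np.repeat(np.arange(n_countries), n_years)
log_income = rng.normal(7.0, 1.0, country.size)
urban_share = true_fe[country] + 0.1 * log_income + rng.normal(0.0, 0.02, country.size)
transition_year = 1900.0 + true_psi * true_fe + rng.normal(0.0, 10.0, n_countries)

slope, fe_hat = first_stage(country, log_income, urban_share, n_countries)
intercept, psi_hat = second_stage(fe_hat, transition_year)
psi_se = bootstrap_psi(country, log_income, urban_share, transition_year, 200, rng)
print(f"slope={slope:.3f}, psi={psi_hat:.1f}, bootstrap s.e.={psi_se:.1f}")
```

Re-estimating the fixed effects inside every bootstrap draw is what lets the standard error reflect the fact that the second-stage regressor is itself an estimate.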
Table 10 reports the results of the second stage regression. Countries that have a higher urbanization level conditional on their income transition later. I estimate $\hat{\!\!\psi } = 490.6$: if a country is 10 percentage points more urban conditional on income, then it will transition almost 50 years later.
Notes: Standard Errors calculated by bootstrapping 500 times over 7,795 first stage observations.
t statistics in parentheses.
*p < 0.1, **p < 0.05, ***p < 0.01.
Figure 13 plots countries' first-stage estimated fixed effects versus their transition year and the second stage regression line. Geographic patterns emerge. In the lower left are many Western and Central European Powers and their colonies, which were initially very rural and transitioned early. In the upper right are many Asian countries, including China and India, which were urban early in their development, but transitioned later.
This panel regression approach is consistent with the cross-country estimates from section 2.2: both suggest that countries relatively predisposed towards urbanization will transition to modern growth later, despite the general correlation of urbanization and income growth over time.
7. Concluding remarks
This paper has developed a unified endogenous growth model producing three simultaneous transitions: the growth transition, urbanization, and the demographic transition. The model quantitatively reproduces the timing and magnitude of England's transitions. Because the model considers growth, urbanization, and demographics jointly, it also generates three additional empirical observations: a declining urban-rural wage gap, a declining rural/urban family size ratio, and that early urbanization delays a country's transition.
The relationship between early urbanization and transition timing is an identifying feature of the model which distinguishes it from other theories of urbanization and long-run growth. I use several estimation strategies to show that the relationship between early urbanization and transition timing is robust in the historical experiences of many countries. The key mechanism in the model is the effect of high urban child mortality on human capital accumulation, suggesting that when studying long-run growth, it is essential to consider the interaction between demographic incentives and structural transformation. This finding raises further research questions. Does this channel apply to current low-income countries? Does it reverse at some point as urbanization starts to incentivize human capital accumulation when cities specialize in services and serve as a locus for the transmission of ideas? Future work can address these questions by applying and expanding on the theory in this paper.
Acknowledgement
I am grateful for helpful comments and guidance from Loukas Karabarbounis, Robert Lucas, Brent Neiman, Nancy Stokey, Harald Uhlig, participants at the University of Chicago's Capital Theory, Applied Macro, and Growth and Development Working Groups, and three anonymous referees.
Appendix A: Proofs
A.1. Proof of proposition 1
In this section, I prove that if households are indifferent between urban and rural locations in equilibrium, then their marginal value of human capital is equal in both locations.
Proof. Consider a household where all children in each future generation make the same location choices. This may be true for an individual dynasty, which is atomistic. Then the household dynastic utility (8) can be expanded into the discounted sum:
Let ${\cal J}$ denote a sequence of location choices, where ${\cal J}( t )$ is the sector chosen in period t. Substituting for the household's consumption choice, dynastic utility becomes:
Normalized human capital $h_{t+k}/h_t$ can be expressed in terms of growth rates:
substituting this expression into (A.2) gives $V_t$ in terms of sequences of wages, locations, choices of $n$ and $g$, and $h_t$. Lemma 5 (proved below) says that choices of $n$ and $g$ are independent of $h_t$. So given these sequences, the utility for a location sequence ${\cal J}$ is a function of $h$, proportional to current human capital to a power:
Now consider two different location sequences ${\cal J}$ and ${\rm {{\cal J}}^{\prime}}$. Because of the proportionality in (A.3), it is true that:
• If a household is indifferent between ${\cal J}$ and ${\cal J}^{\prime}$ for some $\hat{h}$, then
(A.4) $$V_{\cal J}( h ) = V_{{\cal J}^{\prime}}( h ) \quad \forall h > 0. $$
• If a household strictly prefers ${\cal J}$ for some $\hat{h}$, then
(A.5) $$V_{\cal J}( h ) > V_{{\cal J}^{\prime}}( h ) \quad \forall h > 0. $$
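One way to see why (A.4) and (A.5) follow from the proportionality in (A.3) is to write it with explicit constants. This is a sketch in notation that is assumed here rather than taken from the paper: $C_{\cal J}$ collects every term of $V_{\cal J}$ that does not depend on $h$, and $\rho$ denotes the power on human capital, taken to be common to all location sequences.

$$V_{\cal J}( h ) = C_{\cal J}\, h^{\rho}, \qquad V_{{\cal J}^{\prime}}( h ) = C_{{\cal J}^{\prime}}\, h^{\rho} \quad \Longrightarrow \quad V_{\cal J}( h ) - V_{{\cal J}^{\prime}}( h ) = ( C_{\cal J} - C_{{\cal J}^{\prime}} )\, h^{\rho}. $$

Because $h^{\rho} > 0$ for every $h > 0$, the sign of the difference is the sign of $C_{\cal J} - C_{{\cal J}^{\prime}}$, which does not depend on $h$. Under these assumptions, indifference at a single $\hat{h}$ requires $C_{\cal J} = C_{{\cal J}^{\prime}}$, and a strict preference at $\hat{h}$ requires $C_{\cal J} > C_{{\cal J}^{\prime}}$, so either relation extends to all $h > 0$.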
Suppose households are indifferent between urban and rural locations for some $\hat{h}$. Let ${\cal J}_U$ and ${\cal J}_R$ denote optimal location sequences for a household with $\hat{h}$ given a current period choice of urban or rural location, respectively. The household is indifferent by definition of $\hat{h}$, so $V_{{\cal J}_U}( {\hat{h}} ) = V_{{\cal J}_R}( {\hat{h}} )$. Then it follows from (A.4) and (A.5) that households are indifferent between ${\cal J}_U$ and ${\cal J}_R$ for all $h > 0$, and there is no other sequence of locations that any household strictly prefers.
This sequence indifference implies that for any ${\cal J}\in \{ {{\cal J}_U, \;{\cal J}_R} \}$:
Thus, if households are indifferent for some $\hat{h}$, the marginal value of human capital is equalized in both locations:
Lemma 5: Given a series of wages $w_{t,j}$, survival rates $S_{t,j}$, and sequence of locations ${\cal J}( t )$, a dynasty's choice of children $n_t$ and human capital growth $g_t$ is independent of its level of human capital $h_t$.
The central assumption driving this result is the homotheticity of the balanced growth compatible preferences.
Proof. The combined budget constraint (15) and equilibrium choice of consumption $c = \tau_c w_j h$ imply that the budget constraint can be normalized by dividing by $w_j h$:
and recall that $\tau_c = 1/( 1 + \phi )$ is constant. This normalized budget constraint and the Euler equation (25) jointly characterize the household's equilibrium behavior, and neither depends on the level of $h$.
A.2. Proof of proposition 2
In this section, I prove that if $\lim _{t\to \infty }\bar{h} = \infty$, then the limiting urban-rural wage premium is $w_U/w_R \to 1$.
Proof.
Suppose that $S_R = S_U$ but $w_R < w_U$. Consider the optimal rural allocations $( {c_R, \;n_R, \;h_R^{\prime} } )$ given $w_R$ and $S_R$. A household could choose to live in the urban area and, as per the combined budget constraint (15), would be able to afford the allocation $( \tilde{c}_U, \;n_R, \;h_R^{\prime} )$ where $\tilde{c}_U > c_R$. Thus, they would strictly prefer the urban location and this could not be an equilibrium. Similarly, if $w_R > w_U$ then an urban household could switch to a rural location and be strictly better off. The only possible equilibrium given $S_R = S_U$ must have $w_R = w_U$.
By assumption $\lim _{\bar{h}\to \infty }S_j( {\bar{h}} ) = \bar{S}$ for all $j$. So, in the limit, it must be that $w_U/w_R \to 1$. ■
A.3. Proof of proposition 3
In this section, I prove that if $\lim _{t\to \infty }\bar{h} = \infty$, $\lim _{t\to \infty } n \ge 1$ and $\epsilon > 1$, then the long-run urban share converges to 1.
Proof.
The limits for $\bar{h}$ and $n$ imply that aggregate human capital $H = N\bar{h}$ grows in the long run: $\lim _{t\to \infty } H = \infty$.
Use the equilibrium prices in equations (32) and (33) to express the wage premium as:
Then substitute with the sectoral production functions to express the wage premium in terms of human capital inputs:
Aggregate human capital supplied is $\tau_c H$. The urban share of aggregate human capital is $s_U$. Substituting and rearranging give:
The agricultural labor share $\theta$ is between 0 and 1 by assumption, so if $\epsilon > 1$ then the left-hand side of this equation decreases in $H$, and the right-hand side decreases in $s_U$. Proposition 2 says that in the limit $w_U = w_R$, so if $H \to \infty$, the limit of the left-hand side of this equation is zero. The right-hand side is positive and decreases in the urban share for $s_U \in ( 0, \;1 )$, and
So, it must be that $s_U \to 1$. ■
A.4. Proof of proposition 4
In this section, I prove that if $\lim _{t\to \infty }\bar{h} = \infty$, $\lim _{t\to \infty } n \ge 1$ and $\epsilon > 1$, then the limit of both urban and rural wages is $\bar{w}\equiv A\zeta ^{( \epsilon /( \epsilon -1) ) }$.
Proof. Use the final good production function (4) and equilibrium prices in equations (32) and (33) to express the equilibrium urban wage as:
Substitute for intermediate inputs and express human capital inputs in terms of aggregate human capital and the urban share $s_U$:
Take the limit, given that the limits for $\bar{h}$ and $n$ imply $H \to \infty$ and Proposition 3 implies $s_U \to 1$:
Appendix B: Survival Function
In this section, I describe the estimation of the survival function. The one-parameter specification of the survival function is a transformed logistic CDF:
This function is able to hit both the initial target $S_{j,0}$ and the long-run limit $\bar{S}$. It has all the desired properties: it is strictly increasing in $\bar{h}$, is bounded by $[ {0, \;\bar{S}} ]$, and has finite limits as $\bar{h}\to 0$ and $\bar{h}\to \infty$.
The targets for $S_{R,0}$ and $S_{U,0}$ are from Clark (Reference Clark2009). I estimate the survival equation using nonlinear least squares. Child mortality data is from Johansson et al. (Reference Johansson, Lindgren, Johansson and Rosling2015), and the average income is used to approximate average human capital. Nonlinear least squares gives $\upsilon = 0.35$ when $\bar{h}_0$ is normalized to one. Figure B.1 plots England's mortality data, income, and the fitted survival function given the year's income level.
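As an illustration of one-parameter nonlinear least squares, the following Python sketch fits a single shape parameter by minimizing the sum of squared errors with a golden-section search. Everything in it is an assumption made for illustration: the functional form in `survival` is a hypothetical transformed logistic CDF (not necessarily the paper's specification), the values of `S0` and `S_BAR` are placeholders rather than the Clark targets, and the data are synthetic rather than the Johansson et al. mortality series.

```python
import math


def survival(h, upsilon, s0, s_bar):
    """Hypothetical transformed logistic CDF (an assumed form, not the paper's).

    Equals s0 at the normalized level h = 1 and tends to s_bar as h grows.
    """
    logistic = 1.0 / (1.0 + math.exp(-upsilon * (h - 1.0)))
    return s0 + (s_bar - s0) * (2.0 * logistic - 1.0)


def sse(upsilon, data, s0, s_bar):
    """Sum of squared errors between observed and fitted survival rates."""
    return sum((s - survival(h, upsilon, s0, s_bar)) ** 2 for h, s in data)


def fit_upsilon(data, s0, s_bar, lo=1e-3, hi=5.0, tol=1e-10):
    """Golden-section search over the single parameter upsilon.

    Assumes the SSE is unimodal in upsilon on [lo, hi].
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c, data, s0, s_bar) < sse(d, data, s0, s_bar):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Placeholder targets and synthetic, noise-free data (illustration only).
S0, S_BAR, TRUE_UPSILON = 0.6, 0.99, 0.35
synthetic = [(h, survival(h, TRUE_UPSILON, S0, S_BAR))
             for h in (1.0, 1.5, 2.0, 4.0, 8.0, 16.0)]

upsilon_hat = fit_upsilon(synthetic, S0, S_BAR)
print(upsilon_hat)
```

Because the synthetic data are generated from the same assumed form, the search should recover the generating parameter up to numerical precision. With real data, a library least-squares routine and standard errors would be the more natural choice.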
Appendix C: Computation
In this section, I describe my method of calculating the equilibrium. The strategy is to express the equilibrium allocation for each period $t$ as a function of the rural choice of children $n_{R,t}$, and express the next period's choice $n_{R,t+1}$ as a function of period $t$ variables. Then, an initial guess for $n_{R,0}$ is chosen, and a shooting algorithm is used to find the equilibrium value of $n_{R,0}$ and the following equilibrium allocations for all $t$.
First, it is useful to rewrite the location indifference condition (34) in terms of allocations instead of wages. This equation says that the right-hand side of the Euler equation for urban and rural households is equal. This implies that if the household is unconstrained, then the left-hand side is also equal, so substituting with equation (25) implies:
Then dividing equation (34) by equation (C.1) yields:
Next, combine equation (C.2) with the normalized budget constraint (39) to yield an equation relating $n_U$, $n_R$, $S_U$, $S_R$, and parameters:
The shooting algorithm proceeds as follows. Guess a value of $n_{R,0}$. In period $t$, $n_{R,t}$, $S_{R,t}$, $S_{U,t}$, and the distribution of human capital $\Lambda_t$ are known. In period $t = 0$, $n_{R,0}$ is a guess, and $S_{R,0}$ and $S_{U,0}$ are calculated from the initial condition for $\Lambda_0$.
(1) Numerically solve equation (C.3) for $n_{U,t}$. If the implied value of $n_{U,t}$ is infeasible, the urban households must be constrained and their Euler equation does not hold, so set $n_{U,t} = ( 1 - \tau_c )\alpha S_{U,t}$.
(2) Analytically solve the normalized budget constraints (39) for $g_{R,t}$ and $g_{U,t}$.
(3) Calculate the wage premium $w_{U,t}/w_{R,t}$ from the indifference condition (34).
(4) Numerically calculate the aggregate human capitals supplied $H_{R,t}$ and $H_{U,t}$ that are consistent with the wage ratio and the aggregate human capital supplied implied by $\Lambda_t$.
(5) Analytically calculate the wages $w_{R,t}$ and $w_{U,t}$ implied by $H_{R,t}$ and $H_{U,t}$ using the equations for equilibrium prices (32) and (33).
(6) Calculate next period's distribution of human capital $\Lambda_{t+1}$ from the law of motion (29).
(7) Use $\Lambda_{t+1}$ to calculate next period's average human capital level and find $S_{R,t+1}$ and $S_{U,t+1}$ from equation (37).
(8) Solve numerically for $n_{R,t+1}$:
(a) Express the next period's wage in location $j$ as a function of $n_{j,t+1}$ through the Euler equation (25).
(b) Express next period's human capitals supplied $H_{R,t+1}$ and $H_{U,t+1}$ as functions of $n_{R,t+1}$ and $n_{U,t+1}$, using the equations for equilibrium prices (32) and (33).
(c) Numerically find the values of $n_{R,t+1}$ and $n_{U,t+1}$ that imply values of $H_{R,t+1}$ and $H_{U,t+1}$ that are consistent with $\Lambda_{t+1}$.
(d) If $n_{U,t+1}$ is infeasible, urban households must be constrained, so repeat steps (b) and (c) assuming $n_{U,t+1} = ( 1 - \tau_c )\alpha S_{U,t+1}$.
(9) Return to step (1) for period $t + 1 \le T$.
Period $T$ approximates the long run. If the calculated long-run rural number of children $n_{R,T}$ is within tolerance $\varepsilon$ of the equilibrium long-run value $\bar{n}$, consider the equilibrium solved. Otherwise, for $n_{R, T} > \bar{n} + \varepsilon$ revise the initial guess downwards, and for $n_{R, T} < \bar{n} - \varepsilon$ revise the initial guess upwards.
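The outer loop of this procedure can be sketched in Python as a bisection on the initial guess. The sketch is illustrative only: `simulate_terminal` is a hypothetical stand-in for the per-period steps, using a toy unstable law of motion rather than the model's equations, and the update rule assumes that the terminal value is increasing in the initial guess.

```python
def simulate_terminal(n0, n_bar=1.0, growth=1.2, periods=20):
    """Toy forward simulation returning the terminal value n_T for a guess n0.

    Hypothetical law of motion n_{t+1} = n_bar + growth * (n_t - n_bar),
    standing in for the per-period steps; it is NOT the paper's model.
    """
    n = n0
    for _ in range(periods):
        n = n_bar + growth * (n - n_bar)
    return n


def shoot(simulate, n_bar, lo, hi, tol=1e-8, max_iter=200):
    """Bisect on the initial guess until n_T is within tol of n_bar.

    Assumes simulate(guess) is increasing in guess and that the
    equilibrium initial value lies in [lo, hi].
    """
    for _ in range(max_iter):
        guess = 0.5 * (lo + hi)
        n_terminal = simulate(guess)
        if n_terminal > n_bar + tol:
            hi = guess  # terminal value too high: revise the guess downwards
        elif n_terminal < n_bar - tol:
            lo = guess  # terminal value too low: revise the guess upwards
        else:
            return guess
    raise RuntimeError("shooting algorithm did not converge")


n0_star = shoot(simulate_terminal, n_bar=1.0, lo=0.5, hi=2.0)
print(n0_star, simulate_terminal(n0_star))
```

Bisection is only one way to revise the guess. It is used here because it needs nothing beyond the sign of the terminal error, which matches the upward and downward revisions described above.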