NBER WORKING PAPER SERIES
A SPATIAL KNOWLEDGE ECONOMY
Donald R. Davis
Jonathan I. Dingel
Working Paper 18188
http://www.nber.org/papers/w18188
NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
June 2012
We thank Pol Antras, Arnaud Costinot, Jessie Handbury, Walker Hanlon, Sam Kortum, Corinne Low, Ben Marx, Joan Monras, Suresh Naidu, Daniel Sturm, Eric Verhoogen, Reed Walker, David Weinstein, and seminar participants at the CESifo conference on heterogeneous firms in international trade, Columbia applied micro and international trade colloquia, NYU, Princeton IES Summer Workshop, Spatial Economic Research Centre annual conference, and University of Toronto for helpful comments on various drafts. We thank Paul Piveteau for research assistance. We are grateful to Enrico Moretti and Stuart Rosenthal for sharing their housing-price measures with us. Dingel acknowledges financial support from the Program for Economic Research at Columbia University. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.
© 2012 by Donald R. Davis and Jonathan I. Dingel. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
A Spatial Knowledge Economy
Donald R. Davis and Jonathan I. Dingel
NBER Working Paper No. 18188
June 2012
JEL No. F1, F22, J24, J61, R1
ABSTRACT
Leading empiricists and theorists of cities have recently argued that the generation and exchange of ideas must play a more central role in the analysis of cities. This paper develops the first system of cities model with costly idea exchange as the agglomeration force. Our model replicates a broad set of established facts about the cross section of cities. It provides the first spatial equilibrium theory of why skill premia are higher in larger cities, how variation in these premia emerges from symmetric fundamentals, and why skilled workers have higher migration rates than unskilled workers when both are fully mobile.
Donald R. Davis
Columbia University, Department of Economics
1038 Intl. Affairs Building
420 West 118th St.
New York, NY 10027

Jonathan I. Dingel
Department of Economics, Columbia University
420 W. 118th St.
New York, NY
1 Introduction
Cities differ markedly. They differ in size, of course. But a large city is much more than the
summation of many small towns. Larger cities have more educated populations and higher
productivity, wages, housing prices, and inequality. These differences across cities are not
external facts of nature. They are the result of hundreds of millions of individual decisions,
each made in light of different cities offering different jobs, associates, earnings, and costs of
living. What links these individual decisions to the aggregate outcomes we observe in the
cross section of cities? That is the question we address in this paper.
In the last couple of decades, theorists have focused on the role of cities as loci for the
exchange of goods as the agglomeration force in the cross section of cities. This is the “new
economic geography” launched by Krugman (1991). Recently, however, important voices
have argued that the exchange of ideas as an agglomeration force needs to take a more
central role in the discussion. Notably, Krugman (2011, pp. 5-6) writes
How can you de-emphasize technology and information spillovers in a world
in which everyone’s prime examples of localization are Silicon Valley and Wall
Street? . . . The New Economic Geography style, its focus on tangible forces, seems
less and less applicable to the actual location patterns of advanced economies.
Similarly, Glaeser and Gottlieb (2009, p. 983) write
Some manufacturing firms cluster to reduce the costs of moving goods, but this
force no longer appears to be important in driving urban success. Instead, modern
cities are far more dependent on the role that density can play in speeding the
flow of ideas.
This emphasis accords well with empirical evidence suggesting that wages are higher in
larger cities for those with occupations emphasizing cognitive and people skills rather than
motor skills and physical strength (Bacolod, Blum, and Strange, 2009). Studies also suggest
that knowledge exchanges and communication skills are more common and more valuable in
larger cities (Charlot and Duranton, 2004).
Economists have long understood that cities provide an opportunity to learn from others.
Marshall (1890) wrote that in cities the “mysteries of the trade become no mysteries; but are
as it were in the air.” A seminal formalization of this idea treats learning in a city as a pure
local externality (Henderson, 1974). But the influence of research on idea exchange as an
agglomeration force has been limited by a “black box” critique. The difficulty is that if ideas
are a pure externality, costlessly available to all in the city, then they are both evanescent in
empirical terms and close to assuming your conclusion in theoretical terms (Fujita, Krugman,
and Venables, 1999, p.4). To advance, we need models of idea exchange that, like the new
economic geography, provide explicit microeconomic foundations.
We will be considering idea exchange among heterogeneous workers. A number of recent
contributions have sought to explain differences in outcomes for skilled and unskilled workers
across cities by appealing to exogenous differences in fundamental characteristics of those
cities.1 We instead follow the example of the new economic geography: spatial heterogeneity
across cities emerges from perfectly symmetric fundamentals.2
We have emphasized that location is a choice. When individuals choose their locations
freely and optimally, a system of cities is in “spatial equilibrium.”3 Glaeser and Gottlieb
(2009) refer to spatial equilibrium as “the field’s central theoretical tool.” Similarly, Moretti
(2011) stresses the importance of spatial equilibrium as a necessary condition for thinking
about long-term spatial patterns. We note this because there is a counter tradition that uses
observed differences in movement between skilled and unskilled workers as reason to assume
that unskilled workers are immobile. In models of long-run spatial outcomes, differential
movement should be a result, not an assumption.
In this paper, we develop the first system of cities model in which costly exchange of
ideas is the agglomeration force. Our model is consistent with a broad set of established
facts about the cross section of cities. It provides the first spatial-equilibrium account of
why skill premia rise with city sizes. Our model also provides the first spatial-equilibrium
account of how variation in such skill premia may arise from symmetric fundamentals. We
provide the first explanation of why skilled workers move more than unskilled workers when
both are mobile. Our approach is sufficiently flexible that it can be adapted to address a
variety of questions about the spatial organization of activity within and between cities.
1 For example, Glaeser (2008) and Beaudry, Doms, and Lewis (2010) model skill-segmented housing markets and skill-biased housing supplies to explain spatial variation in skilled wage premia. Gyourko, Mayer, and Sinai (2006) and Eeckhout, Pinheiro, and Schmidheiny (2010) model exogenous differences in housing supply elasticities and city-level productivities, respectively.
2 We do not reject the idea that so-called “first nature” fundamental differences across locations have influenced and continue to influence population patterns. For example, Glaeser (2005) traces how the geographic advantages of the obscure Dutch trading outpost of New Amsterdam helped it become the colossus of New York City. But these are not the proximate forces that, for example, led Google to recently buy one of the city’s largest buildings. Much more likely is that Manhattan provides Google with valuable opportunities to interact with others.
3 Abdel-Rahman and Anas (2004) survey the literature on systems of cities.
Understanding the sources of differences in the cross section of cities is of considerable
importance in its own right (Glaeser, 2008; Glaeser and Gottlieb, 2009). This importance
is amplified by the fact that many fields of economics also use the cross section of cities
and regions as a laboratory for testing theories beyond the traditional bounds of urban and
regional economics.4 A clearer understanding of the forces shaping key economic patterns
in the cross section of cities will provide a stronger foundation for studies making use of this
variation.
1.1 Idea exchange
Our model of idea exchange is in the spirit of Lucas (1988). He wrote
Most of what we know we learn from other people. We pay tuition to a few of
these teachers. . . but most of it we get for free, and often in ways that are mutual
– without a distinction between student and teacher. (p.38)
We develop this in several respects. First, we make explicit that the knowledge acquired
in these exchanges is not really free. The opportunity cost is time not devoted to other
productive activities. In our model, agents choose their time allocation optimally. Second,
since much knowledge is tacit, requiring face-to-face communication, we treat cities as the
loci of learning communities.5 Third, we use a continuous distribution of heterogeneous
labor. Because what one has to offer other learners and what one can learn oneself varies
across these individuals, spatial sorting of learners into distinct cities with distinct learning
opportunities is quite natural. Finally, in addition to having learning depend on the average
ability of learners in one’s community, we have it depend as well on the mass of learners (cf.
Glaeser 1999). A solitary genius is not enough.
Our approach unites two strands of literature on the exchange of ideas. One has focused
on spatial choices of learning opportunities when knowledge spillovers are exogenous and
freely available within a city (Henderson, 1974; Black, 1999). Another has focused on choices
of learning activities within a single location of exogenous population (Helsley and Strange,
4 Recent examples include Albouy (2009) on federal taxation of nominal income, Autor and Dorn (2012) on the polarization of jobs, Beaudry, Doms, and Lewis (2010) on the introduction of computers as a technological revolution, and Nakamura and Steinsson (2011) on fiscal stimulus in a monetary union.
5 This is in line with Lucas’s observation: “What can people be paying Manhattan or downtown Chicago rents for, if not for being near other people?” (Lucas, 1988, p.39) For more on how proximity facilitates knowledge transmission, see Jaffe, Trajtenberg, and Henderson (1993), Audretsch and Feldman (2004), and Arzaghi and Henderson (2008).
2004; Berliant, Reed III, and Wang, 2006; Berliant and Fujita, 2008; Lucas and Moll, 2011).6
In our model, locational choices shape knowledge exchanges because learning opportunities
are heterogeneous and depend upon the time-allocation decisions of the learners in each
location. Our characterization of idea exchanges is simple compared to those presented in
the second strand of literature, but this allows us to tractably model endogenous exchanges
of ideas in a system of cities.
An issue that is unavoidable when considering endogenous idea exchange is how one will
treat labor heterogeneity. One possibility is to work with purely homogeneous labor, so
that all exchange is purely horizontal. A second possibility is to work with two classes of
labor, skilled and unskilled. We take heterogeneity to its limit and consider a continuum of
labor types. Common experience tells us that even PhDs from elite universities are highly
heterogeneous in their knowledge and skills. This holds a fortiori when we consider the
very wide range of labor in the economy as a whole. Moreover, we will show that this labor
heterogeneity is of considerable analytic convenience in making sense of important features
of the cross-city data.
1.2 Idea exchange and the cross section of cities
We embed our process of idea exchange in a perfectly competitive economic environment.
Inter-city trade costs are zero or infinite. Cities are sites where producers interact in order
to acquire productivity-increasing ideas. Our model features people who are heterogeneous
in a single dimension. In the core model, there are two produced goods, tradables and
non-tradables. Tradables production makes use of the underlying heterogeneity of individuals;
non-tradables production does not. By comparative advantage, as in Roy (1951), high-ability
individuals sort into the tradables sector. In the tradables sector, individuals can divide their
time between directly producing the homogeneous tradable good and raising their
productivity by exchanging ideas with others in their city who also devote time to learning.
All tradables producers find attractions in large, high-ability cities where learning
opportunities are greatest. However, congestion leads to high prices for housing and non-tradable
services. A tradable producer’s productivity gains from idea exchanges are supermodular
in own ability and a city’s learning opportunities, so tradables producers sort across cities.
6 Glaeser (1999) is an important precursor to our approach. His model specifies two locations, a city and a rural hinterland. In contrast to our approach, the fundamental difference between the two locations is exogenous, since learning is possible only in the city.
Larger cities are populated by higher-ability individuals who, in equilibrium, devote more
time to exchanging ideas. Non-tradables are produced in every city by the least able agents
who are exactly compensated for cities’ price differences in housing and non-tradables.
Our model matches a broad set of facts from the empirical literature. First, cities exhibit
substantial heterogeneity in size, as required by the literature on the city-size distribution
(Gabaix, 1999). While our model has symmetric fundamentals, it generically yields
asymmetric outcomes. Second, these size differences are accompanied by differences in wages,
housing prices, and productivity (Glaeser, 2008). Our model’s agglomeration and congestion
forces link these components together in equilibrium so that larger cities are more expensive
and more productive. Third, while there is evidence that a meaningful share of spatial wage
variation is attributable to spatial sorting of heterogeneous workers (Combes, Duranton, and
Gobillon, 2008; Gibbons, Overman, and Pelkonen, 2010; De la Roca, 2012), this sorting is
incomplete and individuals of many skill types are present in every city. The Roy-model
component of our approach yields this imperfect sorting, since there is sorting within
tradables producers but not within non-tradables producers. Fourth, people are highly mobile
in advanced economies and respond to spatial arbitrage opportunities (Borjas, Bronars, and
Trejo, 1992; Dahl, 2002; Notowidigdo, 2011). Our model follows the spatial-equilibrium
tradition in assuming zero mobility costs.
Our emphasis on labor heterogeneity naturally yields predictions about spatial variation
in wage inequality. Workers in the skilled tradables sector can raise their productivity by
exchanging ideas. In equilibrium, larger cities offer more valuable idea-exchange environments,
so higher-ability tradables producers locate there and benefit more from idea exchanges. Our
focus on the within-group heterogeneity of skilled workers matches findings that attending a
higher quality college is particularly associated with higher wages in larger cities (Bacolod,
Blum, and Strange, 2009) and that larger cities exhibit greater within-group wage dispersion
(Baum-Snow and Pavan, 2011). Making idea exchange among skilled tradables producers the
agglomeration force also links cities’ population sizes and skill premia. The spatial sorting
of skilled tradables producers yields a positive premium-population relationship.
Theoretically linking together cities, ideas, and skill premia is non-trivial. Unlike
temporal differences in wage premia, spatial differences in wage premia are disciplined by a
no-arbitrage condition. As Glaeser (2008, p.85) notes, when people are mobile, differences
in productivity “tend to show up exclusively in changes in quantities of skilled people, not
in di!erent returns to skilled people across space.” The canonical spatial-equilibrium model,
in which there are two homogeneous skill groups and preferences are homothetic, predicts
that skill premia are spatially invariant (Black, Kolesnikova, and Taylor, 2009).
In short, spatial theory lags behind the empirical evidence. Glaeser, Resseger, and Tobio
(2009, p.639) state that “we are much more confident that differences in the returns to skill
can explain a significant amount of income inequality across metropolitan areas than we
are in explaining why areas have such different returns to human capital.” We provide an
explanation by modeling cities, heterogeneous skills, and ideas.
We are not aware of a prior spatial-equilibrium model that links skill premia to cities’
sizes. Nor are we aware of a prior spatial-equilibrium model that generates spatial variation
in skill premia from symmetric fundamentals. Previous system of cities models amended the
canonical model by introducing spatial variation in fundamentals, namely skill-segmented
housing markets and skill-biased housing supplies, in order to explain spatial variation in
skill premia (Glaeser, 2008; Beaudry, Doms, and Lewis, 2010). These neoclassical models
did not relate skill premia to city sizes.
Our model’s results about city size and wage inequality are related to recent theoretical
work by Behrens, Duranton, and Robert-Nicoud (2010) and Behrens and Robert-Nicoud
(2011). These authors also focus on labor heterogeneity by using a continuum of
abilities. Their work differs in two important respects. First, they model agglomeration driven
by the exchange of goods. This emphasis potentially complements our study of idea
exchange. Second, their explanations of cross-city inequality differences stem from assuming
that laborers make irreversible one-time locational choices.7 Our model provides the first
spatial-equilibrium explanation of these phenomena.
Wage inequality and city size are strongly linked in the data. Glaeser, Resseger, and
Tobio (2009) and Behrens and Robert-Nicoud (2011) report that larger cities exhibit higher
Gini coefficients; Baum-Snow and Pavan (2011) show that they have greater overall variance
in nominal wages. In this paper, we focus on the skilled wage premium, a relative price
that captures important dimensions of wage inequality. Figure 1 demonstrates that skill
premia, measured as differences in average log weekly wages between college graduates and
high school graduates, are higher in more populous metropolitan areas.8 The scatterplot
shows substantial cross-city variation in skill premia and that a large share of this variation
7 All workers entering a city have identical abilities in these models. Upon choosing a city, workers randomly draw their productivity levels. Behrens and Robert-Nicoud (2011) note that allowing for mobility in their model “would imply that a city’s equilibrium income distribution is independent of its size.”
8 Appendix Figure A1 of Baum-Snow and Pavan (2011) also appears to suggest this relationship.
Figure 1: Skill premia and metropolitan populations, 2000
[Scatterplot of metropolitan areas, each point labeled with the area's name. Vertical axis: skill premium (0.2 to 0.8). Horizontal axis: MSA log population (11 to 16).]
Note: The skill premium is the difference in average log weekly wages between full-time, full-year employees whose highest educational attainment is a bachelor’s degree and those whose is a high school degree in a (primary) metropolitan statistical area. See appendix C for a detailed description of the data and estimation.
is explained by cities’ sizes. College wage premia range from about 47% in metropolitan
areas with 100,000 residents to about 71% in places with 10 million residents.
Prior work on spatial variation in skill premia has studied how skill premia correlate
with other city characteristics, such as the fraction of the population possessing a college
degree (Glaeser, 2008; Glaeser, Resseger, and Tobio, 2009; Beaudry, Doms, and Lewis, 2010)
or housing prices (Black, Kolesnikova, and Taylor, 2009). Table 1 shows that the positive
premium-population relationship is robust to controlling for these other characteristics.
Furthermore, the relationship does not depend on whether we measure skill premia controlling
for individuals’ observable characteristics or not. The positive correlation between cities’
population sizes and skill premia is a robust, persistent, first-order feature of the data that
requires a spatial-equilibrium explanation.9
9 Regressions for 1990 and 2007 also demonstrate a strongly positive premium-population relationship. See appendix C.2. This spatial pattern does not appear to be a temporary or disequilibrium phenomenon.
Table 1: Skill premia and metropolitan characteristics, 2000

Skill premia
  log population        0.033**    0.029**    0.036**    0.028**
                       (0.0038)   (0.0056)   (0.0046)   (0.0054)
  log rent                         0.031                 0.097**
                                  (0.036)               (0.037)
  log college ratio                          -0.029     -0.065**
                                             (0.021)    (0.019)
  R2                    0.156      0.160      0.166      0.192

Composition-adjusted skill premia
  log population        0.028**    0.030**    0.031**    0.029**
                       (0.0033)   (0.0050)   (0.0039)   (0.0049)
  log rent                        -0.015                 0.017
                                  (0.035)               (0.035)
  log college ratio                          -0.025     -0.032
                                             (0.018)    (0.017)
  R2                    0.154      0.155      0.164      0.165

Observations            325        325        325        325

Robust standard errors in parentheses. ** p<0.01, * p<0.05
Note: Each column reports an OLS regression. In the upper panel, the dependent variable is a metropolitan area’s skill premium, measured as the difference in average log weekly wages between college and high school graduates. The lower panel uses composition-adjusted skill premia. See appendix C for a detailed description of the data and estimation.
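As a rough consistency check, the population coefficient in the first column of Table 1 can be compared with the range of college wage premia quoted in the text (about 47% at 100,000 residents and about 71% at 10 million). The sketch below assumes that the quoted percentages are exponentiated log premia; the text does not state the conversion, so this is an assumption, and the function names are hypothetical.

```python
import math

# Coefficient on log population from the first column of Table 1.
SLOPE = 0.033


def pct_to_log_points(pct_premium: float) -> float:
    """Convert a percentage premium (0.47 for 47%) to log points.

    Assumption: the quoted percentages equal exp(log premium) - 1.
    """
    return math.log(1.0 + pct_premium)


def implied_log_gap(pop_small: float, pop_large: float, slope: float = SLOPE) -> float:
    """Gap in log skill premia implied by a linear fit in log population."""
    return slope * (math.log(pop_large) - math.log(pop_small))


# Premia quoted in the text for cities of 100,000 and 10 million residents.
quoted_gap = pct_to_log_points(0.71) - pct_to_log_points(0.47)
fitted_gap = implied_log_gap(100_000, 10_000_000)

print(f"gap from quoted premia: {quoted_gap:.3f} log points")
print(f"gap implied by slope:   {fitted_gap:.3f} log points")
```

Under that assumption both gaps come out at roughly 0.15 log points, so the quoted range is consistent with the estimated slope.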
1.3 Spatial equilibrium and skill patterns of migration
Our aim in this paper is to understand the spatial choices of skilled and unskilled workers as
well as the observable, heterogeneous consequences of these choices. One prominent contrast
between skilled and unskilled workers is that the skilled migrate more frequently than the
unskilled (Greenwood, 1997; Molloy, Smith, and Wozniak, 2011). Table 2 demonstrates that
prime working age US-born individuals who change residences are nearly 70% more likely
to change metropolitan areas if they hold a bachelor’s degree rather than just a high school
degree. Moreover, bachelor’s degree holders move farther when they change residences. The
typical move of a college graduate is about 80% longer than that of a high school graduate.10
Even if we compare only those who change metropolitan areas, college graduates move more
than 25% farther than high school graduates.
10 In this calculation, we assign a distance of zero to residence changes within the same public-use microdata area. See appendix C for details.
Table 2: Educational attainment and migration

                                                       High school degree   Bachelor’s degree
Different residence than five years prior                     42%                 48%
Different metropolitan area | different residence             19%                 32%
Average distance (km) | different residence                   204                 365
  Standard error                                             (0.9)               (1.4)
Average distance (km) | different metropolitan area           771                 977
  Standard error                                             (3.3)               (3.7)

Note: The sample is made up of US-born individuals ages 30–55 residing in metropolitan areas in the 2000 Census public-use microdata whose highest educational attainment is a bachelor’s degree or a high school degree. See appendix C for details.
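The relative magnitudes quoted in the text can be recomputed directly from Table 2. The following sketch uses only the table's reported figures; the dictionary keys and function name are hypothetical labels. The unconditional probability of changing metropolitan area at the end is an illustration implied by the first two rows and is not a number reported in the paper.

```python
# Figures from Table 2, as (high school degree, bachelor's degree).
TABLE_2 = {
    "moved_residence": (0.42, 0.48),
    "changed_msa_given_move": (0.19, 0.32),
    "avg_km_given_move": (204.0, 365.0),
    "avg_km_given_msa_move": (771.0, 977.0),
}


def relative_gap(high_school: float, bachelors: float) -> float:
    """Proportional excess of the bachelor's figure over the high school figure."""
    return bachelors / high_school - 1.0


gaps = {row: relative_gap(*values) for row, values in TABLE_2.items()}

for row, gap in gaps.items():
    print(f"{row}: bachelor's figure is {gap:.0%} higher")

# Illustration only (not reported in the paper): the unconditional five-year
# probability of changing metropolitan area is P(move) * P(new MSA | move).
hs_msa = TABLE_2["moved_residence"][0] * TABLE_2["changed_msa_given_move"][0]
ba_msa = TABLE_2["moved_residence"][1] * TABLE_2["changed_msa_given_move"][1]
print(f"unconditional MSA change: {hs_msa:.1%} vs {ba_msa:.1%}")
```

The conditional gaps match the text's "nearly 70%", "about 80%", and "more than 25%" statements.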
How shall we incorporate this contrast in movement of the skilled and unskilled into
our thinking about spatial patterns of activity across cities? One answer is embodied in
the Krugman (1991) core-periphery model, which translates the observation of differential
movement into an assumption of differential mobility. This assumption has been extremely
influential in subsequent work and so deserves careful attention.11 It has two key shortcomings.
The first is that if the fundamental problem that one wants to address is the spatial pattern of
economic activity, location has to be a choice, not an assumption.12 Second, since many
of these models assume that labor is homogeneous within a broad class, the assumption also has
important consequences for welfare. In particular, perfectly mobile skilled workers receive the
same utility everywhere. Perfectly immobile unskilled workers receive utility that varies by
location, but only because they are assumed unable to move.
We develop a simple dynamic extension of our model that considers costly migration
in the limit as those common costs for skilled and unskilled workers converge to zero, that
is, as we converge to full spatial equilibrium. We believe this extension provides important
advances on the prior literature. The greater rate of movement of skilled than unskilled,
as well as the greater average distance of moves by the skilled, is a result rather than an
assumption. Moreover, because we can explain the facts in spatial equilibrium, our model
does not rely on a failure of arbitrage to make sense of spatial welfare heterogeneity.13
11 See, for example, Tabuchi and Thisse (2002) and Borck, Pfluger, and Wrede (2010). Autor and Dorn (2012) also make this assumption. Helpman (1998) shows how the results of Krugman (1991) are altered by modeling the centrifugal force as housing supplies rather than immobile “peasants.”
12 Assuming immobility precludes other explanations for lack of movement, a point underscored in Notowidigdo (2011).
13 We recognize that short-run responses to economic shocks may be highly localized due to movement
2 A spatial knowledge economy
This section develops a simple spatial knowledge economy, explores the basic model’s relation
to important empirical regularities in the cross section of cities, and then extends it to a
simple dynamic model of migration and outsourcing to explore differential movement of the
skilled and unskilled.
2.1 Consumption
Individuals consume three goods: tradables, non-tradable services, and (non-tradable) hous-
ing. Services and housing are necessities; after consuming s units of non-tradable services
and one unit of housing, consumers spend all of their remaining income on tradables, which
we use as the numeraire.14 The indirect utility function, therefore, for a consumer with
income I facing prices p in city c is
\[ V(p, I) = I_c - p_{s,c}\, s - p_{h,c} \tag{1} \]
Consumers are perfectly mobile across cities and jobs, so their locational and occupational
choices maximize V (p, I).
2.2 Production
2.2.1 Housing and non-tradable services
Every system of cities model must have both agglomeration and congestion forces. Since our
contribution will focus on the force for agglomeration, we model the congestion force in the
most stripped-down way possible. Alonso (1964), Mills (1967), and Muth (1969) developed
a simple model of the internal structure of the city in which residents commute from home
to a central business district. We follow Behrens, Duranton, and Robert-Nicoud (2010) and
introduce this in a standard form.15 Each location is endowed with housing sites that serve
costs (Autor, Dorn, and Hanson, 2011). Still, we believe spatial equilibrium is the right starting point for an analysis of long-run spatial patterns, which may be stable across decades or longer. Moreover, one cannot measure the speed at which spatial arbitrage occurs without the baseline provided by a model in which such arbitrage is costless.
14 This specification, in which consumers demand a fixed quantity of non-tradables, is also found in Glaeser, Gyourko, and Saks (2006) and Moretti (2011). We use it for analytical convenience; it is not crucial to our results. See also footnote 23.
15 See appendix section A.1 for details.
as residences and that do not require any labor input. We will refer to ph,c as the consumer
price of housing in city c, but the reader should keep in mind that this incorporates both
land rents and commuting costs and is invariant across locations within a city. This yields
a simple increasing relation between housing prices, ph,c, and a city’s population, Lc, of the
form $p_{h,c} = \theta L_c^{\gamma}$, with $\theta, \gamma > 0$.
There is a mass L of workers of heterogeneous ability, indexed by z and distributed with
density µ(z). They choose to produce non-tradables or tradables. Non-tradables can be
produced at a uniform level of productivity by anyone employed in that sector. Tradables,
by contrast, make use of the underlying heterogeneity. A person’s productivity in tradables
is z(z, Zc), which is increasing in z and depends on the learning opportunities available
through interacting with others working in the tradable sector in that city, governed by Zc
(discussed below).
By comparative advantage, low-z people will specialize in producing non-tradables, which
make no use of the underlying heterogeneity, while high-z people will specialize in tradables.
Denote the marginal worker indifferent between the two sectors as zm.
We choose units of output so that an individual’s productivity in non-tradable services
is unity. Since productivity in non-tradable services is independent of individual ability, the
total output of services in a city is equal to the mass of agents working in services, Ls,c. The
income of a non-tradables producer in city c is therefore ps,c.
2.2.2 Idea exchange and tradables productivity
Tradables producers can acquire knowledge to increase their productivity. They do this by
spending time interacting with other tradables producers in their city. Each person has
one unit of time that they divide between interacting and producing. Production depends
on own ability (z), time spent producing ($\beta$), time spent exchanging ideas ($1 - \beta$), the
productivity benefits of learning (A), and local learning opportunities (Zc). Exchanging
ideas is an economic decision, because time spent interacting ($1 - \beta$) trades off with time
spent producing output directly ($\beta$). The tradables output of an agent of ability z is
\[ \tilde{z}(z, Z_c) = \max_{\beta \in [0,1]} \beta z \left(1 + (1 - \beta) A Z_c z\right) \tag{2} \]
A is a parameter common to all locations that indexes the scope for productivity gains
from interactions. When A is higher, conversations with other agents raise productivity
more. Knowledge has both horizontal and vertical differentiation. Horizontal differentiation
implies that producers can learn something from anyone. Vertical differentiation means that
they learn more from more able counterparts.
Local learning opportunities Zc are the result of a random-matching process in which
producers devoting time to idea exchanges encounter other producers doing likewise. The
expected value of devoting a unit of time to idea exchange in a city is the probability of
encountering another individual times the expected ability of the individual encountered.
The probability of encountering a person during time spent seeking idea exchanges is
m(Mc), where Mc is the total time devoted to learning by producers in the city. m(·) is
an increasing function, with $m(0) = 0$ and $m(\infty) = 1$. Like Glaeser (1999), we assume
that face-to-face interactions occur with greater frequency in denser places, so that random
matches occur more often in the central business districts of larger cities. In our setting the
population of agents available for such encounters is determined endogenously by tradable
producers’ time-allocation choices.
The expected ability of the individual encountered is zc, the weighted average of the abil-
ities of producers participating in idea exchanges. The weights are the time agents devote to
interactions.16 Conditional on meeting another learner, the scope for gains from interactions,
and one’s own ability, conversations with more talented agents are more productive.
Thus, the value of local learning opportunities Zc reflects both a scale effect and an
average ability effect. Consider city c with population ability distribution µ(z, c). When
agents of ability z in city c devote $1 - \beta_{z,c}$ of their time to exchanging ideas, the value of
idea exchange in city c is described by the following:
\begin{align*}
Z_c &= m(M_c)\, \bar{z}_c \\
M_c &= L \int_{z \ge z_m} (1 - \beta_{z,c})\, \mu(z, c)\, dz \\
\bar{z}_c &= \int_{z \ge z_m} \frac{(1 - \beta_{z,c})\, z}{\int_{z \ge z_m} (1 - \beta_{z,c})\, \mu(z, c)\, dz}\, \mu(z, c)\, dz \tag{3}
\end{align*}
This characterization of idea exchanges as mutually beneficial meetings in which each party is
both student and teacher follows Lucas (1988). The matching process that yields exchanges
16 The expression for $\bar{z}_c$ in equation (3) is not well defined when $\beta_{z,c} = 1$ for all agents. We will define $\bar{z}_c = 0$ for the case in which no one invests in learning. This is not an average, of course, but it seems an appropriate definition that reflects the absence of opportunities to learn from others. The particular (finite) value assigned to $\bar{z}_c$ when $\beta_{z,c} = 1$ for all agents is immaterial, since $m(0) = 0$ and therefore $Z_c = 0$.
means that a city’s population size and average ability both matter.
For an individual worker, the optimal time spent interacting is
\[
1 - \beta_{z,c} =
\begin{cases}
\dfrac{1}{2}\, \dfrac{A Z_c z - 1}{A Z_c z} & \text{if } A Z_c z \ge 1 \\
0 & \text{otherwise.}
\end{cases}
\]
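The interior solution can be recovered from the first-order condition of the problem in equation (2). The following derivation is a sketch, writing $\beta$ for the share of time spent producing and $a \equiv A Z_c z$ as shorthand (the symbol $a$ is introduced here only for illustration):

```latex
\begin{align*}
\frac{\partial}{\partial \beta}\Big[\beta z\big(1 + (1-\beta)a\big)\Big]
  &= z\big(1 + a - 2\beta a\big) = 0
  \quad\Longrightarrow\quad \beta = \frac{1 + a}{2a}, \\
1 - \beta &= \frac{a - 1}{2a} = \frac{1}{2}\,\frac{A Z_c z - 1}{A Z_c z}.
\end{align*}
```

The objective is concave in $\beta$ (its second derivative is $-2za < 0$ whenever $a > 0$), and the implied $1 - \beta$ is nonnegative only when $a \ge 1$. Otherwise the constraint $\beta \le 1$ binds and no time is spent interacting.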
Conditional on the population in city c, which is described by the ability distribution
µ(z, c), the equilibrium value of local idea exchanges Zc is a fixed point defined by Zc =
m(Mc)zc, since individual choices of $\beta_{z,c}$, which determine Mc and zc, depend on the city-
level Zc.
Of course, there is also an equilibrium in which Zc = 0, since no individual will allocate
time to interacting with others when there are no others with whom to interact. While the
no-learning equilibrium will not be the focus of our discussion, it does illustrate an important
aspect of the economic mechanisms. It underscores the fact that learning here is not manna
from heaven but the outcome of a costly allocation of time by those acquiring knowledge.
Thus, larger cities are better learning environments because, in equilibrium, they offer a
higher frequency of face-to-face interactions with a more talented population of partners, as
we show below.
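The fixed-point logic, including the coexistence of a learning and a no-learning outcome, can be illustrated numerically. The sketch below is not the paper's solution method. It assumes a particular matching function, m(M) = 1 − exp(−M), which satisfies the stated properties but is our choice for illustration, and it takes the abilities of a city's tradables producers as given. All parameter values and function names are hypothetical.

```python
import math

def interaction_time(z, Z, A):
    """Optimal share of time spent exchanging ideas (1 - beta) for ability z."""
    a = A * Z * z
    return 0.5 * (a - 1.0) / a if a >= 1.0 else 0.0

def implied_learning_value(Z, abilities, mass_per_type, A):
    """Given a conjectured Z, return the m(M) * zbar implied by optimal time use."""
    times = [interaction_time(z, Z, A) for z in abilities]
    total_time = sum(times)
    if total_time == 0.0:
        return 0.0  # nobody interacts: zbar is defined as 0, and m(0) = 0
    M = mass_per_type * total_time  # total time devoted to idea exchange
    zbar = sum(t * z for t, z in zip(times, abilities)) / total_time
    # ASSUMPTION: m(M) = 1 - exp(-M); the text only requires m increasing,
    # with m(0) = 0 and m(infinity) = 1.
    return (1.0 - math.exp(-M)) * zbar

def solve_fixed_point(Z0, abilities, mass_per_type, A, tol=1e-12, max_iter=10_000):
    """Iterate Z <- m(M(Z)) * zbar(Z) until successive values are within tol."""
    Z = Z0
    for _ in range(max_iter):
        Z_next = implied_learning_value(Z, abilities, mass_per_type, A)
        if abs(Z_next - Z) < tol:
            return Z_next
        Z = Z_next
    return Z

# Hypothetical city: tradables producers with abilities evenly spread on [0.5, 1].
n = 200
abilities = [0.5 + 0.5 * (i + 0.5) / n for i in range(n)]
mass_per_type = 5.0 / n   # hypothetical total mass of 5 tradables producers
A = 10.0                  # hypothetical scope for gains from interaction

Z_learning = solve_fixed_point(1.0, abilities, mass_per_type, A)
Z_no_learning = solve_fixed_point(0.0, abilities, mass_per_type, A)
print(Z_learning, Z_no_learning)
```

Starting from a conjecture of zero, no one allocates time to interaction and the iteration stays at zero. Starting from a positive conjecture, the same city settles on a positive value of local learning opportunities under these assumed parameters.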
An individual allocates her time in order to maximize her income, so she solves the
maximization problem described in equation (2). The tradable output of type z in city c
with learning opportunities Zc is
\[
\tilde{z}(z, Z_c) =
\begin{cases}
\dfrac{1}{4 A Z_c} \left(A Z_c z + 1\right)^2 & \text{if } A Z_c z \ge 1 \\
z & \text{otherwise.}
\end{cases}
\tag{4}
\]
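As a sanity check, the closed form in equation (4) can be compared with a brute-force maximization of the objective in equation (2). The following is a minimal sketch with hypothetical parameter values and function names, chosen only to exercise both branches of the expression.

```python
def output_given_beta(beta, z, Z, A):
    """Tradables output when a share beta of time is spent producing."""
    return beta * z * (1.0 + (1.0 - beta) * A * Z * z)

def closed_form_output(z, Z, A):
    """Equation (4): maximized tradables output."""
    a = A * Z * z
    return (a + 1.0) ** 2 / (4.0 * A * Z) if a >= 1.0 else z

def grid_max_output(z, Z, A, n=100_001):
    """Brute-force maximum of equation (2) over an evenly spaced grid on [0, 1]."""
    return max(output_given_beta(i / (n - 1), z, Z, A) for i in range(n))

# Hypothetical (z, Z, A) triples: the first has A*Z*z >= 1, the second does not.
for z, Z, A in [(0.8, 0.6, 10.0), (0.2, 0.3, 10.0)]:
    print(z, Z, A, closed_form_output(z, Z, A), grid_max_output(z, Z, A))
```

In the second case the producer spends all of her time producing, so output equals her ability.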
We have a few key conclusions. Tradables producers choose to engage other producers in
encounters from which they both learn. This learning takes time away from direct production
but maximizes their total output by raising their productivity. Time devoted to learning
by a tradables producer is increasing in the time devoted to idea exchange by others, the
scope for productivity gains from idea exchange, the average quality of other learners in that
location, and the producer’s own ability. Given this knowledge economy, we now characterize
the patterns of economic outcomes in spatial equilibrium.
2.3 Equilibrium
This section develops the conditions for equilibrium in our spatial knowledge economy. Con-
sumers optimally choose their city, occupation, and consumption. Tradables producers opti-
mally allocate their time between direct production and idea exchange. Prices clear markets
and the individual locational choices must be consistent with aggregate population mea-
sures. There are three types of equilibria: equilibria without idea exchange, equilibria with
symmetric cities, and equilibria with heterogeneous cities. The latter are stable, match
many empirical findings in the systems of cities literature, and will be our primary object of
interest.
An equilibrium for a population L with talent distribution µ(z) in a set of locations
{c} is a set of prices {ph,c, ps,c} and locational choices µ(z, c) such that workers optimize
and markets clear.17 Define the set of cities in which agents of ability z are found by
C(z) = {c : µ(z, c) > 0}. We can then write our equilibrium conditions as equations (5)
through (14).
Equations (5) and (6) are adding-up constraints for worker types and city populations.
\begin{align*}
\mu(z) &= \sum_c \mu(z, c) \quad \forall z \tag{5} \\
L_c &= L \int \mu(z, c)\, dz \quad \forall c \tag{6}
\end{align*}
Equation (7) defines the land-market-clearing housing price within each city.
\[ p_{h,c} = \theta L_c^{\gamma} \quad \forall c \tag{7} \]
Equation (8) equalizes demand and supply of non-tradable services within each location.
\[ L_{s,c} = L \int_{z \le z_m} \mu(z, c)\, dz = s L_c \quad \forall c \tag{8} \]
The tradables market clears by Walras’ Law.
Equation (9) characterizes the value of potential idea exchanges in each city, Zc, which
depends on scale (Mc) and average ability (zc). Equation (10) characterizes the latter, the
17 In this exposition, we define equilibrium where each member of the set {c} is populated, Lc > 0. In appendix section A.2, we describe how the number of populated locations is endogenously determined when there are many potential city locations, not all of which must be populated.
time-weighted average ability of learners in each city.
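\begin{align*}
Z_c &= m(M_c)\, \bar{z}_c = m\!\left( L \int_{z \ge z_m} (1 - \beta_{z,c})\, \mu(z, c)\, dz \right) \bar{z}_c \quad \forall c \tag{9} \\
\bar{z}_c &=
\begin{cases}
\displaystyle \int_{z \ge z_m} \frac{(1 - \beta_{z,c})\, z}{\int_{z \ge z_m} (1 - \beta_{z,c})\, \mu(z, c)\, dz}\, \mu(z, c)\, dz & \text{if } M_c > 0 \\
0 & \text{otherwise}
\end{cases}
\quad \forall c \tag{10}
\end{align*}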
Equations (11) through (14) describe agents’ optimal choices. Equation (11) says that
tradables producers allocate their time optimally between directly producing and exchanging
ideas. Equation (12) says that agents choose their occupations optimally so that the marginal
producer is indifferent between the two sectors.
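\begin{align*}
\beta_{z,c} &= \arg\max_{\beta \in [0,1]} \beta z \left(1 + (1 - \beta) A Z_c z\right) \quad \forall z \ge z_m \ \forall c \tag{11} \\
\tilde{z}(z_m, Z_c) &= p_{s,c} \quad \forall c \in C(z_m) \tag{12}
\end{align*}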
Equations (13) and (14) describe the prices consistent with spatial equilibrium. Equation
(13) says that non-tradables producers’ expenditure on tradables, which is their net income
after purchasing non-tradable services and housing, is equal across locations. Equation (14)
means that tradables producers are located in their most-preferred place.
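\begin{align*}
(1 - s)\, p_{s,c} - p_{h,c} &= (1 - s)\, p_{s,c'} - p_{h,c'} \quad \forall c, c' \tag{13} \\
C(z) &= \arg\max_c \ \tilde{z}(z, Z_c) - s\, p_{s,c} - p_{h,c} \quad \forall z \ge z_m \tag{14}
\end{align*}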
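One implication of equation (13) is worth making explicit, since the comparison of prices across cities relies on it. As a sketch, assuming $0 < s < 1$ (consistent with a skilled population share of $1 - s$), rearranging the condition gives:

```latex
\[
(1-s)\,\big(p_{s,c} - p_{s,c'}\big) = p_{h,c} - p_{h,c'}
\quad\Longrightarrow\quad
p_{s,c} - p_{s,c'} = \frac{p_{h,c} - p_{h,c'}}{1-s}.
\]
```

The gap in non-tradables prices between two cities is therefore $1/(1-s)$ times the gap in housing prices. A city with more expensive housing must pay non-tradables producers a higher nominal wage, and part of that higher wage is itself spent on the locally more expensive services.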
There are three classes of equilibria that satisfy equations (5) through (14): no-learning
equilibria in which all cities have identical aggregate characteristics; learning equilibria in
which some or all cities have identical aggregate characteristics; and learning equilibria with
heterogeneous cities.
In no-learning equilibria, no tradables producer devotes time to idea exchange because no
other tradables producer does, and $Z_c = 0\ \forall c$. Since z(z, 0) does not vary across locations, all
cities in which tradables are produced must have the same prices for housing and non-tradables
to satisfy the spatial no-arbitrage conditions (13) and (14). By equation (7), therefore, all
populated cities are the same size.
The no-learning equilibrium is not of interest for two reasons. First, it is empirically ir-
relevant. There is considerable and systematic variation in cities’ populations. Second, it is
not a stable equilibrium. Since exchanging ideas is a Pareto improvement (it raises produc-
tivity for all learners without lowering the productivity of any other agent), communication
or coordination among (a sufficiently large set of) tradables producers would facilitate its
choice.
The second type of equilibria are those in which learning occurs and some cities’ aggregate
characteristics are identical. Suppose that $L_c = L_{c'}$. Then, by equations (7) and (13),
housing prices and non-tradables prices are equal in these locations. Equation (14) requires
that $Z_c = Z_{c'}$. These cities are therefore identical in their populations, prices, and learning
opportunities.
Learning equilibria with symmetric cities are not of interest for the two reasons the no-
learning equilibrium is not. First, they are empirically irrelevant. Second, they are not stable.
When two cities’ learning environments differ at all, higher-ability tradables producers are
drawn to the better learning environment. Thus, their movement reinforces initial differences
in learning opportunities and moves the system of cities towards the asymmetric equilibrium.
See appendix section A.3 for details.
Finally, there are equilibria with heterogeneous cities, our object of interest. Equilibria
with heterogeneous cities exhibit cross-city patterns that can be established independent of
the number of cities that arise.18 Equations (5) through (14) jointly imply that larger cities
have higher housing prices, higher non-tradables prices, exhibit better learning opportunities,
and are populated by more talented tradables producers.
To understand this logic, suppose that Zc varies across cities. Housing and service prices
(ph,c + ps,cs) must be higher in locations with higher Zc, lest all tradables producers prefer
those locations, in accordance with equation (14) and violating equation (8). Equation (13)
requires that locations with greater Zc and therefore greater ph,c + ps,cs have higher ps,c so
that non-tradables producers earn higher nominal incomes in locations with higher prices.
If ph,c were not higher in cities with higher ps,c, then these locations would attract all non-
tradables producers, so cities with better learning opportunities (Zc) have both higher ps,c
and higher ph,c, which means that they are more populous, by equation (7).
Because z(z, Zc) is supermodular in z and Zc, higher-z tradables producers gain more
from locating in high-Zc locations with higher prices. As a result, tradables producers sort
across cities in equilibria with heterogeneous cities. This sorting according to ability supports
18 Since these patterns characterize all equilibria with heterogeneous cities, we do not address issues of uniqueness.
equilibrium differences in Zc.19
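The supermodularity that drives sorting can be verified directly from equation (4). The following sketch covers the region in which learning occurs ($A Z_c z \ge 1$), writing $\tilde{z}$ for tradables output:

```latex
\begin{align*}
\tilde{z}(z, Z_c) &= \frac{(A Z_c z + 1)^2}{4 A Z_c}
  = \frac{A Z_c z^2}{4} + \frac{z}{2} + \frac{1}{4 A Z_c}, \\
\frac{\partial \tilde{z}}{\partial z} &= \frac{A Z_c z + 1}{2}, \qquad
\frac{\partial^2 \tilde{z}}{\partial z\,\partial Z_c} = \frac{A z}{2} > 0 .
\end{align*}
```

The marginal return to ability rises with local learning opportunities, so a given increase in $Z_c$ is worth more to a higher-ability producer. Where $A Z_c z < 1$, output equals $z$ and the cross-partial is zero.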
If there are n locations with positive population and we label the cities by their size so
that L1 < L2 < · · · < Ln, then the stable-equilibria correspondence C(z) is increasing and
lower hemicontinuous for all z > zm. It is single-valued for all z > zm except for $n - 1$
“boundary values” of z, who are indifferent between cities c and c + 1 because the benefit
from $Z_{c+1} > Z_c$ is offset by the higher prices in c + 1.
We provide sufficient conditions for the existence of this equilibrium when n = 2 in
appendix section A.4. In short, the existence of an equilibrium with two heterogeneous
cities requires that congestion costs are sufficiently strong so that not everyone will locate
in a single city in equilibrium and that the potential productivity gains from idea exchanges
are sufficiently high that all tradables producers in the larger city will spend time learning
in equilibrium.
These equilibria with heterogeneous cities, our object of interest, are robust to perturbation.20
They match the fundamental facts that cities differ in size and these size differences
are accompanied by differences in wages, housing prices, and productivity (Glaeser, 2008).
Empirically, larger cities exhibit higher nominal wages in industries that produce tradable
goods, which means that productivity is higher in these locations (Moretti, 2011). Our
model of why larger cities generate more productivity-increasing idea exchanges is a micro-
founded explanation of these phenomena. Having matched these well-established facts, we
now describe the novel empirical implication that skill premia will be higher in larger cities.
2.4 The spatial pattern of skill premia
When cities are heterogeneous, equations (5) through (14) jointly imply that larger cities
have higher housing prices, higher non-tradables prices, exhibit better learning opportunities,
and are populated by more talented tradables producers. They also imply that skill premia
are higher in larger cities. Appendix section A.5 formally derives this prediction for a two-
city equilibrium when ability is distributed Pareto or uniform. This section uses numerical
examples to illustrate the economic mechanisms and logic of the novel prediction.
Figure 2 shows the nominal wage and utility outcomes for a particular parameterization
19 Any microfoundations for Zc in which cities with a larger mass of more-talented tradables producers exhibit a higher endogenous value of Zc will support a sorting outcome. Supermodularity of z(z, Zc) is sufficient for sorting among tradables producers, since the prices they face do not vary with z.
20 Applying the dynamic extension presented in appendix section A.3 shows that a small perturbation to the asymmetric equilibrium will yield sorting that converges back to the same asymmetric equilibrium.
of our model in a two-city equilibrium.21 Worker ability, indexed by z, appears on the
horizontal axis. We assume here that ability is uniformly distributed. Since the spatial
allocation of non-tradables producers (z < zm) is indeterminate due to indifference, we order
them by ability only for ease of illustration.22 Tradables producers (z > zm) are sorted
according to ability because this maximizes their utility. zb is the ability of the tradables
producer who is indifferent between the two cities. Since ability is uniformly distributed, the
width of the interval is proportional to city population.
Figure 2: Two-city equilibrium: Wages and utility
[Figure: nominal wage and utility plotted against ability (z), with zm and zb marked on the horizontal axis and the intervals Ls,1, Ls,2, Lt,1, Lt,2 indicated.]
The nominal wages of both tradables and non-tradables producers are higher in larger
cities. This matches the well-established empirical literature on the urban wage premium
(Glaeser and Mare, 2001; Glaeser and Gottlieb, 2009). For non-tradables producers, higher
nominal wages in larger cities may be thought of as compensation for higher housing prices
that keeps real wages constant across cities.
Tradables producers’ wages are higher in larger cities for three reasons. First, there is a
compositional effect. Since there is spatial sorting among tradables producers, those in larger
cities have higher innate abilities that generate higher incomes in any location. Second, there
is a learning effect. Since larger cities provide more valuable learning opportunities, idea
exchanges in larger cities yield larger productivity gains and thus higher nominal incomes
for tradables producers. Third, there is a compensation effect. Producers who are indifferent
at the margin between two cities must have a wage gap that exactly matches the gap in non-
21 See appendix section B for details of this parameterization.
22 See appendix section A.4 for the formal definition of this µ(z, c).
tradables and housing prices between those cities. Among the skilled tradables producers,
this is a measure zero set that defines the boundary ability level zb. Among the unskilled
non-tradables producers, the compensation effect applies to all individuals, because their
homogeneity of productivity makes them all indifferent across cities. Since higher-ability
agents earn higher incomes, the nominal wage difference between the two cities is a larger
proportion of the non-tradables producers’ incomes than that of the marginal tradables
producer zb.23
What do these outcomes imply for the spatial pattern of skill premia? We define a city’s
observed skill premium as its average tradables wage divided by its (average) non-tradables
wage ps,c.
\[
\frac{\bar{w}_c}{p_{s,c}} = \frac{\dfrac{\int_{z \ge z_m} \tilde{z}_c(z)\, \mu(z, c)\, dz}{\int_{z \ge z_m} \mu(z, c)\, dz}}{p_{s,c}}
\]
In equilibria with heterogeneous cities, the cross-city pattern of skill premia depends
upon the compositional, learning, and compensation effects. The compositional and learning
effects yield higher nominal incomes for tradables producers in the larger city and affect all
tradables producers. Each of these effects raises the skill premium of the larger city relative to
the smaller city. The compensation effect lowers the skill premium in the larger city relative
to the smaller city. When the compositional and learning effects dominate the compensation
effect, the skill premium is higher in the larger city.
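The definition of the observed skill premium can be turned into a short calculation. The sketch below assumes uniformly distributed ability within each city's interval of tradables producers, so that the ratio of integrals reduces to a simple average. The ability intervals, the values of Z, and the non-tradables prices are hypothetical inputs, not values solved from the equilibrium conditions, so the output illustrates the calculation rather than any result of the model.

```python
def tradables_output(z, Z, A):
    """Equation (4): output of a tradables producer of ability z facing Z."""
    a = A * Z * z
    return (a + 1.0) ** 2 / (4.0 * A * Z) if a >= 1.0 else z

def observed_skill_premium(z_lo, z_hi, Z, A, p_s, n=10_000):
    """Mean tradables wage over uniform abilities on [z_lo, z_hi], divided by p_s.

    With a uniform density the ratio of integrals in the definition reduces to
    a simple average, approximated here with the midpoint rule.
    """
    step = (z_hi - z_lo) / n
    mean_wage = sum(tradables_output(z_lo + (i + 0.5) * step, Z, A)
                    for i in range(n)) / n
    return mean_wage / p_s

# HYPOTHETICAL inputs, not equilibrium values from the paper.
A = 10.0
p_s_small = tradables_output(0.5, 0.4, A)  # equation (12) at an assumed z_m = 0.5
p_s_large = 0.75                           # assumed higher price in the larger city
premium_small = observed_skill_premium(0.5, 0.7, Z=0.4, A=A, p_s=p_s_small)
premium_large = observed_skill_premium(0.7, 1.0, Z=0.7, A=A, p_s=p_s_large)
print(premium_small, premium_large)
```

Raising the larger city's non-tradables price lowers its premium (the compensation effect), while a higher Z or a more able interval raises it (the learning and compositional effects).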
Figure 3 illustrates the pattern of wage premia for a four-city example.24 It compares
the incomes of tradables and non-tradables producers by placing the wage schedules on
a common horizontal axis. The ratio of the wage schedules gives the skill premium of
each tradables producer relative to the non-tradables producers in the same location. The
observed skill premium is the average of these observations in each location. The skill premia
curve steps down at the boundaries where tradables producers are indifferent between two
locations, due to the compensation effect. The figure illustrates how the compositional and
learning effects that raise the skill premium, due to the differences in inframarginal tradables
23 This compensation effect, which stems from non-homothetic preferences in which lower-income individuals spend a larger fraction of their budget on non-tradables, is the basis for the prediction of Black, Kolesnikova, and Taylor (2009) that skill premia will be lower in cities with higher housing prices. It cannot explain why skill premia are higher in larger cities, since larger cities generally have higher housing prices.
24 See appendix section B for the parameter values underlying this example. Interval widths are proportionate to city populations.
producers’ abilities and the differences in the productivity gains arising from idea exchanges,
are greater than the compensation effect that lowers the skill premium. Here larger cities
exhibit higher skill premia.
Figure 3: Four-city equilibrium: Skill premia
[Figure: unskilled wage, skilled wage, and skill premium plotted against tradables producer ability (z), with the intervals L1, L2, L3, L4 indicated.]
We can state the condition formally for a two-city asymmetric equilibrium. The skill
premium is higher in the larger city when:
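\[
\frac{\dfrac{\int_{z_b}^{\infty} \tilde{z}_2(z)\, \mu(z)\, dz}{\int_{z_b}^{\infty} \mu(z)\, dz}}{p_{s,2}}
= \frac{\bar{w}_2}{p_{s,2}}
> \frac{\bar{w}_1}{p_{s,1}}
= \frac{\dfrac{\int_{z_m}^{z_b} \tilde{z}_1(z)\, \mu(z)\, dz}{\int_{z_m}^{z_b} \mu(z)\, dz}}{p_{s,1}}
\]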
The equilibrium pattern of skill premia depends on the distribution of abilities, µ(z). In
appendix section A.5, we show that this condition for skill premia to increase with city size
holds true in the two-city case for the Pareto distribution and provide sufficient conditions
for this inequality to hold for the uniform distribution.
To study the pattern of cross-city wage premia predicted by our model with an arbitrary
number of cities, we used numerical optimization to search for the parameter values minimizing
the correlation between city sizes and skill premia when $z \sim U(0, 1)$ for equilibria with more
than two cities. The numerical results suggest that the premia-size correlation is minimized
by letting $s \to 1$ so that the mass of inframarginal tradables producers shrinks to zero and
the relative influence of the compensation effect is maximized. We did not find a set of
parameter values yielding an equilibrium in which the observed skill premia were not strictly
increasing in city population. The prediction that skill premia are higher in larger cities
appears to be a robust feature of our model.
2.5 Outsourcing and migration in spatial equilibrium
In this section, we develop a model that explains key facts developed in section 1.3, notably
that skilled workers move more often and farther than unskilled workers. The challenge is to
explain the differential movement of skilled and unskilled although they are both perfectly
mobile. We do this in two steps.
The first step brings our model closer to an important feature of the data. Thus far,
we have abstracted from the fact that larger cities tend to have a higher ratio of skilled
to unskilled workers. We address this by introducing an additional task carried out by the
unskilled, assembly of final tradable output, which can be carried out locally or outsourced.
Producers in larger cities, where unskilled nominal wages are higher, outsource assembly
tasks. Outsourcing makes larger cities exhibit a higher skilled to unskilled ratio and will
enrich our model of migration.25
With this model of outsourcing in hand, our second step is to introduce a formal model
of migration of skilled and unskilled workers. In this model, the skilled will move both more
often and a greater distance on average than the unskilled even though both are perfectly
mobile. The intuition is simple. Skilled workers have a most-preferred city that best rewards
their skill, so they choose to move there. This gives rise to long-distance moves for many
of the skilled and simultaneous outflows and inflows of skilled workers from the same city.
The unskilled receive the same utility in all cities, so those who move only need seek the
nearest city with notional excess demand for their labor. Differential mobility and differential
movement are not the same. The empirical observation of differential migration rates can
be accounted for in a spatial-equilibrium framework.
2.5.1 Outsourcing
To this point, we have production of a single, competitively produced homogeneous tradable
good, so that this notionally tradable good is not in fact traded. We now introduce a richer
25 If we take the limit as outsourcing goes to zero, our simple dynamic migration model has only migration of the skilled. Obviously this implies that the skilled would move more than the unskilled. With outsourcing and the empirically relevant cross-city heterogeneity in skill composition, members of both labor groups migrate. Thus our result that the skilled move more often is shown to hold in this more realistic setting.
model of tradables production that amends this in interesting ways. Heretofore, tradables
producers have been unable to fragment their production process across locations because
each producer is self-employed in her residential location. Self-employment also collapses
the distinction between a worker and a firm. Assuming that all of a firm’s activities take
place in a single location is plausible when all elements of production depend upon face-to-
face information exchange. But new communication technologies increasingly facilitate the
separation of knowledge-intensive headquarters activities from some production-plant-level
activities.26
We thus amend tradables production to add a second task, assembly of the output.
Assembly requires $l < 1 - s$ units of homogeneous labor of the same type that produces
non-tradable services. Each tradables producer $z \ge z_m$ must therefore incur the additional
production cost l · ps,c when assembling output in city c.27
Tradables firms may pay a fixed cost fa to establish an assembly plant using homogeneous
labor in a location other than where the firm is headquartered. The net benefit to a tradables
producer located in city c of outsourcing assembly to city $c'$ is $l(p_{s,c} - p_{s,c'}) - f_a$. Thus,
tradables producers in a large, high-z city will outsource assembly to the smaller, lower-z
city if the gap in assembly prices is sufficiently large. Such outsourcing raises the relative skill
level of larger cities by shifting unskilled assembly activities to smaller cities and attracting
more tradables producers to larger cities to benefit from idea exchange. Thus, larger cities
become sites of human-capital-intensive activities that are home to more skilled populations.
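The outsourcing decision reduces to a comparison of the assembly wage saving with the fixed cost. A minimal sketch follows, with hypothetical function names and numbers. Treating a zero net benefit as "do not outsource" is a tie-breaking assumption made here, not something the text specifies.

```python
def outsourcing_net_benefit(l, p_s_home, p_s_other, f_a):
    """Net benefit l * (p_s_home - p_s_other) - f_a of assembling elsewhere."""
    return l * (p_s_home - p_s_other) - f_a

def outsources(l, p_s_home, p_s_other, f_a):
    """True when assembling in the other city is strictly cheaper."""
    return outsourcing_net_benefit(l, p_s_home, p_s_other, f_a) > 0.0

# HYPOTHETICAL values: a large city with p_s = 1.5 and a small city with p_s = 1.0.
print(outsources(l=0.2, p_s_home=1.5, p_s_other=1.0, f_a=0.05))  # low fixed cost
print(outsources(l=0.2, p_s_home=1.5, p_s_other=1.0, f_a=0.15))  # high fixed cost
print(outsources(l=0.2, p_s_home=1.0, p_s_other=1.5, f_a=0.05))  # small-city firm
```

Only producers headquartered where unskilled wages are higher can gain from relocating assembly, and they do so only when the wage gap scaled by the labor requirement exceeds the fixed cost.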
To formalize this, we define an assembly assignment function $\omega(c, c')$, which describes the
fraction of assembly tasks for firms headquartered in c that are performed in $c'$.28 The equilibrium
conditions for tradables production require that the assembly location assignments
minimize assembly costs, labor markets clear, and that tradables producers take assembly
26 Duranton and Puga (2005) study the fragmentation of production in a model with homogeneous workers and exogenously assigned occupations. Fujita and Thisse (2006) study how trade costs and communication costs determine this fragmentation of production in a setting with homogeneous firms and two worker types.
27 In assuming that each tradables producer requires l units for assembly regardless of z(z, Zc), we are assuming that the greater revenues accruing to higher-z producers are due to selling higher-quality products rather than greater quantities. This is the simplest conceivable assembly process.
28 The optimal choice of assembly location is orthogonal to producer ability, so $\omega(c, c')$ is independent of z. While the location of production is discretely chosen by each tradables producer, there is a continuum of them, so $\omega$ may be a fraction.
costs into account when choosing their headquarters location.
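\begin{align*}
\{\omega(c, c')\} &\in \arg\min_{\{\omega(c,x)\}} \ l \sum_x \omega(c, x)\, p_{s,x} + \sum_{x \ne c} \omega(c, x)\, f_a \quad \text{s.t. } \sum_x \omega(c, x) = 1 \tag{15} \\
L_{s,c} &= L \int_{z \le z_m} \mu(z, c)\, dz = s L_c + l \sum_{c'} \omega(c', c)\, L \int_{z \ge z_m} \mu(z, c')\, dz \tag{8'} \\
C(z) &= \arg\max_c \ \tilde{z}(z, Z_c) - s\, p_{s,c} - p_{h,c} - l \sum_{c'} \omega(c, c')\, p_{s,c'} - \sum_{c' \ne c} \omega(c, c')\, f_a \quad \forall z \ge z_m \tag{14'}
\end{align*}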
The rest of the equilibrium conditions are unchanged.
In the absence of outsourcing, the skilled share of each city’s population was $1 - s$ and
therefore independent of city size. When tradables production incorporates a task that may
be outsourced and does not benefit from physical proximity to better learning opportunities,
larger cities with higher local prices will outsource those tasks to smaller cities with lower
local prices. This means that the skilled population share is increasing in city size, matching
the empirical tendency. The strength of this relationship depends on the fragmentation cost
fa and the relative magnitudes of l and s.
Many believe that fragmentation costs have fallen substantially in recent decades (Du-
ranton and Puga, 2005). Our model predicts that this will trigger outsourcing of assembly
tasks and generate a positive correlation between cities’ skilled population share and total
population. Since larger cities exhibit higher nominal skill premia in our model, outsourcing
generates a positive correlation between skill premia and skill shares.
2.5.2 Migration in the outsourcing model
As noted earlier, influential models in the spatial literature assume that skilled workers are
mobile while unskilled workers are not, justifying this on the basis that empirically skilled
workers move more frequently than unskilled workers. Thus the challenge we take up in this
section is to develop a very simple dynamic extension in which all agents are (essentially)
perfectly mobile but skilled workers move more frequently than unskilled workers. With
modest additional assumptions, we can develop an additional prediction. Not only will
skilled workers move more frequently than unskilled workers, but they will also typically
move a greater distance, a prediction that also finds support in the data.
Consider first a 2-city system with spatial sorting and outsourcing of unskilled assembly
as described above, in which city 2 is larger and more skilled. Assume that each period a
fraction $\delta$ of the population in each city simultaneously gives birth to a succeeding child
and dies. There is no saving, accumulation of capital, or other intertemporal economic
interaction, and the total population size is time-invariant. A fraction $1 - \lambda$ of the newborns
inherit the same $z$ as their parent and a fraction $\lambda$ of the newborns have their type distributed
according to $\mu(z)$. This makes the aggregate talent distribution time-invariant and so the
full equilibrium is unchanged. We assume there are positive but arbitrarily small costs of
movement, so that gross migration is the minimum necessary to achieve the equilibrium
population allocation.
Will agents migrate? Consider first those born in city 2, the relatively “skilled city.”
Newborns with talent $z \in (z_b, 1)$ will stay in the skilled city. Newborns with talent $z \in (z_m, z_b)$ will migrate to the unskilled city. Some agents with talent $z \le z_m$ (newborn or not)
have reason to migrate to the unskilled city because the larger city’s outsourcing-induced
lower unskilled share ($L_{s,2}/L_2 < L_{s,1}/L_1$) means that the fraction of newborns with talent $z \le z_m$
there exceeds the equilibrium fraction. Conversely, there will be net migration of tradables
producers to the skilled city from the unskilled city.
Gross skilled ($z \ge z_m$) migration exceeds net skilled migration, which equals gross
unskilled ($z \le z_m$) migration. This therefore provides an endogenous, economic reason for the
greater movement of more-educated workers. There are two-way flows of skilled workers and
a one-way flow of unskilled workers, with the net flows of the skilled matching the gross
flows of the unskilled. Provided that less than half the population is skilled, this matches
the empirical regularity that more skilled workers move more frequently as a consequence
of the equilibrium allocation of talent, rather than an assumption that less-skilled workers
are immobile. It also matches empirical work suggesting that movement reflects differential
returns to skills (Borjas, Bronars, and Trejo, 1992; Dahl, 2002).
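The two-city accounting above can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's code: `delta` and `lam` are assumed names for the replacement share and the share of newborns with redrawn talent, all unskilled workers are treated as interchangeable, and the population numbers are invented.

```python
def two_city_flows(L1, L2, U1, U2, delta, lam):
    """Per-period migration flows that restore the equilibrium allocation.

    L1, L2: city populations (city 2 is larger and more skilled)
    U1, U2: unskilled residents of each city
    delta:  share of residents replaced by a newborn each period
    lam:    share of newborns whose talent is redrawn from mu(z)
    """
    L = L1 + L2
    S1, S2 = L1 - U1, L2 - U2              # skilled types sorted into city 1 / city 2
    redraw = delta * lam                   # share of residents replaced by a redrawn type
    skilled_1_to_2 = redraw * L1 * S2 / L  # city-2 skill types born in city 1
    skilled_2_to_1 = redraw * L2 * S1 / L  # city-1 skill types born in city 2
    # City 2 draws more unskilled newborns than it needs to replace; the surplus moves.
    unskilled_2_to_1 = redraw * (L2 * (U1 + U2) / L - U2)
    return skilled_1_to_2, skilled_2_to_1, unskilled_2_to_1


# Invented example: 42 skilled and 58 unskilled workers in total.
s12, s21, u21 = two_city_flows(L1=40, L2=60, U1=28, U2=30, delta=0.1, lam=0.5)
skilled_rate = (s12 + s21) / 42
unskilled_rate = u21 / 58
```

In this example net skilled migration equals gross unskilled migration, while gross skilled migration is larger, so the migration rate is higher among the skilled.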
This insight generalizes to an n-city setting, and the proof is simple. The economy-wide
skill distribution is invariant across time, so any initial equilibrium is also an equilibrium in
later periods. We focus on this equilibrium. For each city, there will be a mismatch between
the $\delta\lambda L_c$ newborns each period whose characteristics are orthogonal to those of their parents
and their parents’ characteristics. This difference represents the net migration offer of city
c to all other cities in the system. Note that newborns whose z determines they will work
as skilled workers in tradables have (except for a measure zero set) a unique city to which
they must move, while as of yet we have not determined the exact patterns of flows of the
unskilled, although we consider the case of arbitrarily small costs of migration to rule out
cross-hauling of unskilled migrants.
It is convenient to define two groups of cities. Let CX be the set of cities that are exporters
of unskilled migrants and CM be the set of cities that are importers of unskilled migrants
(gross and net being the same due to the absence of cross-hauling of the unskilled). Since
each individual city has zero change in total population, that is also true of any partition
of the set of cities. Thus exports of unskilled migrants from CX to CM must be exactly
matched by net imports of skilled migrants in the reverse direction.
Note that all exports of the unskilled must move from CX to CM as a matter of definition.
But these cannot be the only exports of migrants from CX to CM ; there are also skilled
workers unique to cities in CM who travel that direction. Thus exports of workers from
CX to CM are comprised of all the unskilled who move plus some skilled. The volume of
exports in the reverse direction, by balanced migration, must equal this sum. All of these are
skilled. Thus we already have that the majority of migrants between cities in CX and CM
are skilled. We need to add in the migrants among the cities of CX and CM , respectively. All
of these are skilled as well, since the arbitrarily small migration costs prevent cross-hauling
of unskilled migrants. Hence, we can claim, a fortiori, that the skilled will be the majority
of migrants. So long as the skilled are less than half the labor force, this suffices to show
that the fraction of migrants is higher among the skilled than unskilled, the first fact that
we wanted to explain.
Moreover, the n-city framework also allows us to make a novel prediction – not only will
the skilled move more often but they will typically move a greater distance. Again, the logic
is simple. With arbitrarily small positive migration costs, the skilled move to their most preferred
city. Movements of the unskilled can be considered the solution to a linear programming
problem that minimizes the total distance moved of the unskilled while matching net offers
of unskilled by cities (cf. Dorfman, Samuelson, and Solow 1958). Appendix section A.6
formalizes the result that skilled individuals will migrate greater distances on average.
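The distance-minimizing allocation of unskilled migrants is a transportation problem. As a minimal sketch, assume (purely for illustration) that cities lie on a line, in which case matching exporters to importers in sorted order solves the linear program. The function name, city labels, positions, and integer net offers below are hypothetical.

```python
def min_distance_flows(position, net_offer):
    """Match unskilled exporters to importers for cities on a line.

    position:  dict city -> coordinate on the line (an assumption of this sketch)
    net_offer: dict city -> net export of unskilled migrants
               (positive = exporter, negative = importer; values sum to zero)
    Returns a list of (origin, destination, quantity) flows.
    """
    exporters = sorted([position[c], c, q] for c, q in net_offer.items() if q > 0)
    importers = sorted([position[c], c, -q] for c, q in net_offer.items() if q < 0)
    flows, i, j = [], 0, 0
    while i < len(exporters) and j < len(importers):
        moved = min(exporters[i][2], importers[j][2])
        flows.append((exporters[i][1], importers[j][1], moved))
        exporters[i][2] -= moved
        importers[j][2] -= moved
        if exporters[i][2] == 0:   # this exporter's offer is exhausted
            i += 1
        if importers[j][2] == 0:   # this importer's demand is met
            j += 1
    return flows


position = {"A": 0, "B": 1, "C": 3, "D": 4}
net_offer = {"A": 5, "B": -3, "C": 2, "D": -4}
flows = min_distance_flows(position, net_offer)
total_distance = sum(q * abs(position[a] - position[b]) for a, b, q in flows)
```

For a general distance matrix the sorted-order shortcut no longer applies and a linear-programming solver would be needed.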
The model of migration we have developed here is surely special in a number of di-
mensions. This notwithstanding, we believe that there is a deeper logic at work here that
is consistent with our story of spatial equilibrium.29 Skilled workers find employment in
tasks that make considerable use of heterogeneity, which motivates more spatially extensive
searches. Less skilled workers find employment in tasks that make little use of heterogeneity,
which induces less spatial searching. Both the frequency and distance of moves reflect this.
29 An older, empirical literature on differential migration by education suggested broader geographic labor markets for higher skilled workers without explaining their economic foundations (Long, 1973; Frey, 1979; Frey and Liaw, 2005).
3 Conclusion
The productivity of modern cities depends crucially on their role as loci for idea exchange.
Consideration of idea exchange naturally invites an examination of labor heterogeneity, since
this affects both the opportunities to learn and the capability of learners. Everyone would
like to be where learning opportunities are greatest. But the best learners are those most able
to take advantage of these opportunities and so most willing to pay for them. In our model,
the broad desire to access the best learning opportunities induces differences in city sizes and
housing prices. The higher willingness of the most skilled to pay these prices induces sorting
among learners. In larger cities, they exchange ideas more frequently with more people whose
average ability is higher. Individuals producing non-tradables don’t participate in these idea
exchanges, but they are drawn to larger cities by higher nominal wages, which compensate
them for higher housing prices. Marginal tradables producers in larger cities also receive
this nominal wage compensation for higher prices. But wages are also higher in larger cities
for skilled workers because they have higher abilities and spend more time exchanging ideas
that further raise their productivity. These combined effects ensure that the skill premium
also rises with city size.
This paper presents the first system of cities model in which costly idea exchange is the
agglomeration force. Our emphasis on the costly and optimal allocation of effort to idea
exchange is designed to overcome the “black box” critique that has inhibited research in this
crucial area. An important payoff is that we provide the first spatial-equilibrium account of
why the skill premium is rising with city size. This is a first-order, robust feature of the data
that does not emerge from the traditional neoclassical frameworks. We also provide the first
spatial-equilibrium account of how variation in skill premia may arise from symmetric spatial
fundamentals. We do this in a framework that also replicates a broad set of established facts
about the cross section of cities.
We derive these results in a very parsimonious framework. Labor is the sole factor of
production and is heterogeneous in a single dimension. There are two goods, tradables and
non-tradables. Housing acts as a simple dispersion force. Idea exchanges are local and
depend on the scale and average ability of learners. These few assumptions cause cities
to vary in size and larger cities to have better learning environments, higher wages, higher
productivity, higher housing prices, and higher skill premia – all prominent features in the
data.
We extend the model to consider outsourcing and cross-city migration. These are impor-
tant phenomena in their own right. They also allow us to provide an endogenous, spatial-
equilibrium explanation for the differential movement of skilled and unskilled workers. This
contrast with previous work highlights the sharp distinction between differential mobility
as an assumption and differential movement as an outcome of economic choices. Long-run
models of the spatial distribution of economic activity should take the latter path.
Our approach is quite flexible and invites a number of extensions. One would be to
introduce a richer model of the internal structure of the city. A second would be to consider
both the incentives for information exchange and some of the disincentives, given exchanges
may also be with competitors. A third would be to integrate our model with both new
economic geography concerns with product differentiation and imperfect competition, as
well as the literature on labor market pooling. We are confident that hybrids will provide
interesting perspectives on the interaction of elements from the respective frameworks.
References
Abdel-Rahman, H. M., and A. Anas (2004): “Theories of systems of cities,” in Handbook
of Regional and Urban Economics, ed. by J. V. Henderson, and J. F. Thisse, vol. 4, chap. 52,
pp. 2293–2339. Elsevier.
Acemoglu, D., and D. Autor (2011): “Skills, Tasks and Technologies: Implications for
Employment and Earnings,” in Handbook of Labor Economics, ed. by O. Ashenfelter, and
D. Card, vol. 4, pp. 1043–1171. Elsevier.
Albouy, D. (2009): “The Unequal Geographic Burden of Federal Taxation,” Journal of
Political Economy, 117(4), 635–667.
Alonso, W. (1964): Location and land use: Toward a general theory of land rent, Publica-
tion of the Joint Center for Urban Studies. Harvard University Press.
Arzaghi, M., and J. V. Henderson (2008): “Networking off Madison Avenue,” Review
of Economic Studies, 75(4), 1011–1038.
Audretsch, D. B., and M. P. Feldman (2004): “Knowledge spillovers and the geogra-
phy of innovation,” in Handbook of Regional and Urban Economics, ed. by J. V. Henderson,
and J. F. Thisse, vol. 4, chap. 61, pp. 2713–2739. Elsevier.
Autor, D. H., and D. Dorn (2012): “The Growth of Low Skill Service Jobs and the
Polarization of the U.S. Labor Market,” MIT working paper.
Autor, D. H., D. Dorn, and G. Hanson (2011): “The China Syndrome: Local Labor
Market Effects of Import Competition in the United States,” MIT working paper.
Bacolod, M., B. S. Blum, and W. C. Strange (2009): “Skills in the city,” Journal of
Urban Economics, 65(2), 136–153.
Baum-Snow, N., and R. Pavan (2011): “Inequality and City Size,” mimeo.
Beaudry, P., M. Doms, and E. Lewis (2010): “Should the Personal Computer Be
Considered a Technological Revolution? Evidence from U.S. Metropolitan Areas,” Journal
of Political Economy, 118(5), 988 – 1036.
Behrens, K., G. Duranton, and F. Robert-Nicoud (2010): “Productive cities: Sort-
ing, selection and agglomeration,” CEPR Discussion Paper 7922, CEPR.
Behrens, K., and F. Robert-Nicoud (2011): “Survival of the Fittest in Cities: Urban-
isation, Agglomeration, and Inequality,” mimeo.
Berliant, M., and M. Fujita (2008): “Knowledge Creation As A Square Dance On The
Hilbert Cube,” International Economic Review, 49(4), 1251–1295.
Berliant, M., R. R. Reed III, and P. Wang (2006): “Knowledge exchange, matching,
and agglomeration,” Journal of Urban Economics, 60(1), 69–95.
Black, D. (1999): “Local knowledge spillovers and inequality,” mimeo.
Black, D., N. Kolesnikova, and L. Taylor (2009): “Earnings Functions When Wages
and Prices Vary by Location,” Journal of Labor Economics, 27(1), 21–47.
Borck, R., M. Pfluger, and M. Wrede (2010): “A simple theory of industry location
and residence choice,” Journal of Economic Geography, 10(6), 913–940.
Borjas, G. J., S. G. Bronars, and S. J. Trejo (1992): “Self-selection and internal
migration in the United States,” Journal of Urban Economics, 32(2), 159–185.
Charlot, S., and G. Duranton (2004): “Communication externalities in cities,” Journal
of Urban Economics, 56(3), 581–613.
Combes, P.-P., G. Duranton, and L. Gobillon (2008): “Spatial wage disparities:
Sorting matters!,” Journal of Urban Economics, 63(2), 723–742.
Dahl, G. B. (2002): “Mobility and the Return to Education: Testing a Roy Model with
Multiple Markets,” Econometrica, 70(6), 2367–2420.
De la Roca, J. (2012): “Selection in initial and return migration: Evidence from moves
across Spanish cities,” mimeo.
Dorfman, R., P. Samuelson, and R. Solow (1958): Linear programming and economic
analysis. McGraw-Hill.
Duranton, G., and D. Puga (2005): “From sectoral to functional urban specialisation,”
Journal of Urban Economics, 57, 343–370.
Eeckhout, J., R. Pinheiro, and K. Schmidheiny (2010): “Spatial Sorting: Why New
York, Los Angeles and Detroit attract the greatest minds as well as the unskilled,” CEPR
Discussion Paper 8151.
Frey, W. H. (1979): “The Changing Impact of White Migration on the Population Com-
positions of Origin and Destination Metropolitan Areas,” Demography, 16(2), pp. 219–237.
Frey, W. H., and K.-L. Liaw (2005): “Migration within the United States: Role of
Race-Ethnicity,” Brookings-Wharton Papers on Urban Affairs, pp. 207–262.
Fujita, M., P. Krugman, and A. J. Venables (1999): The Spatial Economy: Cities,
Regions, and International Trade. MIT Press.
Fujita, M., and J.-F. Thisse (2006): “Globalization And The Evolution Of The Supply
Chain: Who Gains And Who Loses?,” International Economic Review, 47(3), 811–836.
Gabaix, X. (1999): “Zipf’s Law For Cities: An Explanation,” The Quarterly Journal of
Economics, 114(3), 739–767.
Gibbons, S., H. G. Overman, and P. Pelkonen (2010): “Wage Disparities in Britain:
People or Place?,” SERC Discussion Papers 0060, Spatial Economics Research Centre,
LSE.
Glaeser, E. L. (1999): “Learning in Cities,” Journal of Urban Economics, 46(2), 254–277.
Glaeser, E. L. (2005): “Urban colossus: Why is New York America’s largest city?,” Economic
Policy Review, 11(2), 7–24.
Glaeser, E. L. (2008): Cities, Agglomeration, and Spatial Equilibrium, The Lindahl Lectures.
Oxford University Press.
Glaeser, E. L., and J. D. Gottlieb (2009): “The Wealth of Cities: Agglomeration
Economies and Spatial Equilibrium in the United States,” Journal of Economic Literature,
47(4), 983–1028.
Glaeser, E. L., J. Gyourko, and R. E. Saks (2006): “Urban growth and housing
supply,” Journal of Economic Geography, 6(1), 71–89.
Glaeser, E. L., and D. C. Mare (2001): “Cities and Skills,” Journal of Labor Economics,
19(2), 316–42.
Glaeser, E. L., M. Resseger, and K. Tobio (2009): “Inequality In Cities,” Journal
of Regional Science, 49(4), 617–646.
Greenwood, M. J. (1997): “Internal migration in developed countries,” in Handbook of
Population and Family Economics, ed. by M. R. Rosenzweig, and O. Stark, vol. 1, chap. 12,
pp. 647–720. Elsevier.
Gyourko, J., C. Mayer, and T. Sinai (2006): “Superstar Cities,” NBER Working Paper
12355, National Bureau of Economic Research.
Helpman, E. (1998): “The size of regions,” in Topics in Public Economics, ed. by
D. Pines, E. Sadka, and I. Zilcha, pp. 33–54. Cambridge University Press.
Helsley, R. W., and W. C. Strange (2004): “Knowledge barter in cities,” Journal of
Urban Economics, 56(2), 327–345.
Henderson, J. V. (1974): “The Sizes and Types of Cities,” American Economic Review,
64(4), 640–56.
Jaffe, A. B., M. Trajtenberg, and R. Henderson (1993): “Geographic Localization
of Knowledge Spillovers as Evidenced by Patent Citations,” The Quarterly Journal of
Economics, 108(3), 577–98.
Krugman, P. (1991): “Increasing Returns and Economic Geography,” Journal of Political
Economy, 99(3), 483–99.
Krugman, P. (2011): “The New Economic Geography, Now Middle-aged,” Regional Studies,
45(1), 1–7.
Long, L. H. (1973): “Migration di!erentials by education and occupation: Trends and
variations,” Demography, 10, 243–258.
Lucas, R. E. (1988): “On the mechanics of economic development,” Journal of Monetary
Economics, 22(1), 3–42.
Lucas, R. E., and B. Moll (2011): “Knowledge Growth and the Allocation of Time,”
NBER Working Paper 17495, National Bureau of Economic Research.
Marshall, A. (1890): Principles of Economics. MacMillan and Co.
Mills, E. (1967): “An aggregative model of resource allocation in a metropolitan area,”
The American Economic Review, 57(2), 197–210.
Molloy, R., C. L. Smith, and A. Wozniak (2011): “Internal Migration in the United
States,” Journal of Economic Perspectives, 25(3), 173–96.
Moretti, E. (2011): “Local Labor Markets,” in Handbook of Labor Economics, ed. by
O. Ashenfelter, and D. Card, vol. 4, chap. 14, pp. 1237–1313. Elsevier.
Muth, R. F. (1969): Cities and Housing. University of Chicago Press.
Nakamura, E., and J. Steinsson (2011): “Fiscal Stimulus in a Monetary Union: Ev-
idence from U.S. Regions,” NBER Working Paper 17391, National Bureau of Economic
Research.
Notowidigdo, M. J. (2011): “The Incidence of Local Labor Demand Shocks,” Working
Paper 17167, National Bureau of Economic Research.
Roy, A. (1951): “Some thoughts on the distribution of earnings,” Oxford Economic Papers,
pp. 135–146.
Ruggles, S., J. T. Alexander, K. Genadek, R. Goeken, M. B. Schroeder,
and M. Sobek (2010): “Integrated Public Use Microdata Series: Version 5.0 [Machine-
readable database],” Minneapolis, MN: Minnesota Population Center.
Tabuchi, T., and J.-F. Thisse (2002): “Taste heterogeneity, labor mobility and economic
geography,” Journal of Development Economics, 69(1), 155–177.
A Theory
A.1 Internal urban structure
To introduce congestion costs, we follow Behrens, Duranton, and Robert-Nicoud (2010) and
adopt a standard, highly stylized model of cities’ internal structure.30 City residences of unit
size are located on a line and center around a single point where economic activities occur,
called the central business district (CBD). Residents commute to the CBD at a cost that
is denoted in units of the numeraire. The cost of commuting from a distance $x$ is $\tau x^{\gamma}$ and
independent of the resident’s income and occupation.
Agents choose a residential location $x$ to minimize the sum of land rent and commuting
cost, $r(x) + \tau x^{\gamma}$. In equilibrium, agents are indifferent across residential locations. In a city
with population mass $L$, the rents fulfilling this indifference condition are
$r(x) = r\left(\frac{L}{2}\right) + \tau\left(\frac{L}{2}\right)^{\gamma} - \tau x^{\gamma}$ for $0 \le x \le \frac{L}{2}$. Normalizing rents at the edge to zero yields $r(x) = \tau\left(\frac{L}{2}\right)^{\gamma} - \tau x^{\gamma}$.
The city’s total land rent is
$$TLR = \int_{-L/2}^{L/2} r(x)\,dx = 2\int_{0}^{L/2} r(x)\,dx = 2\tau\left[\left(\frac{L}{2}\right)^{\gamma+1} - \frac{1}{\gamma+1}\left(\frac{L}{2}\right)^{\gamma+1}\right] = \frac{2\tau\gamma}{\gamma+1}\left(\frac{L}{2}\right)^{\gamma+1}$$
The city’s total commuting cost is
$$TCC = 2\int_{0}^{L/2} \tau x^{\gamma}\,dx = \frac{2\tau}{\gamma+1}\left(\frac{L}{2}\right)^{\gamma+1} \equiv \theta L^{\gamma+1}$$
The city’s total land rents are lump-sum redistributed equally to all city residents. Since
they each receive $\frac{TLR}{L}$, every resident pays the average commuting cost, $\frac{TCC}{L} = \theta L^{\gamma}$, as her
net urban cost. Since this urban cost is proportionate to the average land rent, we say the
“consumer price of housing” in city $c$ is $p_{h,c} = \theta L_c^{\gamma}$.
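The closed forms for total land rent and total commuting cost can be checked by numerical integration. The sketch below is an illustration only: it assumes the symbols `tau` (commuting-cost coefficient), `gamma` (commuting-cost exponent), and `theta` (the composite coefficient on $L^{\gamma+1}$), and uses invented parameter values.

```python
def urban_costs(L, tau, gamma, n=20000):
    """Compare midpoint-rule integrals with the closed forms for TLR and TCC."""
    half = L / 2.0
    h = half / n                                  # width of each integration cell

    def rent(x):
        # Bid-rent schedule with rent normalized to zero at the city edge.
        return tau * (half ** gamma - x ** gamma)

    mids = [(i + 0.5) * h for i in range(n)]
    tlr_numeric = 2.0 * sum(rent(x) for x in mids) * h
    tcc_numeric = 2.0 * sum(tau * x ** gamma for x in mids) * h

    tlr_closed = 2.0 * tau * gamma / (gamma + 1.0) * half ** (gamma + 1.0)
    tcc_closed = 2.0 * tau / (gamma + 1.0) * half ** (gamma + 1.0)
    theta = tau / ((gamma + 1.0) * 2.0 ** gamma)  # so that TCC = theta * L**(gamma + 1)
    return tlr_numeric, tlr_closed, tcc_numeric, tcc_closed, theta


L, tau, gamma = 2.0, 1.0, 0.5                     # invented values
tlr_num, tlr, tcc_num, tcc, theta = urban_costs(L, tau, gamma)
# Net urban cost: rent plus commuting at any location, minus the rebate TLR / L.
net_cost = tau * (L / 2.0) ** gamma - tlr / L
```

The net urban cost per resident coincides with `theta * L ** gamma`, the consumer price of housing.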
A.2 The number of cities
In section 2.3, we defined equilibrium for a set of locations {c} in which each member of
the set is populated, Lc > 0. Here we describe how the equilibrium number of cities is
determined when there are an arbitrary number of potential city sites, some of which are
unpopulated in equilibrium.
30 There is nothing original in this urban structure. We use notation identical to, and taken from, Behrens, Duranton, and Robert-Nicoud (2010).
Consider a potential city site that is unoccupied. The modern technologies employed
require specialization, so individuals cannot divide their time between producing tradables
and non-tradables. Since non-tradables are a necessity, an individual living in isolation will
produce only non-tradables. Thus, an individual moving to an empty location would engage
in subsistence production of non-tradables, consume free housing, and obtain utility of zero.
Unless all non-tradables producers consume zero tradables ($p_{s,c} = \frac{p_{h,c}}{1-s}\ \forall c : L_c > 0$),
non-tradables producers living in cities obtain strictly positive utility. Therefore the entire
population lives in a finite number of cities. Accommodating an arbitrary set of locations
$\{c\}$ that includes uninhabited places ($L_c = 0$) only requires modifying one equation. When
there are potentially empty locations, the spatial-equilibrium indifference condition (13) only
applies to occupied locations. The equilibrium condition is thus
$$(1 - s)\, p_{s,c} - p_{h,c} = (1 - s)\, p_{s,c'} - p_{h,c'} \quad \forall c, c' : L_c > 0,\ L_{c'} > 0 \tag{13'}$$
There may be multiple equilibria satisfying equations (5) through (12), (13’), and (14)
that have different numbers of cities. We see no theoretical reason to believe that the
equilibrium number of populated cities should be unique for a given set of parameters. The
qualitative, cross-city predictions of the model do not depend upon the equilibrium number
of cities. The particulars of our numerical examples do, of course.
When there are tradables producers who do not spend time learning, it is welfare-
maximizing for the population of non-learning tradables producers and a corresponding
fraction of the non-tradables producers to reside in every potential location, since this min-
imizes congestion costs and there are no agglomeration benefits for non-learners. Particular
assumptions about city developers, a la Henderson (1974), would ensure that this outcome
would occur in equilibrium, but we do not consider these issues to be crucial to the topics
we explore in this paper.
A.3 Stability of equilibria
In this section, we describe the instability of equilibria in which there are symmetric cities
with the same population size and learning opportunities. For simplicity, consider a two-
city symmetric equilibrium in which initially L1 = L2 and Z1 = Z2 > 0. Since sorting
among tradables producers distinguishes equilibria with symmetric cities from equilibria
with heterogeneous cities and since in all equilibria non-tradables producers are indifferent
across all locations, we simplify the discussion by assuming that $\frac{s}{1-s}$ individuals of ability
$z < z_m$ move locations whenever individuals of ability $z > z_m$ move locations. We now
introduce a simple dynamic process for the locational choices of tradables producers that
demonstrates that equilibria with symmetric cities are unstable.
Suppose that there is a shock to the symmetric equilibrium such that $L_1 \neq L_2$ and
$Z_1 \neq Z_2$ but we are in the neighborhood of $L_1 = L_2$ and $Z_1 = Z_2$. Without loss of generality,
let $Z_2 > Z_1$. By the supermodularity of $\tilde{z}(z, Z_c)$, the net benefit of locating in city 2 relative
to city 1, $\tilde{z}(z, Z_2) - \tilde{z}(z, Z_1) - \frac{\theta}{1-s}\left(L_2^{\gamma} - L_1^{\gamma}\right)$, is increasing in $z$.
Individuals move according to the myopic net benefits of changing locations. The resi-
dents of each city have the opportunity to move, and we alternate between the two cities ad
infinitum. Without loss of generality, individuals located in city 1 consider moving, followed
by individuals located in city 2, followed by individuals now located in city 1, and so forth.
We now assume that locational changes are ordered according to the net benefits of changing
locations. That is, the tradables producers who have the most to gain from moving move
first. Tradables producers take full account of the changes in cities’ economic characteristics
induced by those who have moved before them and are completely myopic with respect to
future changes.
Suppose that the highest ability tradables producer in city 1 has a positive net benefit
of moving to city 2.31 By supermodularity of $\tilde{z}(z, Z_c)$, this producer has the most to gain
by relocating to city 2. Since we start from a symmetric equilibrium, this move raises Z2
and L2 and lowers Z1 and L1.32 The ordering of the net benefit of locating in city 2 relative
to city 1 is unchanged by these outcomes. If the net benefit is positive for the (remaining)
highest ability tradables producer located in city 1, that producer relocates. This process
continues until the net benefit for the highest ability tradables producer is zero.33 At this
point, all the individuals located in city 1 wish to remain in city 1. We know that L2 > L1
because individuals have moved from city 1 to city 2, and we know that Z2 > Z1 because
the net benefit is zero for the last mover.
Next, individuals in city 2 have the opportunity to relocate to city 1. If there is a tradables
producer in city 2 with ability $z$ lower than the ability of the highest-$z$ tradables producer in
city 1, then the lowest-$z$ tradables producer in city 2 has a positive net benefit of relocating
to city 1. This movement is followed by subsequent movements until we reach a tradables
producer in city 2 who is indifferent between the two locations. We know that we reach
the indifferent producer while $L_2 > L_1$ because $Z_2 > Z_1$. At this point, all the individuals
located in city 2 wish to remain in city 2.
31 If not, then no individual in city 1 moves. In the next step, individuals in city 2 have the opportunity to move.
32 Since we are in the neighborhood of $Z_1 = Z_2$ and $Z_c = m(M_c)\bar{z}_c$, where $\bar{z}_c$ is a weighted average of producers’ abilities and $m(\cdot) \le 1$, the highest ability producer in city 1 must have ability $z > Z_1 \approx Z_2$.
33 Such a producer exists so long as not everyone wishes to locate in a single city. We provide sufficient conditions on parameters such that a two-city asymmetric equilibrium exists in section A.4.
Next, individuals in city 1 have the opportunity to relocate. If there is a tradables
producer in city 1 with ability z greater than the ability of the lowest-z tradables producer in
city 2, then the highest-z tradables producer in city 1 has a positive net benefit of relocating
to city 2. This process continues ad infinitum until the ability of the lowest-z tradables
producers in city 2 is equal to the ability of the highest-z tradables producer in city 1. At
that point, no tradables producers wish to move and we have obtained an equilibrium with
heterogeneous cities.
A.4 Existence of two-city asymmetric equilibrium
Here we characterize sufficient conditions for parameter values such that there exists a
two-city equilibrium in which $L_1 < L_2$. Since the two cities differ in size, any equilibrium will
exhibit sorting among tradables producers $z \ge z_m$. The allocation of non-tradables producers
$z < z_m$ is both indeterminate and inessential. $z_m$ is given by $s = \int_0^{z_m} \mu(z)\,dz$.
We start by guessing $L_1 \le \frac{1}{2}L$, which implies $L_2 = L - L_1$. Define the values $z_b$ and $z_{b,s}$
by
$$(1 - s) L_1 = L \int_{z_m}^{z_b} \mu(z)\,dz \qquad\qquad s L_1 = L \int_{0}^{z_{b,s}} \mu(z)\,dz$$
The locational assignments
$$\mu(z, 1) = \begin{cases} \mu(z) & 0 \le z < z_{b,s} \\ 0 & z_{b,s} \le z < z_m \\ \mu(z) & z_m \le z < z_b \\ 0 & z_b \le z \end{cases} \qquad\qquad \mu(z, 2) = \begin{cases} 0 & 0 \le z < z_{b,s} \\ \mu(z) & z_{b,s} \le z < z_m \\ 0 & z_m \le z < z_b \\ \mu(z) & z_b \le z \end{cases}$$
satisfy equations (5), (6), and (8). We then suppose equations (7) and (9) through (13) hold
true, where $M_c > 0$ if there is a value of $M_c > 0$ satisfying those equations.
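For a concrete case, the cutoffs have closed forms when talent is uniform on $[0, 1]$, the distribution used in the paper's numerical figures. The sketch below assumes that distribution; the function names and the example values of $L_1$, $L$, and $s$ are hypothetical.

```python
def sorting_cutoffs(L1, L, s):
    """Cutoffs (z_m, z_b, z_bs) for talent uniform on [0, 1].

    z_m  solves s = integral of mu from 0 to z_m
    z_b  solves (1 - s) * L1 = L * integral of mu from z_m to z_b
    z_bs solves s * L1 = L * integral of mu from 0 to z_bs
    """
    if not 0.0 < L1 <= L / 2.0:
        raise ValueError("guess must satisfy 0 < L1 <= L / 2")
    z_m = s
    z_b = z_m + (1.0 - s) * L1 / L
    z_bs = s * L1 / L
    return z_m, z_b, z_bs


def city_of(z, cutoffs):
    """City (1 = smaller, 2 = larger) assigned to talent z by the sorting rule."""
    z_m, z_b, z_bs = cutoffs
    in_city_1 = (z < z_bs) or (z_m <= z < z_b)
    return 1 if in_city_1 else 2


cutoffs = sorting_cutoffs(L1=0.8, L=2.0, s=0.5)
z_m, z_b, z_bs = cutoffs
city_1_mass = 2.0 * (z_bs + (z_b - z_m))   # L times the measure of city-1 types
```

The implied mass of city 1 recovers the guessed $L_1$, which is what equations (5), (6), and (8) require of the assignment.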
If all tradables producers are in their optimal location, so that equation (14) is satisfied,
then the value of $L_1$ that we guessed supports an asymmetric equilibrium. To check whether
this holds, we define an expression $\Omega(L_1)$ that is utility in the smaller city minus utility in
the larger city for the marginal tradables producer, $z_b$.
$$\Omega(L_1) \equiv \frac{\theta}{1-s}\left(L_2^{\gamma} - L_1^{\gamma}\right) - \left(\tilde{z}(z_b, Z_c(z_b, 1)) - \tilde{z}(z_b, Z_c(z_m, z_b))\right)$$
where, in an abuse of notation, $Z_c(x, y)$ is the maximum value of $Z_c$ satisfying equation (10)
when $x$ and $y$ are the lower and upper limits of integration and $\mu(z, c) = \mu(z)$. $\Omega$ can be
written solely as a function of $L_1$ because all the other variables in a sorting equilibrium
are given by $L_1$ via $z_{b,s}$ and $z_b$ through the locational assignments and other equilibrium
conditions. The marginal tradables producer is indifferent between the two locations when
$\Omega(L_1) = 0$, and all the inframarginal tradables producers are in their optimal locations by
the supermodularity of $\tilde{z}$ in $z$ and $Z_c$.
In an asymmetric equilibrium, learning must occur in the larger city and all tradables
producers located in the larger city must participate in learning. Otherwise, they would raise
their utility by locating in the smaller city with lower local prices. A sufficient condition for
full-participation learning to occur in the larger city in equilibrium is to assume a value of $A$
and functional form $m(\cdot)$ such that full-participation learning occurs in the larger city for all
potential values of $z_b$. That is, let $A$ be sufficiently large and $m(\cdot)$ approach one sufficiently
quickly that $\exists Z_2 > 0$ satisfying equations (16) through (19) for all $L_1 : 0 < L_1 < \frac{1}{2}L$.
$$Z_2 = m(M_2)\,\bar{z}_2 \tag{16}$$
$$M_2 = L \int_{z_b}^{\infty} (1 - \beta_{z,2})\,\mu(z)\,dz \tag{17}$$
$$\bar{z}_2 = \int_{z_b}^{\infty} z\, \frac{(1 - \beta_{z,2})\,\mu(z)}{\int_{z_b}^{\infty} (1 - \beta_{z,2})\,\mu(z)\,dz}\,dz \tag{18}$$
$$\beta_{z,2} = \arg\max_{\beta \in [0,1]} \; \beta z\left(1 + (1 - \beta) A Z_2 z\right) \tag{19}$$
$\Omega\left(\frac{L}{2}\right) < 0$, since the cities are equally sized, equalizing housing and non-tradable services
prices, but they differ in $Z_c$, with $Z_2 > Z_1$.
We require $\lim_{L_1 \to 0} \Omega(L_1) > 0$, so that the entire population does not live in a single city in
equilibrium. This requires $\frac{\theta}{1-s} L^{\gamma} > \tilde{z}(z_m, Z_c(z_m, 1)) - \tilde{z}(z_m, Z_c(z_m, z_m)) = \tilde{z}(z_m, Z_c(z_m, 1)) - z_m$. In words, provided that congestion costs are sufficiently strong relative to idea-exchange
benefits, the second location will not be empty.
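With $\lim_{L_1 \to 0} \Omega(L_1) > 0$ and $\Omega\left(\frac{L}{2}\right) < 0$, continuity delivers the existence of an $L_1$ such
that $\Omega(L_1) = 0$. We now describe where $\Omega$ is continuous and why its discontinuities are not
a problem.
$$\Omega(L_1) \equiv \frac{\theta}{1-s}\left(L_2^{\gamma} - L_1^{\gamma}\right) - \left(\tilde{z}(z_b, Z_c(z_b, 1)) - \tilde{z}(z_b, Z_c(z_m, z_b))\right)$$
$\frac{\theta}{1-s}\left(L_2^{\gamma} - L_1^{\gamma}\right)$ is obviously continuous in $L_1$.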
We assume that $\mu(z)$ is continuous in $z$. Since $\beta_{z,c}$ is a function of $Z_c$, and $M_c$ and $\bar{z}_c$
are functions of $\beta_{z,c}$, the equilibrium value of $Z_c$ satisfying equations (9) through (11) is
where the function $m(M_c)\bar{z}_c$ intersects the 45-degree line. Since $\beta_{z,c}$ is continuous in $Z_c$, and
$M_c$ and $\bar{z}_c$ are continuous in $Z_c$ and $z_b$, $m(M_c)\bar{z}_c$ is continuous in $Z_c$ and $z_b$. This means
that $Z_2(z_b, 1)$ is a continuous function of $L_1$. $\tilde{z}(z, Z_c)$ is continuous in its arguments. Thus,
$\tilde{z}(z_b, Z_c(z_b, 1))$ is continuous in $L_1$.
Figure 4: Finding the fixed points of $Z_1(z_m, z_b)$
[Figure: the fixed-point map plotted for $z_b = .6$, $.7$, $.8$, and $.9$ against the 45° line; both axes run from 0 to 1.]
Note: $z \sim U(0, 1)$, $A = 6$, $z_m = .5$, $m(M_c) = \frac{\exp(30 M_c) - 1}{\exp(30 M_c)}$, $L = 2$
$Z_1(z_m, z_b)$ is (weakly) increasing in $L_1$. $Z_1(z_m, z_b)$ is not continuous in $L_1$. For sufficiently
small values of $L_1$, there is no value of $Z_1 > 0$ satisfying equations (9) through (11). The
smaller size and lower abilities of the smaller city’s population rule out an equilibrium with
idea exchange. When $L_1$ becomes sufficiently large that there is a value of $Z_1$ satisfying
equations (9) through (11), there is a discontinuous increase in $Z_1(z_m, z_b)$ at this point
because the maximum value of $Z_1$ given the population jumps from zero to a positive number.
This causes a discontinuous increase in $\Omega$ at this value of $L_1$. $Z_1(z_m, z_b)$ is continuous in $L_1$
for greater values of $L_1$ by the continuity of $\beta_{z,c}$, $M_c$, and $\bar{z}_c$ in $z_b$, $Z_1$, and $L_1$. An example
of how these fixed points vary with $z_b$ (which is determined by $L_1$) is illustrated in Figure 4.
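The fixed-point search can be sketched numerically. The code below is an illustration under several assumptions: talent is uniform on $[0, 1]$, the defaults $A = 6$, $L = 2$, and $m(M) = 1 - \exp(-30M)$ follow the note to Figure 4, and the learning share $1 - \beta_z = \frac{1}{2} - \frac{1}{2 A Z z}$ (zero when $A Z z \le 1$) is derived here from the first-order condition of the time-allocation problem rather than quoted from the text. Function names are hypothetical, and the grid scan can miss fixed points that lie closer together than the grid spacing.

```python
import math


def learning_map(Z, z_lo, z_hi, A=6.0, L=2.0, n=1000):
    """Value of m(M) * zbar when producers on [z_lo, z_hi] face learning quality Z."""
    if Z <= 0.0:
        return 0.0
    h = (z_hi - z_lo) / n
    weight_sum = 0.0    # integral of (1 - beta_z) over [z_lo, z_hi]
    ability_sum = 0.0   # integral of (1 - beta_z) * z over [z_lo, z_hi]
    for i in range(n):
        z = z_lo + (i + 0.5) * h
        a = A * Z * z
        learn = 0.5 - 0.5 / a if a > 1.0 else 0.0   # time share spent learning
        weight_sum += learn * h
        ability_sum += learn * z * h
    if weight_sum <= 0.0:
        return 0.0                                   # nobody learns at this Z
    M = L * weight_sum
    return (1.0 - math.exp(-30.0 * M)) * ability_sum / weight_sum


def largest_fixed_point(z_lo, z_hi, A=6.0, L=2.0, grid=200):
    """Largest Z > 0 with Z = learning_map(Z) found on the grid, else 0.0."""
    def gap(Z):
        return learning_map(Z, z_lo, z_hi, A, L) - Z

    hi = z_hi   # the map is below z_hi, so the gap is negative here
    for i in range(1, grid):
        lo = z_hi * (1.0 - i / grid)
        if gap(lo) >= 0.0:
            for _ in range(60):                      # bisect between lo and hi
                mid = 0.5 * (lo + hi)
                if gap(mid) >= 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        hi = lo
    return 0.0


Z_large_pool = largest_fixed_point(0.5, 0.9)   # z_m = .5, z_b = .9
Z_small_pool = largest_fixed_point(0.5, 0.6)   # z_m = .5, z_b = .6
```

In this sketch the narrow talent pool supports no positive fixed point while the wide pool does, which mirrors the discontinuity in $Z_1$ described above.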
If $Z_1(z_m, z_b) = 0\ \forall L_1 \in (0, \frac{1}{2}L)$, then $Z_1$ is continuous in $L_1$ and $\Omega$ is continuous in $L_1$.
If $Z_1 > 0$ for some $L_1 \in (0, \frac{1}{2}L)$, then $Z_1(z_m, z_b)$ and $\Omega$ discontinuously increase at one value
of $L_1$ and are continuous everywhere else in $(0, \frac{1}{2}L)$.
Since limL1(0 #(L1) > 0, #$
L2
%< 0, and # increases at any point at which # is not
continuous in L1, there exists a value of L1 such that #(L1) = 0. Two examples of this are
illustrated in Figure 5.
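The role of the upward jump in this existence argument can be illustrated numerically. The sketch below is a toy under stated assumptions, not the model's function: `excess_toy` is an invented stand-in that is positive near zero, negative at the right endpoint, and continuous except for a single upward jump. Bisection that keeps a positive value at the left end and a negative value at the right end can only shrink onto a point where the function crosses zero continuously or jumps downward, so an upward jump cannot trap it.

```python
def excess_toy(l1: float) -> float:
    """Hypothetical stand-in: continuous except for one upward jump at l1 = 0.5."""
    if l1 < 0.5:
        return 0.2 - l1
    return 0.7 - l1


def bisect_sign_change(func, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Bisection that maintains func(lo) > 0 > func(hi) at every step."""
    if not (func(lo) > 0 > func(hi)):
        raise ValueError("need func(lo) > 0 > func(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        value = func(mid)
        if value == 0:
            return mid
        if value > 0:
            lo = mid  # keep the positive value on the left
        else:
            hi = mid  # keep the negative value on the right
    return 0.5 * (lo + hi)


root = bisect_sign_change(excess_toy, 0.0, 1.0)
print(round(root, 6), abs(excess_toy(root)) < 1e-8)
```

In this toy the first midpoint lands exactly on the jump, where the function is positive, and the search still ends at a genuine zero to the right of it.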
Figure 5: $\exists L_1 : \Omega(L_1) = 0$
[Figure: $\Omega(L_1)$ plotted against $L_1$ for $A = 5$ and $A = 7$.]
Note: $z \sim U(0, 1)$, $s = .5$, $\theta = .25$, $\gamma = .5$, $m(M_c) = \frac{\exp(30M_c) - 1}{\exp(30M_c)}$, $L = 2$
In summary, the existence of an equilibrium with two heterogeneous cities requires that congestion costs are sufficiently strong that not everyone locates in a single city in equilibrium and that the potential productivity gains from idea exchanges are sufficiently high that everyone in the larger city spends time learning in equilibrium. We have formalized sufficient conditions for these outcomes: $\theta$ and $s$ take values such that $\frac{\theta}{1-s}L^{\gamma} > \tilde{z}(z_m, Z_c(z_m, 1)) - z_m$, and $A$ is sufficiently high and $m(\cdot)$ approaches one sufficiently quickly that $\exists Z_2 > 0$ satisfying equations (16) through (19) for all $L_1 : 0 < L_1 < \frac{1}{2}L$.
A.5 Skill premia in a two-city equilibrium
A.5.1 Pareto distribution
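In an asymmetric two-city equilibrium with $z$ distributed Pareto, $\mu(z) = \frac{k b^k}{z^{k+1}}$ with $k > 1$, the skill premium in the larger city is higher when
\[
\frac{\;\dfrac{\int_{z_b}^{\infty} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\infty} \mu(z)\,dz}\;}{p_{s,2}}
>
\frac{\;\dfrac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz}\;}{p_{s,1}}
\iff
\left. \frac{\int_{z_b}^{\infty} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\infty} \mu(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
> \frac{p_{s,2}}{p_{s,1}}
\]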
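We now show that this condition is always true. In steps:
1. Because $\tilde{z}_2(z)$ is increasing in $z$, for any $\hat{z} \geq z_b$ the following inequality holds:
\[
\left. \frac{\int_{z_b}^{\hat{z}} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\hat{z}} \mu(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
<
\left. \frac{\int_{z_b}^{\infty} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\infty} \mu(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
\]
2. Define a change of variables by $f(z) = \left(z_b^{-k} + z^{-k} - z_m^{-k}\right)^{-\frac{1}{k}}$ and $\hat{z} = \left(2z_b^{-k} - z_m^{-k}\right)^{-\frac{1}{k}}$ such that $\int_{z_b}^{\hat{z}} \tilde{z}_2(z)\mu(z)\,dz = \int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(f(z))f'(z)\,dz$. By construction $\mu(z) = \mu(f(z))f'(z)$.
3. $\frac{\tilde{z}_2(f(z_m))}{\tilde{z}_1(z_m)} = \frac{\tilde{z}_2(z_b)}{p_{s,1}} > \frac{p_{s,2}}{p_{s,1}}$ because $\tilde{z}_2(z_b) > p_{s,2}$.
4. $\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}$ is increasing in $z$, so $\tilde{z}_2(f(z)) > \frac{p_{s,2}}{p_{s,1}}\tilde{z}_1(z)$ $\forall z \in (z_m, z_b)$.
5. Multiplying by $\mu(z)$ and integrating yields $\int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(z)\,dz > \frac{p_{s,2}}{p_{s,1}} \int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz$. Thus $\frac{\int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(z)\,dz}{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz} > \frac{p_{s,2}}{p_{s,1}}$ and
\[
\frac{\int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(z)\,dz}{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}
=
\left. \frac{\int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(f(z))f'(z)\,dz}{\int_{z_m}^{z_b} \mu(f(z))f'(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
> \frac{p_{s,2}}{p_{s,1}}
\]
6. Therefore,
\[
\left. \frac{\int_{z_b}^{\infty} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\infty} \mu(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
>
\left. \frac{\int_{z_b}^{\hat{z}} \tilde{z}_2(z)\mu(z)\,dz}{\int_{z_b}^{\hat{z}} \mu(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
=
\left. \frac{\int_{z_m}^{z_b} \tilde{z}_2(f(z))\mu(f(z))f'(z)\,dz}{\int_{z_m}^{z_b} \mu(f(z))f'(z)\,dz} \middle/ \frac{\int_{z_m}^{z_b} \tilde{z}_1(z)\mu(z)\,dz}{\int_{z_m}^{z_b} \mu(z)\,dz} \right.
> \frac{p_{s,2}}{p_{s,1}}
\]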
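Only the fourth step ($\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}$ is increasing in $z$) requires further elaboration.
\[
\frac{d}{dz}\left(\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}\right) = \frac{\tilde{z}_1(z)\tilde{z}_2'(f(z))f'(z) - \tilde{z}_2(f(z))\tilde{z}_1'(z)}{\tilde{z}_1(z)^2}
\]
\[
\tilde{z}_c'(z) = \frac{1}{2}\left(AZ_c z + 1\right)
\]
\[
\tilde{z}_c(z) = \frac{1}{AZ_c}\left(\tilde{z}_c'(z)\right)^2
\]
\[
\frac{d}{dz}\left(\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}\right) = \frac{1}{A\tilde{z}_1(z)^2}\left(\tilde{z}_2'(f(z))\tilde{z}_1'(z)\right)\left(\frac{\tilde{z}_1'(z)}{Z_1}f'(z) - \frac{\tilde{z}_2'(f(z))}{Z_2}\right)
\]
\[
= \frac{1}{2A\tilde{z}_1(z)^2}\left(\tilde{z}_2'(f(z))\tilde{z}_1'(z)\right)\Bigg(\underbrace{\frac{f'(z)}{Z_1} - \frac{1}{Z_2}}_{>0} + A\underbrace{\left(f'(z)z - f(z)\right)}_{>0}\Bigg)
\]
Those inequalities are true because
\[
f(z) = \left(z_b^{-k} + z^{-k} - z_m^{-k}\right)^{-\frac{1}{k}}
\]
\[
f'(z) = \left(z_b^{-k} + z^{-k} - z_m^{-k}\right)^{\frac{-1-k}{k}} z^{-1-k}
= \left(1 + \frac{z^{-k} - z_m^{-k}}{z_b^{-k}}\right)^{\frac{-1-k}{k}}\left(\frac{z_b}{z}\right)^{k+1} > 1
\]
\[
f'(z)z - f(z) = f(z)\left(\frac{z_m^{-k} - z_b^{-k}}{z_b^{-k} - z_m^{-k} + z^{-k}}\right) > 0
\]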
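The three properties of the Pareto change of variables used in this subsection (the density identity, a slope above one, and a positive wedge between the slope times $z$ and the level) can be checked numerically. The sketch below is a minimal illustration: the values of `k`, `b`, `z_m`, and `z_b` are hypothetical, chosen only so that the mapped interval is finite, and are not taken from the paper.

```python
def pareto_density(z, k, b):
    """Pareto density mu(z) = k b^k / z^(k+1)."""
    return k * b**k / z ** (k + 1)


def f(z, k, z_m, z_b):
    """Change of variables mapping [z_m, z_b] into the larger city's interval."""
    return (z_b ** (-k) + z ** (-k) - z_m ** (-k)) ** (-1.0 / k)


def f_prime(z, k, z_m, z_b):
    """Derivative of f with respect to z."""
    base = z_b ** (-k) + z ** (-k) - z_m ** (-k)
    return base ** ((-1.0 - k) / k) * z ** (-1.0 - k)


# Hypothetical values chosen so that 2 * z_b**(-k) - z_m**(-k) > 0.
k, b, z_m, z_b = 2.0, 1.0, 1.0, 1.2
grid = [z_m + i * (z_b - z_m) / 50 for i in range(51)]

# Largest violation of mu(z) = mu(f(z)) f'(z) on the grid.
density_gap = max(
    abs(pareto_density(f(z, k, z_m, z_b), k, b) * f_prime(z, k, z_m, z_b)
        - pareto_density(z, k, b))
    for z in grid
)
min_slope = min(f_prime(z, k, z_m, z_b) for z in grid)
min_wedge = min(f_prime(z, k, z_m, z_b) * z - f(z, k, z_m, z_b) for z in grid)
print(density_gap < 1e-12, min_slope > 1.0, min_wedge > 0.0)
```

A grid check of this kind does not replace the algebra; it only guards against sign or exponent slips when the expressions are transcribed.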
A.5.2 Uniform distribution
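In an asymmetric two-city equilibrium with $z \sim U(\underline{z}, \bar{z})$, the skill premium in the larger city is higher when
\[
\frac{\;\dfrac{\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\frac{1}{\bar{z}-\underline{z}}\,dz}{\int_{z_b}^{\bar{z}} \frac{1}{\bar{z}-\underline{z}}\,dz}\;}{p_{s,2}}
>
\frac{\;\dfrac{\int_{z_m}^{z_b} \tilde{z}_1(z)\frac{1}{\bar{z}-\underline{z}}\,dz}{\int_{z_m}^{z_b} \frac{1}{\bar{z}-\underline{z}}\,dz}\;}{p_{s,1}}
\iff
\frac{z_b - z_m}{\bar{z} - z_b}\,\frac{\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\,dz}{\int_{z_m}^{z_b} \tilde{z}_1(z)\,dz} > \frac{p_{s,2}}{p_{s,1}}
\]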
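A sufficient condition for this to be true in equilibrium is $\bar{z}z_m > z_b^2$. In steps:
1. By change of variable, $\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\,dz = \int_{z_m}^{z_b} \tilde{z}_2(f(z))f'(z)\,dz$, where $f(z) = z_b + \frac{\bar{z} - z_b}{z_b - z_m}(z - z_m)$. Therefore $\int_{z_m}^{z_b} \tilde{z}_2(f(z))\,dz = \frac{1}{f'(z)}\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\,dz = \frac{z_b - z_m}{\bar{z} - z_b}\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\,dz$.
2. $\frac{\tilde{z}_2(f(z_m))}{\tilde{z}_1(z_m)} = \frac{\tilde{z}_2(z_b)}{p_{s,1}} > \frac{p_{s,2}}{p_{s,1}}$
3. If $\bar{z}z_m > z_b^2$, then $\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}$ is increasing in $z$, so $\tilde{z}_2(f(z)) > \frac{p_{s,2}}{p_{s,1}}\tilde{z}_1(z)$ $\forall z \in (z_m, z_b)$.
4. Integrating, $\int_{z_m}^{z_b} \tilde{z}_2(f(z))\,dz > \frac{p_{s,2}}{p_{s,1}}\int_{z_m}^{z_b} \tilde{z}_1(z)\,dz$
5. Therefore, $\frac{\int_{z_m}^{z_b} \tilde{z}_2(f(z))\,dz}{\int_{z_m}^{z_b} \tilde{z}_1(z)\,dz} = \frac{z_b - z_m}{\bar{z} - z_b}\,\frac{\int_{z_b}^{\bar{z}} \tilde{z}_2(z)\,dz}{\int_{z_m}^{z_b} \tilde{z}_1(z)\,dz} > \frac{p_{s,2}}{p_{s,1}}$. The skill premium is higher in the larger city.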
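$\bar{z}z_m > z_b^2$ is sufficient for the third step because
\[
\frac{d}{dz}\left(\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}\right) = \frac{1}{2A\tilde{z}_1(z)^2}\left(\tilde{z}_2'(f(z))\tilde{z}_1'(z)\right)\Bigg(\underbrace{\frac{f'(z)}{Z_1} - \frac{1}{Z_2}}_{>0} + A\left(f'(z)z - f(z)\right)\Bigg)
\]
\[
f'(z) = \frac{\bar{z} - z_b}{z_b - z_m} > 1
\]
\[
\bar{z}z_m > z_b^2 \Rightarrow f'(z)z - f(z) > 0 \Rightarrow \frac{d}{dz}\left(\frac{\tilde{z}_2(f(z))}{\tilde{z}_1(z)}\right) > 0
\]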
$\bar{z}z_m > z_b^2$ is far from necessary. In fact, it fails when $z_b$ is relatively large, which means that the two cities are relatively similar in size. But this similarity in size causes a similarity in housing prices, which diminishes the compensation effect relative to the compositional and learning effects. We have not found a set of parameter values yielding a two-city equilibrium in which the skill premium is lower in the larger city.
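The monotonicity used in the third step of the uniform case can be checked on a grid. The sketch below is a minimal illustration under stated assumptions: it uses the closed form $(AZ_c z + 1)^2 / (4AZ_c)$ implied by the two expressions for tradables output and its derivative in the Pareto subsection, and the values of `Z1`, `Z2`, `z_m`, `z_b`, and `z_bar` are hypothetical rather than equilibrium values from the paper.

```python
A = 6.0
Z1, Z2 = 0.5, 0.8                 # hypothetical idea-exchange levels, Z2 > Z1
z_m, z_b, z_bar = 0.5, 0.7, 1.0   # hypothetical cutoffs with z_bar * z_m > z_b**2


def output(z, Z):
    """Tradables output implied by z'(z) = (A Z z + 1) / 2 and z~ = z'^2 / (A Z)."""
    return (A * Z * z + 1.0) ** 2 / (4.0 * A * Z)


slope = (z_bar - z_b) / (z_b - z_m)   # f'(z), constant for the uniform case


def f(z):
    """Linear change of variables mapping [z_m, z_b] onto [z_b, z_bar]."""
    return z_b + slope * (z - z_m)


grid = [z_m + i * (z_b - z_m) / 100 for i in range(101)]
ratios = [output(f(z), Z2) / output(z, Z1) for z in grid]

condition_holds = z_bar * z_m > z_b**2
wedge_positive = slope * z_m - z_b > 0    # f'(z) z - f(z) does not depend on z
ratio_increasing = all(later > earlier for earlier, later in zip(ratios, ratios[1:]))
print(condition_holds, wedge_positive, ratio_increasing)
```

The chosen values keep $AZ_1 z_m$ above one, so the closed form applies over the whole grid.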
A.6 Migration and distance
Here we characterize migration flows for a special case of the outsourcing model and show that they imply that the average migration of non-tradables producers will be shorter than that of tradables producers. Suppose that there are $n$ cities in equilibrium, with $n_s$ “skilled cities” outsourcing their assembly activities to $n_u$ “unskilled cities”, such that $n_s + n_u = n$. Denote the set of skilled cities by $C_s$ and the set of unskilled cities by $C_u$. We denote gross migration flows of the unskilled from city $c$ to $c'$ by $x_{c,c'}$ and gross migration flows of the skilled by $y_{c,c'}$. The cost of migrating from $c$ to $c'$ is arbitrarily small and proportionate to the distance between the cities, $d(c, c') = d(c', c)$.
Denote the lowest ability tradables producers in the skilled cities by $z_{b,1}$. With arbitrarily small migration costs, newborn tradables producers of ability $z \geq z_{b,1}$ whose ability lies outside the skill interval of their birthplace migrate to their unique destination. Tradables producers of ability $z_m \leq z \leq z_{b,1}$ born in skilled cities migrate to the unskilled cities in order to support the steady-state population levels while minimizing migration costs. Some workers who do not produce tradables migrate from skilled cities to unskilled cities in order to support the steady-state population levels while minimizing migration costs.
If the bilateral distances between cities are orthogonal to their population characteristics and $n_u > 1$, then the expected migratory distance of tradables producers ($z \geq z_m$) exceeds the expected migratory distance of unskilled workers ($z \leq z_m$). Gross migratory flows of the unskilled are arranged so as to minimize migration costs, while only a fraction $\frac{\int_{z_m}^{z_{b,1}} \mu(z)\,dz}{\int_{z_m}^{\infty} \mu(z)\,dz}$ of gross flows of tradables producers are arranged to minimize migration costs.
By optimal choices of outsourcing destinations, unskilled cities exhibit identical prices and total population. Suppose that they also have identical ratios of tradables producer population to total population, $\frac{L\int_{z_m}^{\infty} \mu(z,c)\,dz}{L_c}$. The gross migratory flows of unskilled workers and tradables producers of ability $z_m \leq z \leq z_{b,1}$ solve
min{xc,c"}
&
c!Cu
&
c"!Cs
xc",cd(c%, c) subject to&
c"!Cs
xc",c =1
nu
&
c"!Cs
Lc"%&l %c
min{yc,c"}
&
c!Cu
&
c"!Cs
yc",cd(c%, c) subject to&
c"!Cs
yc",c =1
nu
&
c"!Cs
Lc"%&
! zb,1
zm
µ(z)dz %c
Denote the optimal solutions $x^*$ and $y^*$. Due to linearity, the optimal solutions are proportionate to each other. Denote the fraction $w_c = \frac{\int_{z_{b,1}}^{\infty} \mu(z,c)\,dz}{\int_{z_{b,1}}^{\infty} \mu(z)\,dz}$.
The average distance migrated by unskilled individuals to city c is
&
c"!Cs
nux)c",c5
c""!CsLc""%&l
d(c%, c)
The average distance migrated by unskilled individuals is
&
c!Cu
1
nu
&
c"!Cs
nux)c",c5
c""!CsLc""%&l
d(c%, c)
The average distance migrated by skilled individuals to city c & Cu is
&
c"!Cs
nuy)c",c5
c""!CsLc""%&
" zb,1
zmµ(z)dz
d(c%, c)
The average distance migrated by skilled individuals is
" zb,1
zmµ(z, c)dz
" 'zm
µ(z)dz
&
c!Cu
1
nu
&
c"!Cs
nuy)c",c5
c""!CsLc""%&
" zb,1
zmµ(z)dz
d(c%, c) +
" 'zb,1
µ(z, c)dz" '
zmµ(z)dz
&
c
&
c"
wcwc"d(c%, c)
If the bilateral distances between cities are orthogonal to their other characteristics, then by the optimality of $x^*$ the following inequality holds:
&
c!Cu
&
c"!Cs
x)c",c5
c""!CsLc""%&l
d(c%, c) )&
c
&
c"
wcwc"d(c%, c)
Then, because $x^*$ is proportionate to $y^*$,
" zb,1
zmµ(z, c)dz
" 'zm
µ(z)dz
&
c!Cu
&
c"!Cs
y)c",c5
c""!CsLc""%&
" zb,1
zmµ(z)dz
d(c%, c) +
" 'zb,1
µ(z, c)dz" '
zmµ(z)dz
&
c
&
c"
wcwc"d(c%, c) $
&
c!Cu
&
c"!Cs
x)c",c5
c""!CsLc""%&l
d(c%, c)
The expected average distance migrated by a skilled individual is greater than the expected
average distance migrated by an unskilled individual.
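The logic of this comparison can be sketched with a toy example. Everything in the block is hypothetical: the distance matrix, the equal origin weights, and the share of skilled migration that is cost-minimizing are invented numbers, and the sketch imposes only an equal-inflow requirement on each unskilled city with no limit on how much any skilled city can send. Under those assumptions the cost-minimizing flow into each destination comes from its nearest origin, which can never be longer on average than a flow spread across origins in fixed proportions.

```python
# distances[s][u]: hypothetical distance from skilled city s to unskilled city u
distances = [
    [10.0, 40.0, 25.0],
    [30.0, 15.0, 35.0],
]
origin_weights = [0.5, 0.5]   # hypothetical shares used for the proportional flows


def cost_minimizing_average(dist):
    """Average distance when every destination draws only from its nearest origin."""
    n_dest = len(dist[0])
    return sum(min(row[u] for row in dist) for u in range(n_dest)) / n_dest


def proportional_average(dist, weights):
    """Average distance when every destination draws from origins in fixed shares."""
    n_dest = len(dist[0])
    total = sum(w * row[u] for w, row in zip(weights, dist) for u in range(n_dest))
    return total / n_dest


unskilled_avg = cost_minimizing_average(distances)
mixing_avg = proportional_average(distances, origin_weights)

# Hypothetical share of skilled migration that is arranged to minimize costs;
# the remainder is pinned down by ability rather than by distance.
share_minimizing = 0.6
skilled_avg = share_minimizing * unskilled_avg + (1 - share_minimizing) * mixing_avg
print(round(unskilled_avg, 3), round(mixing_avg, 3), round(skilled_avg, 3))
```

Because the skilled average is a weighted mean of the cost-minimizing average and a weakly larger number, it cannot fall below the unskilled average.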
B Parameterization
Parameterizing the model means picking a function $m(\cdot)$, a distribution $\mu(z)$, and values for $A$, $s$, $\theta$, $\gamma$, and $L$. In the parameterizations we present in this paper, we use $m(M_c) = \frac{\exp(\nu M_c) - 1}{\exp(\nu M_c)}$ with $\nu = 30$. We choose $\mu(z) = 1$ when $0 \leq z \leq 1$, so that $z \sim U(0, 1)$. There is no assembly or outsourcing ($l = 0$), and we do not address life-cycle migration.
To produce the two-city wage schedule in Figure 2, we chose $A = 6$, $s = .5$, $\theta = .25$, $\gamma = .5$, $L = 2$. To produce the four-city wage schedule in Figure 3, we chose $A = 6$, $s = .5$, $\theta = .3$, $\gamma = .3$, $L = 4$.
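The functional form chosen for $m(\cdot)$ rises from zero toward one very quickly when its curvature parameter equals 30. The sketch below simply evaluates that form; the argument name `rate` is a label chosen here for the curvature parameter, and the evaluation points are arbitrary.

```python
import math


def m(total_learning_time, rate=30.0):
    """(exp(rate * M) - 1) / exp(rate * M), which equals 1 - exp(-rate * M)."""
    return (math.exp(rate * total_learning_time) - 1.0) / math.exp(rate * total_learning_time)


for value in (0.0, 0.05, 0.1, 0.2):
    print(value, round(m(value), 4))
```

Even a small amount of aggregate learning time therefore puts the function close to its upper bound of one.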
C Data and estimates
C.1 Data description
Data sources: Our population data are from the US Census website (1990, 2000, 2007). Our
data on individuals’ wages, education, demographics, and housing costs come from public-
use samples of the decennial US Census and the annual American Community Survey made
available by IPUMS-USA (Ruggles, Alexander, Genadek, Goeken, Schroeder, and Sobek,
2010). We use the 1990 5% and 2000 5% Census samples and the 2005-2007 American
Community Survey 3-year sample. We use the 2005-2007 ACS data because ACS data from
2008 onwards only report weeks worked in intervals.
Wages: We exclude observations missing the age, education, or wage income variables.
We study individuals who report their highest educational attainment as a high-school
diploma or GED or a bachelor’s degree and are between ages 25 and 55. We study full-
time, full-year employees, defined as individuals who work at least 40 weeks during the year
and usually work at least 35 hours per week. We obtain weekly and hourly wages by dividing salary and wage income by weeks worked during the year and by weeks worked times usual hours per week, respectively. Following Acemoglu and Autor (2011), we exclude observations reporting
an hourly wage below $1.675 per hour in 1982 dollars, using the GDP PCE deflator. We
define potential work experience as age minus 18 for high-school graduates and age minus
22 for individuals with a bachelor’s degree. We weight observations by the “person weight”
variable provided by IPUMS.
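The wage-sample restrictions above can be sketched as a single filter. This is a minimal illustration, not the authors' code: the dictionary keys, the education labels, and the `deflator_to_1982` argument (a factor converting nominal dollars into 1982 dollars) are hypothetical names introduced here, not IPUMS variable names.

```python
def prepare_wage_record(person, deflator_to_1982):
    """Apply the sample restrictions to one person record.

    `person` is a dict with hypothetical keys; returns None if the record is excluded.
    """
    required = ("age", "education", "wage_income")
    if any(person.get(key) is None for key in required):
        return None
    if person["education"] not in ("high_school", "bachelors"):
        return None
    if not 25 <= person["age"] <= 55:
        return None
    weeks, hours = person.get("weeks_worked"), person.get("usual_hours")
    # Full-time, full-year: at least 40 weeks and at least 35 usual hours per week.
    if weeks is None or hours is None or weeks < 40 or hours < 35:
        return None
    weekly = person["wage_income"] / weeks
    hourly = person["wage_income"] / (weeks * hours)
    # Drop hourly wages below $1.675 in 1982 dollars.
    if hourly * deflator_to_1982 < 1.675:
        return None
    schooling_end_age = 18 if person["education"] == "high_school" else 22
    return {
        "weekly_wage": weekly,
        "hourly_wage": hourly,
        "experience": person["age"] - schooling_end_age,
        "weight": person.get("person_weight", 1.0),
    }


example = {"age": 30, "education": "bachelors", "wage_income": 52000.0,
           "weeks_worked": 52, "usual_hours": 40, "person_weight": 100.0}
part_timer = dict(example, usual_hours=20)
kept = prepare_wage_record(example, deflator_to_1982=0.5)
dropped = prepare_wage_record(part_timer, deflator_to_1982=0.5)
print(kept, dropped)
```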
Housing: To calculate the average housing price in a metropolitan statistical area, we
use all observations in which the household pays rent for their dwelling that has two or
three bedrooms. We do not restrict the sample by any labor-market outcomes. We drop
observations that lack a kitchen or phone. We calculate the average gross monthly rent for
each metropolitan area using the “household weight” variable provided by IPUMS.
College ratio: Following Beaudry, Doms, and Lewis (2010), we define the “college ratio”
as the number of employed individuals in the MSA possessing a bachelor’s degree or higher
educational attainment plus one half the number of individuals with some college relative
to the number of employed individuals in the MSA with educational attainment less than
college plus one half the number of individuals with some college. We weight observations
by the “person weight” variable provided by IPUMS.
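The college ratio defined above splits the some-college group evenly between the numerator and the denominator. A minimal sketch, assuming each employed worker has already been classified into one of three hypothetical education labels and carries a person weight:

```python
def college_ratio(workers):
    """Weighted college ratio for one metropolitan area.

    `workers` is an iterable of (education_label, person_weight) pairs, where the
    labels 'college', 'some_college', and 'less_than_college' are hypothetical.
    """
    totals = {"college": 0.0, "some_college": 0.0, "less_than_college": 0.0}
    for education, weight in workers:
        totals[education] += weight
    numerator = totals["college"] + 0.5 * totals["some_college"]
    denominator = totals["less_than_college"] + 0.5 * totals["some_college"]
    return numerator / denominator


sample = [("college", 100.0), ("some_college", 40.0), ("less_than_college", 180.0)]
print(college_ratio(sample))
```

With these invented weights the numerator is 120 and the denominator is 200.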
Note that both income and rent observations are top-coded in IPUMS data.
Geography: We map the public-use microdata areas (PUMAs) to metropolitan statistical areas (MSAs) using the “MABLE Geocorr90, Geocorr2K, and Geocorr2010” geographic
correspondence engines from the Missouri Census Data Center. For 1990 and 2000, we con-
sider both primary metropolitan statistical areas (PMSAs) and consolidated metropolitan
statistical areas (CMSAs). The 2005-2007 geographies are defined by core-based statistical
areas (CBSAs). In some sparsely populated areas, only a fraction of a PUMA’s population
belongs to a MSA. We include PUMAs that have more than 50% of their population in a
metropolitan area. Figure 1 and Table 1 describe PMSAs in 2000.
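The majority rule for assigning PUMAs to metropolitan areas can be sketched as follows. The input layout is an assumption made for illustration: each row gives a PUMA, a metropolitan area, and the share of the PUMA's population living in that area, in the spirit of a geographic correspondence file. The sketch reads the rule as applying to the share in a specific metropolitan area.

```python
def assign_pumas_to_msas(allocation):
    """Keep each PUMA whose population share in a metropolitan area exceeds 50%.

    `allocation` is a list of (puma, msa, share_of_puma_population) rows; the
    identifiers and layout are hypothetical.
    """
    return {puma: msa for puma, msa, share in allocation if share > 0.5}


rows = [
    ("P1", "M1", 0.90),
    ("P2", "M1", 0.40),
    ("P2", "M2", 0.50),   # exactly half does not exceed the threshold
    ("P3", "M2", 0.51),
]
print(assign_pumas_to_msas(rows))
```

Since shares sum to at most one for each PUMA, at most one metropolitan area can exceed one half, so the mapping is unambiguous.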
Migration: We study individuals in the 2000 Census public-use microdata who are
born in the United States, 30 to 55 years of age, whose highest educational attainment
is a high school degree or a bachelor’s degree, and who currently live in a metropolitan
area as identified by the “metaread” IPUMS variable. We identify residence changes over
the five-year span using the “migrate5d” variable. We identify metropolitan changes by
comparing the “migmet5” and “metaread” variables for individuals who lived in an identified
metropolitan area five years earlier. We calculate distances between public-use microdata
areas using the latitude and longitude coordinates of the PUMAs’ centroids, calculated from
US Census cartographic boundary files. We assign residence changes that do not change PUMAs a distance of zero.
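One common way to turn centroid coordinates into distances is the haversine great-circle formula. The text does not state which distance formula is used, so the sketch below is an assumption, as are the function name and the mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# A move within the same PUMA uses the same centroid twice, giving zero distance.
print(great_circle_km(40.0, -74.0, 40.0, -74.0))
print(round(great_circle_km(0.0, 0.0, 0.0, 90.0), 1))
```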
C.2 Empirical estimates
Our empirical approach is to estimate cities’ college wage premia and then study spatial
variation in those premia. Our first-stage estimates of cities’ skill premia are obtained by
comparing the average log weekly wages of full-time, full-year employees whose highest edu-
cational attainment is a bachelor’s degree to those whose highest educational attainment is
a high school degree.
Our first specification uses the difference in average log weekly wages $y$ in city $c$ without any individual controls as the first-stage estimator. The dummy variable $\text{college}_i$ indicates that individual $i$ is a college graduate. Expectations are estimated by their sample analogues.
\[
\text{premium}_c = E(y_{ic} \mid \text{college}_i = 1) - E(y_{ic} \mid \text{college}_i = 0)
\]
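The sample analogue of this estimator is a difference of weighted means within each city. A minimal sketch, assuming records arrive as tuples of city, log weekly wage, college indicator, and person weight; the tuple layout and the example numbers are hypothetical.

```python
from collections import defaultdict


def unadjusted_premia(records):
    """Difference in weighted mean log weekly wages, college minus high school, by city.

    `records` is an iterable of (city, log_weekly_wage, is_college, weight) tuples.
    """
    # Per city: [college wage sum, college weight, high-school wage sum, high-school weight]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for city, log_wage, is_college, weight in records:
        cell = sums[city]
        if is_college:
            cell[0] += weight * log_wage
            cell[1] += weight
        else:
            cell[2] += weight * log_wage
            cell[3] += weight
    return {
        city: cell[0] / cell[1] - cell[2] / cell[3]
        for city, cell in sums.items()
        if cell[1] > 0 and cell[3] > 0
    }


toy_records = [
    ("A", 7.0, True, 1.0), ("A", 7.4, True, 1.0), ("A", 6.5, False, 2.0),
    ("B", 6.9, True, 1.0), ("B", 6.6, False, 1.0),
]
premia = unadjusted_premia(toy_records)
print({city: round(value, 3) for city, value in premia.items()})
```

Cities lacking either education group are dropped because the difference is undefined for them.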
Our second approach uses a first-stage Mincer regression to estimate cities’ college wage
premia after controlling for experience, sex, and race. The first-stage equation describing
variation in the log weekly wage y of individual i in city c is
yi = "Xi + )c + $ccollegei + *i
Xi is a vector containing years of potential work experience, potential experience squared, a
dummy variable for males, and dummies for white, Hispanic, and black demographics. The
estimated skill premium in each city, $c, is the dependent variable used in the second-stage
regression. We refer to these estimates as “composition-adjusted skill premia.”
One may be inclined to think that the estimators that control for individual characteristics are more informative. But if differences in demographics or experience are correlated with differences in ability, controlling for spatial variation in skill premia attributable to spatial variation in these factors removes a dimension of the data potentially explained by our model. To the degree that individuals’ observable characteristics reflect differences in their abilities, the unadjusted estimates of cities’ skill premia are more informative for comparing our model’s predictions to empirical outcomes.
Table 3 shows the correlation between estimated skill premia and population sizes for various years and geographies. These coefficients are akin to those appearing in the first column of Table 1.
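The second-stage relationship is a bivariate regression of estimated premia on log population with a constant. The sketch below computes only the OLS point estimates in plain Python; the robust standard errors reported in the paper are not computed, and the populations and premia are invented numbers.

```python
import math


def ols_slope_intercept(x, y):
    """Closed-form OLS estimates for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Hypothetical metropolitan populations and estimated skill premia.
populations = [1e5, 1e6, 1e7]
toy_premia = [0.40, 0.45, 0.50]
log_population = [math.log(p) for p in populations]
slope, intercept = ols_slope_intercept(log_population, toy_premia)
print(round(slope, 4), round(intercept, 4))
```

In this toy example a tenfold increase in population is associated with a premium that is 0.05 log points higher, so the slope equals 0.05 divided by the natural log of 10.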
Table 3: Skill premia and metropolitan populations

                        1990        1990        2000        2000        2005-7
                        PMSA        CMSA        PMSA        CMSA        CBSA
Skill premia            0.0145**    0.0133**    0.0327**    0.0285**    0.0411**
                        (0.00395)   (0.00398)   (0.00380)   (0.00367)   (0.00377)
Composition-adjusted    0.0129**    0.0128**    0.0282**    0.0244**    0.0288**
skill premia            (0.00314)   (0.00319)   (0.00326)   (0.00310)   (0.00331)
Observations            322         271         325         270         353

Robust standard errors in parentheses
** p<0.01, * p<0.05
Note: Each cell reports the coefficient and standard error from an OLS regression of the estimated college wage premia on log population (and a constant). The sample is full-time, full-year employees whose highest educational attainment is a bachelor’s degree or a high-school degree.