|
MPRC faculty associate Judy Hellerstein
(Economics), with David Neumark and colleagues, constructed new datasets based on a name and address match of long
form (one-in-six) respondents in the 1990 and 2000 Decennial Censuses to their employers. The construction of these
new matched data files, the 1990 and 2000 Decennial Employee-Employer Databases (DEEDs), is possible because on the
long form of the Census, employed respondents reported the business address of their current primary employer.
Hellerstein and her colleagues use this business address to match a worker to an employer, obtaining business
addresses of employers from the Business Register (formerly known as the Standard Statistical Establishment List, or
SSEL) for 2000, a master list at the Census Bureau of all business establishments in the U.S. The DEEDs contain,
for every matched worker, information from the long form and information on the establishment in which that
individual works. Conversely, for every included establishment, the DEEDs contain information on each of its
matched workers. The opportunity to compare the earnings and characteristics of employees working for the same
employer at the same physical establishment is one of the most distinctive features of the DEEDs.
The construction of the DEEDs was a major undertaking, using the name and address information written on the
questionnaires by long form respondents. Hellerstein worked with the Census Bureau to develop sophisticated
probabilistic matching algorithms that accounted for variant spellings of street names and variations in
addresses, and that dealt with measurement error in both data sources. This dataset is now available
through the Research Data Center network.
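To make the idea of probabilistic matching concrete, the following is a minimal sketch in the style of classic record linkage (Fellegi-Sunter weights), not the algorithm actually developed with the Census Bureau. The abbreviation table, the field names (`name`, `street`, `zip`), the agreement threshold, and the m and u probabilities are all hypothetical values chosen for illustration; in practice such parameters would be estimated from the data.

```python
import math
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real system would need a far larger one.
STREET_ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "boulevard": "blvd",
    "road": "rd", "drive": "dr", "north": "n", "south": "s",
    "east": "e", "west": "w",
}

def normalize(text):
    """Lowercase, strip punctuation, and collapse common street-name variants."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(STREET_ABBREVIATIONS.get(t, t) for t in tokens)

def similar(a, b, threshold=0.85):
    """Treat two fields as agreeing if their normalized forms are close enough."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

# Assumed (not estimated) probabilities for each field:
# m = P(fields agree | true match), u = P(fields agree | non-match).
FIELD_PARAMS = {
    "name": (0.90, 0.01),
    "street": (0.85, 0.02),
    "zip": (0.95, 0.05),
}

def match_weight(worker_record, establishment_record):
    """Sum log-likelihood-ratio weights over fields: positive when a field
    agrees, negative when it disagrees."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if similar(worker_record[field], establishment_record[field]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

def best_match(worker_record, establishments, min_weight=5.0):
    """Return the highest-weight establishment, or None if the list is empty
    or no candidate reaches min_weight (an illustrative cutoff)."""
    if not establishments:
        return None
    scored = [(match_weight(worker_record, e), e) for e in establishments]
    weight, candidate = max(scored, key=lambda pair: pair[0])
    return candidate if weight >= min_weight else None
```

In this sketch, a respondent who wrote "123 North Main Street" would still agree with a register entry of "123 N Main St" after normalization, and small misspellings are tolerated by the similarity threshold. Leaving low-weight candidates unmatched rather than forcing a link is one simple way to limit false matches when both sources contain measurement error.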
Hellerstein also made substantial progress on research associated with this project. Her paper (with David
Neumark), "Production Function and Wage Equation Estimation with Heterogeneous Labor: Evidence from a New Matched
Employer-Employee Data Set," is forthcoming in a University of Chicago Press volume. She and Neumark have also
completed (and submitted to a peer-reviewed journal) an NBER working paper (11599) entitled "Workplace Segregation
in the United States: Race, Ethnicity, and Skill." Both of these papers use the 1990 DEED. Finally, Hellerstein,
Neumark, and graduate student Melissa McInerney recently completed a paper entitled "Changes in Workplace
Segregation in the United States: Evidence from the 1990 and 2000 Employer-Employee Datasets," which uses data
from both the 1990 DEED and the beta-version of the 2000 DEED.
|
|
MPRC faculty associate John
Haltiwanger (Economics), with Julia Lane and John Abowd, is responsible for the Longitudinal Employer-Household Dynamics (LEHD) project. Like the DEEDs,
the core of this project is the matching of existing data sources to build new datasets. The LEHD program has now
expanded to 40 states and is headed towards being a national program.
|

The LEHD @ the US Census Bureau |
The special feature of the LEHD is the matching of comprehensive state
administrative data on employers and employees with the business and economic censuses. Match rates are extremely
high (more than 97 percent), and with virtually universal coverage of workers and firms, the entire distribution of an
employer's workforce can be measured and analyzed. This dataset will allow researchers to study human capital,
immigration, and low-wage labor markets in much more detail (and at much lower levels of geographic aggregation) than
previously possible.
The LEHD has now been used in dozens of papers. However, its potential to inform demographic research is only now
being exploited. MPRC faculty associate John
Haltiwanger (Economics), MPRC Director
Seth Sanders (Economics), and Fredrik Andersen
of the U.S. Census Bureau use the LEHD data to examine immigration in the U.S., including exploring the adjustments
that firms make to absorb immigrants. As discussed above, for this work, and likely for other demographic work, the
LEHD data offer three distinctive advantages. First, they contain an extremely large earnings panel, large enough to
conduct analyses on groups that are a small fraction of the population. Second, they allow observations on the firms
in which workers are employed and include panel data on these firms. Third, they are linked to demographic surveys,
which allows users to conduct analyses by demographic subgroup.
|
|