This repository contains the code of the 2016 JAMA paper: The Association between Income and Life Expectancy in the United States, 2001 - 2014, by Raj Chetty, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Nicholas Turner, Augustin Bergeron, and David Cutler. The Journal of the American Medical Association (2016), Vol 315, No. 14.
Frina Lin and Jeremy Majerovitz provided outstanding research assistance and contributed to developing this code.
For more information about the results of this project and to download the data and results, see www.healthinequality.org. For more information on how to use this code to replicate the results, read on.
All of the files in this repository (with the exception of code/readme.md and those contained under code/ado_ssc and code/ado_maptile_geo) are released to the public domain under a CC0 license to permit the widest possible reuse. If you use our code or data, we ask that you cite our 2016 JAMA paper.
The coding style guide contained in code/readme.md is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. The files contained in the code/ado_ssc folder were obtained from Stata's SSC software repository, and are subject to their own respective licenses. The files contained in the code/ado_maptile_geo folder were obtained from the maptile geography template website, and are subject to their own respective licenses.
Our code consists of three pipelines, which connect together to process our raw data into the tables, figures and numbers published in our paper. This process can be visualized from start to finish as:
raw_data ---mortality_init, mortality_datagen---> derived_data ---mortality---> scratch,results
Each pipeline is run using the -project- command in Stata, which is a build automation tool. In the root folder of our repository, there are three do-files:
mortality_init.do is the initial data generation pipeline. It processes raw mortality rates that we are not authorized to post publicly into more aggregated data that we can post publicly.
mortality_datagen.do is the data generation pipeline: it processes raw data into derived data.
mortality.do is the analysis pipeline: it processes raw data and derived data (mostly derived data) into results.
There is also a mortality_tests.do pipeline that runs some unit tests, which verify that our code is working as expected. It does not need to be run to replicate our results.
The code is written in Stata, and has been tested in Stata versions 13 and 14.
If you are not a git user, you can download a copy of the code by clicking the green Clone or download button above and selecting Download ZIP. Unzip the file wherever you'd like on your computer. The folder you create will be referred to below as the root folder of the mortality project.
If you are a git user, you can clone this repository using your favorite git client.
First download the data-only ZIP file (1 GB). This contains all the data that you need to run the mortality_datagen.do (data generation) and mortality.do (analysis) pipelines. You'll need to run each of them to produce the results.
If you wish, you may additionally download the derived results ZIP file (2 GB). This contains all of the derived data and results generated when the code is run to completion. Combined with the data-only ZIP file, you will be able to run any individual code file independently without running the entire pipeline (since all the intermediate files are included). You can also use this file to compare our original results with your replication output.
Each ZIP file you download should be unzipped in the root folder of the mortality project. After unzipping, that folder will contain a code subfolder and a data subfolder (and possibly others).
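As a rough illustration of comparing original results with replication output, the sketch below uses Stata's built-in cf command to compare one replicated dataset against its original counterpart. Both file paths are hypothetical placeholders, and the sketch assumes you have kept a second copy of the derived results ZIP contents somewhere outside the root folder so the original files are not overwritten.

```stata
* Sketch only: compare one replicated dataset with the original copy.
* Both paths are hypothetical placeholders; substitute real .dta files.
local replicated "$mortality_root/data/<path_to_replicated_file>.dta"
local original   "<path_to_original_results>/<path_to_original_file>.dta"

* Load the replicated file, then compare every variable with the original.
* -cf- compares observation by observation, so both files must have the
* same number of observations in the same order. It exits with an error
* if any variable differs.
use "`replicated'", clear
cf _all using "`original'", verbose
```

Small numerical differences can show up as mismatches, so a failed comparison is a prompt to inspect the variables listed, not proof that the replication failed.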
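To configure Stata:
Open Stata and run doedit "`c(sysdir_personal)'/profile.do". This will open a do-file editor to the profile.do file in your PERSONAL folder.
To see where your PERSONAL folder is located, run personal in Stata.
Add the line global mortality_root "<path_to_mortality_root>" to profile.do, replacing <path_to_mortality_root> with the path to the root folder of the mortality project. Save the do-file.
Restart Stata so that profile.do is executed. If you then run ls "$mortality_root/", you should see code and data among the printed results. If you do, then everything is set up correctly.

To run the pipelines:
Run ssc install project in Stata to install the -project- command.
Run project, setup in Stata, then navigate to mortality.do in the root folder. Decide whether you want plain-text or SMCL log files, then hit OK.
Run project, setup again, and choose mortality_datagen.do this time.
Run project mortality_datagen, build to build the data generation pipeline.
Run global reps 1000.
Run project mortality, build to build the analysis pipeline.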
The first time you run a pipeline, it will take some time to generate everything. On subsequent runs, it will only update files whose code or dependencies have changed.
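The configuration and build steps above can be sketched as a single do-file. This is a convenience sketch, not part of the replication code: it assumes mortality_root has been defined in profile.do and that project, setup has already been run interactively for both mortality.do and mortality_datagen.do. The scalar name dir_found is introduced here purely for illustration.

```stata
* Sketch only: check the setup, then build the two public pipelines.
* Assumes profile.do defines the global mortality_root and that
* -project, setup- has already been run for both do-files.

* Stop early if the global is missing.
if "$mortality_root" == "" {
    display as error "mortality_root is not defined; check profile.do"
    exit 198
}

* Confirm that the code and data subfolders exist.
* Mata's direxists() returns 1 if the directory exists and 0 otherwise.
foreach sub in code data {
    mata: st_numscalar("dir_found", direxists(st_global("mortality_root") + "/`sub'"))
    if scalar(dir_found) == 0 {
        display as error "`sub' folder not found under $mortality_root"
        exit 601
    }
}

* Build in order: data generation first, then analysis.
project mortality_datagen, build
project mortality, build
```

Running the data generation pipeline first matters because the analysis pipeline reads the derived data it produces.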
The code for the mortality_init.do and mortality_tests.do pipelines is included in the replication files for completeness, but you will not be able to run those pipelines. The data files that they process cannot be posted publicly, so they are not included in the data download ZIP files.
In the mortality_init.do pipeline, we use a Julia program to quickly calculate Gompertz parameters from mortality rates.
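For readers unfamiliar with the term, a Gompertz model treats log mortality as linear in age, so the two Gompertz parameters are an intercept and a slope. The Stata sketch below illustrates the idea on synthetic data using ordinary least squares. It is not the estimator implemented in code/ado/estimate_gompertz.jl; the age range, the "true" parameter values, and the variable names are all assumptions chosen for illustration.

```stata
* Illustrative sketch only: fit a Gompertz curve to synthetic mortality
* rates by OLS on log mortality. NOT the method used in estimate_gompertz.jl.
clear
set seed 12345
set obs 37
generate age = 39 + _n                      // ages 40 through 76 (assumed range)

* Synthetic mortality rates from hypothetical parameters
* (intercept -9, slope 0.09) plus a little noise.
generate mortrate = exp(-9 + 0.09*age + rnormal(0, 0.02))
generate log_mortrate = ln(mortrate)

* Gompertz relationship: log m(age) = intercept + slope * age
regress log_mortrate age
display "Gompertz intercept: " %6.3f _b[_cons]
display "Gompertz slope:     " %6.3f _b[age]
```

With this little noise, the estimated intercept and slope should land close to the hypothetical values used to generate the data.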
There is no need to install or configure Julia on your computer in order to replicate our results. The data for the mortality_init.do pipeline cannot be posted publicly, so that data is not included in our data ZIP files. And the mortality_datagen.do and mortality.do pipelines, which you can run by following the instructions above, do not use any Julia code.
Nevertheless, in the name of complete documentation of our data processing, we are including instructions on how to configure Julia to run the mortality_init.do pipeline code.
The Julia code for Gompertz estimation under code/ado/estimate_gompertz.jl has been tested and run in Julia version 0.6.0 with the GLM, DataFrames, and ArgParse packages.
The Julia website has instructions for installing the command line version. If you are using a Mac and have installed Homebrew, you can install Julia fairly easily by opening a terminal and running brew cask install julia (recent Homebrew versions use brew install --cask julia instead).
You will also need to install the required Julia packages. To do so, open Julia in the terminal by running julia. In the Julia terminal, run:
Pkg.add("GLM")
Pkg.add("DataFrames")
Pkg.add("ArgParse")
By default, the Gompertz estimation will be performed (very slowly) in Stata. To switch to using Julia for Gompertz estimation, first find out the path to the Julia executable on your computer by opening a terminal and running which julia. Then create a file at code/set_julia.do with the following contents, replacing <PATH_TO_JULIA> with the actual path on your system:
global juliapath <PATH_TO_JULIA>
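As an example, a filled-in code/set_julia.do might look like the sketch below. The path shown is hypothetical; use whatever which julia prints on your system. The check after the global is an optional addition for illustration and is not part of the original file.

```stata
* code/set_julia.do (example contents; the path below is hypothetical)
global juliapath "/usr/local/bin/julia"

* Optional check, not in the original file: warn if nothing exists at that path.
capture confirm file "$juliapath"
if _rc != 0 {
    display as error "No file found at $juliapath; re-run which julia in a terminal"
}
```

The quotes around the path guard against spaces in folder names; Stata stores the same value in juliapath with or without them.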