Data Resources – Hopkins Housing & Health Collaborative
New Insights into Connection Between Housing Quality and Population Health
Read the latest blog article from Johns Hopkins Nursing Magazine.
Low-Income Housing Tax Credit (LIHTC) Building Data
Data on properties funded by the Low-Income Housing Tax Credit (LIHTC) are available online from HUD (https://www.huduser.gov/lihtc/), most recently updated through 2023.
HUD maintains a “properties” data file, which includes the address of one building in each facility (typically the main building or an administrative office) along with several facility-level characteristics, such as year placed in service and facility size. HUD also maintains a “buildings” data file, which contains only a property ID (“hud_id”) that links back to the properties data file and every building-level address affiliated with a given property; a single property can have many such addresses.
The addresses in each data file require significant cleaning. Available below are two .R files that can be used to clean these data using the methods described in Gensheimer et al. (2022), according to the instructions provided. The cleaned 2023 data are also available for download below (“LIHTC_clean.csv”).
This code is open source. If you use it in future projects, please cite Kaplan et al. (2026) and Gensheimer et al. (2022).
LIHTC Building Data Files

Instructions for cleaning LIHTC data:
- Download LIHTC data from the HUD website (https://www.huduser.gov/lihtc/).
- Run ‘01_LIHTC_cleaner_functions.R’.
- Run ‘02_LIHTC_cleaner.R’, filling in relevant file paths. The primary output is `LIHTC_clean.csv`, which has a row for each building address, linked to the associated ID code for the LIHTC development. (In other words, each LIHTC development may have multiple different addresses representing different buildings belonging to the same overall project, each of which receives its own row in `LIHTC_clean.csv`.)
*Please note: the full set of variables available from HUD in the property file is not included in `LIHTC_clean.csv`, but these variables can be readily linked using `hud_id`.
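As an illustration of the `hud_id` link described above, here is a minimal Python sketch that attaches property-level variables to each building row. It is not part of the R cleaning scripts. The sample rows and every column name other than `hud_id` (`address`, `year_placed_in_service`, `n_units`) are hypothetical stand-ins, so substitute the actual column names from your downloaded files.

```python
import csv
import io

# Tiny made-up stand-ins for the two files. Only `hud_id` is a column name
# taken from the text; the other column names and all values are hypothetical.
CLEAN_CSV = """hud_id,address
A001,100 Main St
A001,102 Main St
B002,5 Oak Ave
"""

PROPERTY_CSV = """hud_id,year_placed_in_service,n_units
A001,1998,40
B002,2005,12
"""


def link_buildings_to_properties(clean_file, property_file):
    """Attach property-level fields to every building row via hud_id."""
    # Index the property file by hud_id (one row per property).
    properties = {row["hud_id"]: row for row in csv.DictReader(property_file)}
    linked = []
    for building in csv.DictReader(clean_file):
        # Left join: keep the building row even if no property record matches.
        prop = properties.get(building["hud_id"], {})
        extra = {key: value for key, value in prop.items() if key != "hud_id"}
        linked.append({**building, **extra})
    return linked


# With real data, pass open file handles instead of io.StringIO objects.
rows = link_buildings_to_properties(io.StringIO(CLEAN_CSV), io.StringIO(PROPERTY_CSV))
for row in rows:
    print(row)
```

Because one property can have many building addresses, the property-level values repeat on every building row that shares a `hud_id`. Keep this in mind when summing fields such as unit counts.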
Housing Quality Metric
Researchers, policymakers, and practitioners have lacked small-area data to understand how housing quality issues vary across the US. The Housing Quality Metric (HQM) addresses this gap by providing novel data on poor-quality housing at the census tract level nationwide. Led by Collaborative member Veronica Helms, the HQM provides estimated rates of poor housing quality across three domains: physically inadequate housing, housing cost burden, and poor neighborhood perceptions. The data were created by applying estimates obtained from the 2021 American Housing Survey to data from the American Community Survey. A new article in the American Journal of Public Health details the approach and shows its initial associations with population health. The work, led by Helms, also includes Alyssa J. Moran, Thomas Cudjoe, Eliana M. Perrin, and Craig Pollack.
The HQM is available for download here, along with a data dictionary. Estimates of poor-quality housing are available for census tracts, counties, and states.
Housing Quality Metric Data Files
Housing Quality Metric (HQM): Census Tracts Scoring in the Highest Quartile for all Three Poor Housing Quality Dimensions (n=8,329)
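The caption above refers to tracts scoring in the highest quartile on all three dimensions. The sketch below shows one way such a flag could be computed from tract-level rates in Python. It is an illustration under stated assumptions, not the authors' method: the tract IDs, rates, and domain names are invented, and the quartile rule (strictly above the 75th percentile, computed with the "inclusive" method) is an assumption, since the text does not specify how quartiles were defined.

```python
from statistics import quantiles

# Hypothetical domain names standing in for the three HQM dimensions.
DOMAINS = ("inadequate", "cost_burden", "neighborhood")

# Invented tract-level rates (percent); real values come from the HQM files.
tracts = {
    "T1": {"inadequate": 1, "cost_burden": 10, "neighborhood": 5},
    "T2": {"inadequate": 2, "cost_burden": 20, "neighborhood": 50},
    "T3": {"inadequate": 3, "cost_burden": 30, "neighborhood": 6},
    "T4": {"inadequate": 4, "cost_burden": 40, "neighborhood": 7},
    "T5": {"inadequate": 5, "cost_burden": 50, "neighborhood": 8},
    "T6": {"inadequate": 6, "cost_burden": 60, "neighborhood": 9},
    "T7": {"inadequate": 9, "cost_burden": 70, "neighborhood": 10},
    "T8": {"inadequate": 10, "cost_burden": 80, "neighborhood": 40},
}


def top_quartile_cutoffs(tract_rates):
    """Return the 75th-percentile cutoff for each domain."""
    cutoffs = {}
    for domain in DOMAINS:
        values = [rates[domain] for rates in tract_rates.values()]
        # quantiles(..., n=4) returns three cut points; index 2 is the 75th percentile.
        cutoffs[domain] = quantiles(values, n=4, method="inclusive")[2]
    return cutoffs


def flag_all_three(tract_rates):
    """List tracts whose rate exceeds the cutoff in every domain."""
    cutoffs = top_quartile_cutoffs(tract_rates)
    return sorted(
        tract_id
        for tract_id, rates in tract_rates.items()
        if all(rates[domain] > cutoffs[domain] for domain in DOMAINS)
    )


flagged = flag_all_three(tracts)
print(flagged)
```

In this toy data, several tracts are in the top quartile for one or two domains, but only one clears the cutoff on all three. This shows why the combined flag selects a much smaller set than any single-domain quartile.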

SEER-Medicare: Housing Assistance Data
New data are available to investigate the relationship between federal housing assistance and cancer.
The National Cancer Institute and the SEER registries partnered with the Department of Housing and Urban Development (HUD) to link the SEER-Medicare data with HUD administrative data on public and assisted housing program participants. The linkage of these data allows researchers to examine associations between housing assistance status and cancer presentation, treatment, and outcomes.
Collaborative members Craig Pollack, Veronica Helms, and Taylor Johnson were among the authors of a paper in the Journal of the National Cancer Institute that describes this data linkage.
More information on accessing the data can be found here.

