Disappearing Federal Data

What is happening to federal datasets?

Since January 2025, many federal datasets, websites, and other previously accessible resources have been taken offline to comply with executive orders, most notably data from the Centers for Disease Control and Prevention (CDC), the Environmental Protection Agency (EPA), and the National Institutes of Health (NIH). Much of the targeted data is ostensibly related to health disparities among demographic groups, especially by race/ethnicity, gender, and sexuality. Because these variables are important factors in health research, however, many large, broad-scope datasets are affected. Evidence is growing that even datasets that remain accessible on an agency’s website may contain scrubbed, corrupted, or otherwise altered information.

Learn more about missing or altered federal data:

The Journalist’s Resource: an overview of the current situation from the Shorenstein Center at the Harvard Kennedy School

Environmental Data & Governance Initiative (EDGI): an advocacy group for access to environmental data. 

Data Rescue Efforts

Data Rescue Efforts: an evolving list of crowd-sourced efforts to preserve data and keep it accessible. The website for the Data Rescue Project, which evolved from this initiative, is now available at https://www.datarescueproject.org/about-data-rescue-project/ and the Data Rescue Tracker is available at https://www.datarescueproject.org/data-rescue-tracker/

End of Term Crawl: an Internet Archive cache of government websites, crawled and collected in the months between a presidential election and a presidential inauguration.

GovWayback: a simple method for accessing historical versions of U.S. government websites from before January 20, 2025. Some resources, such as interactive websites, web forms, and content behind password authentication, are likely not included in GovWayback caches.

Harvard Library Innovation Lab: an effort from the Harvard Law School Library to provide access to major datasets from data.gov, PubMed, and federal GitHub repositories 

DataLumos: an Inter-university Consortium for Political and Social Research (ICPSR) archive for valuable government data resources. ICPSR, an international consortium of more than 760 academic institutions and research organizations, maintains a data archive of more than 500,000 research files in the social sciences, including 16 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields.