Deleting data or stopping its collection will erase years of valuable brain research

As an undergraduate student, I immersed myself in neuroscience research; I cared for weakly electric fish, studied receptor expression in the brain and used brain imaging to examine how exercise affects motor memory. I quickly realized that I loved neuroscience—but not collecting data.

Fortunately, computational neuroscience offered an alternative. My graduate training coincided with an explosion in the use of big data for neuroscientific research. The Allen Human Brain Atlas captured spatial gene-expression patterns. The Human Connectome Project mapped structural and functional connections. The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) Consortium integrated neuroimaging and genetics to help scientists understand brain structure, function and disease. The resulting datasets, used in tens of thousands of research publications, have advanced our understanding of the brain in unprecedented ways, enabling discoveries that would have otherwise been impossible.

In conjunction, institutions worldwide expanded their data-science education, both in traditional academic settings and through online platforms, such as Coursera, EdX and Neuromatch Academy. I learned to analyze large-scale datasets and develop machine-learning models to understand how brain organization and cognition differ between males and females. When the COVID-19 pandemic restricted lab access, many scientists, especially trainees, were forced to pivot to computational approaches.

All of these factors have contributed to the launch of a new generation of scientists with expertise in data science. Many of us have built our research programs around these remarkable open resources. But new U.S. federal restrictions on the types of data that can be collected, stored and shared are putting those efforts in jeopardy.

Data form the foundation of modern science, supporting discoveries that are not bound by borders, beliefs or biases. At a time when unsubstantiated opinions are often presented as facts, data are our most powerful tool to separate truth from misinformation. But data are now under attack.

The scientific community must act now to protect these essential resources. We need coordinated efforts to create secure, decentralized, open-access archives of public datasets before more data are lost or modified. These efforts must be accompanied by transparent systems to monitor and document changes to scientific datasets, ensuring accountability and maintaining integrity.
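
As one illustration of what such monitoring could look like, the sketch below records a SHA-256 checksum for every file in a local copy of a dataset, then compares two such records to list the files that were added, removed or modified. It is a minimal example under stated assumptions, not a description of any existing archive's tooling: the function names `build_manifest` and `diff_manifests` are hypothetical, and it assumes the dataset is a directory of files small enough to read into memory.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest.

    Hypothetical helper: assumes `root` is a local directory holding
    a downloaded copy of a public dataset.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old, new):
    """List files added, removed or modified between two manifests."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(name for name in shared if old[name] != new[name]),
    }
```

Saving a manifest alongside each archived snapshot would let anyone check later whether a file was quietly changed or dropped, although a checksum can only show that a change occurred, not what the change was.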

In February 2023, I launched my independent research laboratory, where we leverage existing datasets to explore questions at the intersection of sex, gender, neuroscience and psychiatry. This approach enabled me to hit the ground running without the delays of recruiting participants and setting up extensive data-collection protocols. Our analyses yielded important insights about how sex and gender uniquely map onto the brain and revealed that shared and distinct brain networks underlie psychopathology across the sexes. Excited by these findings, I began writing grants to fund our research in this area.

Then, in January 2025, everything changed.

Federal funding freezes, employment terminations and executive orders transformed the scientific landscape in the United States. Clinical trials came to a halt, research grants were terminated, and scientists lost their jobs. These immediate disruptions were followed by a widespread attack on data. Thousands of web pages and datasets from public repositories were removed or modified to comply with executive orders and policy changes in the U.S., one of the world’s largest producers of scientific data. The targeted research covered critical public health and social science topics, including diversity, gender, vaccines and HIV/AIDS. This systematic removal of data affects years of valuable research and taxpayer-funded investments, and jeopardizes future scientific progress.

One affected resource is the Adolescent Brain Cognitive Development (ABCD) Study, the largest and most comprehensive study of adolescent brain development ever conducted. Following more than 11,000 children since 2016, this study has led to significant advances in our understanding of development, generating more than 1,400 publications thus far, including several of my own. The study had been collecting comprehensive gender-related data, making it particularly valuable for the gender research we were doing in my lab. But recent policy changes forced the removal of all gender-related information from the latest data release, and these data will no longer be collected.

The ABCD Study is not the only dataset in which data are being removed or modified. The Demographic and Health Surveys Program, which includes data on population, mental health, nutrition and other factors from more than 400 surveys in 90-plus countries, has been paused. A comprehensive review of U.S. government datasets revealed that between 20 January and 25 March 2025, nearly half of the 232 datasets examined had been substantially altered. Those changes have been logged for only 13 percent of altered datasets. Without an accurate record of these modifications, it’s difficult to trust the accuracy of the data that remain.

The impact of these changes extends beyond the immediate loss of existing data. The discontinuation of data collection for specific variables in ongoing studies creates permanent gaps in our scientific understanding. Unlike cross-sectional research, longitudinal studies provide unique insights into how individual people change over time. Once specific temporal windows pass without data collection, those gaps can never be filled. This systematic elimination of variables from ongoing research will prevent scientific progress and create blind spots in our understanding that may persist for generations.

Some efforts to preserve important datasets are already in the works, including the Data Rescue Project, a central hub for data-rescue efforts; the Harvard Law School Library Data.gov Archive of federal public datasets; and the GovWayback tool for accessing historical versions of U.S. government websites. These efforts can protect some public datasets, but federally funded datasets with data-use agreements that prevent sharing, such as the ABCD Study, are not included.

Ongoing longitudinal studies must develop contingency plans through international collaborations and private partnerships to ensure continued data collection and sharing if targeted. Without these safeguards, we risk losing irreplaceable data and the time, effort and resources that went into collecting them.

This attack on data also threatens the careers of scientists and their established research programs, which took years to develop. In my own laboratory, I pivoted away from gender research, abandoning two years of preliminary data analyses and grant proposals. The timing is particularly challenging; my startup funding will end soon, and I face increasing pressure to secure support for my trainees, laboratory and position. This struggle extends far beyond my lab. Across the field, scientists who have relied on these datasets must now rethink their approach.

To protect themselves from similar disruptions, I encourage early-career researchers to build resilient research programs. This requires diversification across research approaches, funding sources and scientific partnerships, finding a balance between focus and flexibility. It may include pursuing funding beyond federal grants, establishing international collaborations, leveraging data from other countries and designing adaptable research programs. Though these measures require additional effort, they will provide crucial protection against policy changes that could otherwise derail careers.

As scientists, our greatest strength lies in data. We use data to test hypotheses, prove or refute theories, and provide evidence that separates fact from fiction. When data disappear, we lose our ability to challenge false claims, and we lose our foundation for building new discoveries. Without data, science itself is at risk. At a time when misinformation spreads rapidly, the systematic removal of scientific data is halting research progress and undermining our ability to fight false narratives with empirical evidence. Collaboratively, we must preserve data access, establish international safeguards, and maintain research continuity to protect evidence-based research that advances human knowledge and improves public health.
