Lost principles: A transformer trained on simulated orbits could predict planets’ movements but failed to uncover the fundamental laws that govern them, highlighting the gap between these models’ predictive power and genuine understanding.
Illustration by Danielle Ezzo

Can AI do neuroscience without understanding?

Prediction without understanding sustained astronomy through a thousand years of epicycles. Artificial intelligence is now offering neuroscience the same deal.

By Anthony Zador
27 April 2026 | 6 min read

Prediction and understanding have been two sides of the same coin for most of the history of science. Alan Hodgkin and Andrew Huxley captured the action potential with four variables and a handful of parameters. Their equations predicted the waveform, but they also explained it: Fast sodium in, slow potassium out, and the interplay between them generates a spike. The prediction was useful. The understanding was beautiful. And you couldn’t have one without the other.
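For readers who want the concrete form, the standard Hodgkin-Huxley equations are reproduced below; the four variables are the membrane voltage V and the gating variables m, h and n, and the “handful of parameters” are the maximal conductances and reversal potentials.

```latex
% Standard Hodgkin-Huxley formulation: membrane voltage V plus three
% gating variables (m, h, n) -- the "four variables" in the text.
\[
\begin{aligned}
C_m \frac{dV}{dt} &= I_{\mathrm{ext}}
  - \bar{g}_{\mathrm{Na}}\, m^{3} h\,(V - E_{\mathrm{Na}})
  - \bar{g}_{\mathrm{K}}\, n^{4}\,(V - E_{\mathrm{K}})
  - \bar{g}_{L}\,(V - E_{L}), \\
\frac{dx}{dt} &= \alpha_x(V)\,(1 - x) - \beta_x(V)\,x,
  \qquad x \in \{m, h, n\}.
\end{aligned}
\]
```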

Artificial intelligence (AI) is prying them apart.

Consider AlphaFold. Its predictive accuracy is extraordinary. But there is no explanatory model a human can internalize or reason from, no “aha!” AlphaFold doesn’t aid human understanding; it replaces the step where understanding would have happened.

Neuroscience is barreling down the same path. The transformer architectures behind models such as Claude now model population spike trains from Neuropixels recordings. Foundation models are beginning to ingest large-scale calcium imaging datasets. A growing roster of scientific AI companies, from startups such as Edison Scientific to projects inside Alphabet, Anthropic and OpenAI, is racing to automate scientific discovery from big data, particularly discoveries with commercial potential.

But the tools they are building may well deliver answers without insight, prediction without compression. They bypass the step where complex phenomena get distilled into simpler general principles. And without that distillation, we’re no better off than we were before: We couldn’t understand the brain, and now we can’t understand the model of the brain either. 

To address this, a whole subfield, mechanistic interpretability, has sprung up just to crack open these models and find the principles inside. One recent paper fit one type of machine-learning model (a sparse autoencoder) to another type (a transformer trained on calcium imaging) to search for human-interpretable features inside the latter. This is the irony of our new era: We now need scientific models of our scientific models.
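As a rough illustration of what that entails, here is a minimal sketch (mine, not the paper’s) of fitting a sparse autoencoder to hidden activations harvested from some other model and reading off candidate feature directions. The dimensions are hypothetical and the activations are a random stand-in.

```python
# Minimal sketch of the mechanistic-interpretability recipe: fit a sparse
# autoencoder (SAE) to activations recorded from another model, then inspect
# which sparse features turn on when. Dimensions and data are stand-ins.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))   # sparse, nonnegative codes
        recon = self.decoder(features)              # reconstruction of the activations
        return recon, features

# Stand-in for activations harvested from a trained model's hidden layer.
acts = torch.randn(10_000, 512)

sae = SparseAutoencoder(d_model=512, d_features=4096)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_weight = 1e-3  # sparsity pressure: few features active per input

for step in range(1_000):
    recon, features = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_weight * features.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Each column of the decoder weight matrix is a candidate "feature direction";
# transpose so each row is one feature, then try to interpret those directions
# against known variables (stimuli, behavior, brain state, and so on).
feature_directions = sae.decoder.weight.detach().T
```

The hard part, of course, is the last step: deciding whether those directions mean anything at all.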

Why do we need “compression”? Albert Einstein (perhaps apocryphally) said a good theory should be as simple as possible, but not simpler. A theory is a short description that accounts for a much larger body of observations: The Hodgkin-Huxley model reduced the spike to four variables; the ring attractor model reduced head-direction tuning to a single equation that also explained path integration. Because these theories are compressed, they fit in our heads and we can mentally simulate them. That capacity to mentally simulate a model is what produces the feeling of understanding. AI models have no such constraint. They can fit vastly more into their “heads,” which means their internal models can be far less compressed and far less legible to us. AI doesn’t need to understand. 
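To make “a single equation” concrete: in one common ring-attractor formulation (conventions differ across papers), the whole model is a dynamical equation for activity on a ring of preferred head directions, and the same velocity-driven input term that rotates the activity bump is what implements path integration.

```latex
% One common ring-attractor formulation (details vary across papers):
% activity f(theta, t) on a ring, shaped by local excitation and broad
% inhibition W, driven by an angular-velocity input that rotates the bump.
\[
\tau \frac{\partial f(\theta, t)}{\partial t}
  = -f(\theta, t)
  + \left[ \int W(\theta - \theta')\, f(\theta', t)\, d\theta'
  + I_{\mathrm{vel}}(\theta, t) \right]_{+}
\]
```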

This raises an uncomfortable question. If AI delivers accurate predictions, and those predictions lead to drugs that cross the blood-brain barrier and stimulation patterns that suppress seizures, does understanding still have a place? Of course, many people—not just scientists—value understanding. So did the Medici family, who bankrolled Galileo not because they needed better navigation tables but because they wanted a court philosopher. Science began as a lapdog of the wealthy, funded for the same reasons as art and music: because its patrons found it beautiful. But that’s not how we earn our keep anymore. The annual budget of the U.S. National Endowment for the Arts hovers around $200 million; the combined budgets of the National Institutes of Health and the National Science Foundation top $50 billion. Society doesn’t fund science at 250 times the level of the arts because it finds understanding beautiful. The warm glow of understanding is a scientist’s private reward, not society’s rationale for writing the check. If predictions and their downstream benefits arrive without understanding, the public will likely be fine with that.

A recent paper suggests we shouldn’t surrender so quickly.

Keyon Vafa and his colleagues simulated tens of millions of planetary orbits from Newton’s laws and trained a transformer on the resulting sequences. The model predicted future positions with very high accuracy. But when they fine-tuned it to infer the underlying gravitational force vectors, the model produced nonsense. The implied laws of gravitation were different depending on which subset of data the researchers examined. The transformer had assembled a patchwork of heuristics, accurate for every solar system in the training set, but it hadn’t discovered the universal gravitational principle. Without that principle, the model could predict the movement of points of light in the sky but could never send a rocket to the moon.
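To make that failure concrete, here is a toy version of the consistency test (mine, not the authors’ code, with synthetic numbers): fit the force-law exponent a model implies on two different subsets of data and check whether the answers agree. A model that has internalized Newton gives an exponent near 2 everywhere; a patchwork of heuristics gives a different “law” on each subset.

```python
# Caricature of the consistency check described above (not the authors' code):
# if a model has internalized Newtonian gravity, the force law it implies
# should be the same on every subset of data. Fit the power-law exponent p
# in |F| ~ r^(-p) separately on each subset of implied (radius, force) pairs.
import numpy as np

def fit_force_law_exponent(radii: np.ndarray, forces: np.ndarray) -> float:
    """Least-squares slope of log|F| versus log r; Newton predicts about -2."""
    slope, _ = np.polyfit(np.log(radii), np.log(forces), deg=1)
    return slope

rng = np.random.default_rng(0)

# Hypothetical "implied forces" read out from two different data subsets.
r_a = rng.uniform(0.5, 5.0, size=2_000)
r_b = rng.uniform(0.5, 5.0, size=2_000)

# A model that learned the principle: the same inverse-square law everywhere.
f_newtonian_a = r_a ** -2.0 * np.exp(rng.normal(0, 0.01, r_a.shape))
f_newtonian_b = r_b ** -2.0 * np.exp(rng.normal(0, 0.01, r_b.shape))

# A patchwork of heuristics: accurate locally, but a different "law" per subset.
f_patchwork_a = r_a ** -1.6 * np.exp(rng.normal(0, 0.01, r_a.shape))
f_patchwork_b = r_b ** -2.7 * np.exp(rng.normal(0, 0.01, r_b.shape))

print("newtonian:", fit_force_law_exponent(r_a, f_newtonian_a),
      fit_force_law_exponent(r_b, f_newtonian_b))   # both near -2
print("patchwork:", fit_force_law_exponent(r_a, f_patchwork_a),
      fit_force_law_exponent(r_b, f_patchwork_b))   # disagree with each other
```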

A transformer trained on motor cortex recordings might predict held-out firing rates beautifully but fail to tell us what the circuit is actually computing. The Ptolemaic astronomers had a similar problem. Their geocentric model predicted planetary positions with impressive precision for a thousand years by stacking epicycles. The prior was theological: God’s cosmos must move in perfect circles. When Newton finally replaced it, predictive accuracy barely improved. What changed was the compression: A single law explained every orbit, and also falling apples, and ocean tides. Indeed, the transformer is even less principled than Ptolemy; Ptolemy at least had a prior. 

Understanding generalizes in ways that prediction alone cannot. David Hubel and Torsten Wiesel’s discovery of oriented receptive fields in V1 didn’t just describe the activity of a set of neurons; it gave us feature detection hierarchies, a framework that generalized across sensory cortices and inspired the convolutional neural networks that now power computer vision. Drift-diffusion models of decision-making started in psychophysics and ended up explaining single-neuron ramping activity in the lateral intraparietal area. That kind of creative leap, from one domain to another, is what compression buys you. A model that has memorized input-output relationships without learning the underlying structure will never make that leap.
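In its simplest form, the drift-diffusion model is itself a one-line theory: noisy evidence accumulates toward a bound, and the decision is made when the bound is reached.

```latex
% Simplest form of the drift-diffusion model: evidence x accumulates with
% drift v and noise sigma; a choice is made when x(t) first reaches +a or -a.
\[
dx = v\,dt + \sigma\,dW, \qquad \text{decide when } |x(t)| = a.
\]
```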

So where does this leave us? The compression step, collapsing a sprawling dataset into something portable and teachable, remains (for now, at least) a human activity. AI models can predict; they have not yet learned to explain. But the Vafa result suggests that the drive toward understanding isn’t just a scientist’s vanity. Even in the age of large AI models, it may remain our most important job.
