Stroke of luck: David Sussillo describes the inspiration behind his book, recounts how he ended up in Larry Abbott’s lab and discusses how the landscape of computational neuroscience at the time led to their scientific breakthroughs together.
At the 2022 Princeton Neuroscience Retreat, a nervous David Sussillo stepped up to the podium to give an unusual talk. This one was not about dynamical motifs or FORCE learning, but about his life. For the next 30 minutes, Sussillo described his parents’ substance abuse, growing up in orphanages and his uneven journey to getting a Ph.D. in neuroscience. It was a talk that made him realize that his life and work were inseparable.
In his new book, “Emergence: A Memoir of Boyhood, Computation, and the Mysteries of Mind,” published today, his life and work are intertwined once again. Sussillo candidly writes about his challenging boyhood and how it shaped his career. The Transmitter offers a short excerpt below.
In a Q&A with Transmitter associate editor Francisco J. Rivera Rosario, above, Sussillo discusses why he wrote the book, talks about working with Larry Abbott, and explains how his success comes from luck and persistence. In the following excerpt, Sussillo writes about getting kicked out of his first lab at Columbia University, how he ended up in Abbott’s lab, and their work developing FORCE learning.
Useful networks
After a blissful year in Austria, I returned to the US and the smoldering ruins of my PhD. Frankly, I was stumbling into my fourth year. By this point, a candidate should be cozied up in a lab and well on the way to settling on the question that’ll be the crown jewel of their dissertation. You should be kicking ass and taking scientific names. I was doing neither.
Desperate for guidance, I sent an email to John, the neuroscience department head, asking what he thought I should do. He replied that Columbia’s Center for Theoretical Neuroscience had finally come together and that he would try to find me a position with a professor there.
He reached out to Professor Larry Abbott, a physicist-turned-neuroscientist who had risen to prominence in the field during the previous decade. I later found out that Larry wasn’t overly thrilled about taking on a fourth-year reject from another lab, but John wouldn’t take no for an answer. Larry agreed to a three-month trial period.
And so, in arguably the greatest stroke of luck I’ve ever had, I blundered ass-backward into being advised by the one and only, the great Larry motherfuckin’ Abbott. Yet one more back door—barely ajar and closing fast—that luck or fate or vanishingly small odds allowed me to slip through.
Over the next few months, it became clear that when it came to science, Larry and I were practically reading each other’s minds. We were both intuitive, conceptual thinkers with similar ideas about computation and the brain. When my three-month trial ended, Larry welcomed me as one of his full-time students.
Before falling head over heels for biology and neuroscience, Larry had an entirely separate career in theoretical physics. In 1983, he published a paper in which he and his colleague suggested that the axion, a hypothetical subatomic particle that might be important for understanding how matter is organized, could be a major component of dark matter in the universe. The paper went unnoticed by the scientific community for a whopping three decades before finally gaining recognition due to several ongoing searches for dark matter. It’s now Larry’s most highly cited paper.
Despite his mathematical prowess, when it came to neuroscience, Larry always tested his neural network ideas in a real-time simulation he’d programmed on his computer. When he introduced this simulator to me, I got it immediately: Instead of playing Ms. Pac-Man, Larry “played” research. There were programmable buttons, sliders, moving graphs, and colorful charts with dots and lines and everything. If a simulation went belly-up, you just rewound time with a flick of the slider, tweaked a parameter or two, and bam! You could see the difference in the results right then and there. We were playing video games, but instead of chasing high scores, we were chasing scientific breakthroughs.
About a year into my mentorship, I asked him about this approach.
“Math simply isn’t enough,” he said, turning away from a whiteboard covered in nonlinear differential equations. “That,” he said, pointing his dry-erase marker at his beefy computer workstation, “is the only thing that separates us from hundreds of years of thinking about science.” Larry was in his mid-fifties, balding, and going gray. But when he got amped up about science, his energy made him seem decades younger.
“Another thing,” he continued. “Always pay attention to problems people say can’t be solved. This stuff just becomes dogma. Humans are lazy thinkers. Unless there’s mathematical proof, call bullshit on these blanket statements. Hell, even if there is proof, look for a loophole.”
His words proved prescient. Over the next two years, through a process of interactive research, simulation, and math, I discovered a solution to a well-known problem that would become the meat and potatoes of my PhD dissertation. In fact, we managed to pull something off that everyone thought was impossible. Together, Larry and I figured out how to train chaotic recurrent neural networks to actually do something useful. What does that mean?
Ever since the 1940s, when McCulloch and Pitts proposed the first mathematical model of a neuron, scientists dreamed of building networks that mimicked the brain’s incredible computing power. Their groundbreaking work showed that even simplified artificial neurons, when connected in the right way, could perform complex computations. This insight sparked an entirely new field dedicated to understanding artificial neural networks.
Over the years, researchers made significant progress with various models—from Hubel and Wiesel’s discoveries about how the visual cortex processes information through increasingly complex feedforward cascades, to Hopfield’s attractor networks that helped explain memory and pattern completion. These contributions built a foundation for understanding neural computation, but a core challenge remained unsolved.
Understanding the human brain requires us to move beyond feed-forward models of neural activity. Unlike a factory assembly line where products move neatly from station to station, our brains operate more like a bustling conversation, with information constantly looping back between neurons, evolving, and influencing ongoing processing. Neural networks, the mathematical models used to study brain function, come in two main varieties that mirror this distinction. Feed-forward networks function like the assembly line, with information moving in just one direction. We met these already when we discussed ALVINN, the neural network that powered the self-driving military truck I’d encountered at Carnegie Mellon. These networks are trainable using techniques like backpropagation. But recurrent networks, with their complex feedback loops that create a form of memory and allow consideration of context, more closely resemble actual brain architecture. These recurrent networks give our brains their remarkable adaptability and computational power, making them essential for realistic brain modeling. The catch was that despite decades of effort, nobody had figured out how to train them effectively; backpropagation, so reliable for feed-forward networks, fell short here.
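The distinction between the two varieties comes down to a single feedback term in the update equation. Here is a minimal sketch of that difference (all sizes, weights, and function names are illustrative, not from any real model): a feed-forward net maps each input to an output with no memory, while a recurrent net carries a hidden state forward in time, so the same input can produce different outputs depending on what came before.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_in, n_hidden, n_out = 3, 50, 2

W_in = rng.normal(size=(n_hidden, n_in))    # input -> hidden weights
W_out = rng.normal(size=(n_out, n_hidden))  # hidden -> output weights

def feedforward_step(u):
    """Feed-forward pass: information flows one way, with no memory of past inputs."""
    h = np.tanh(W_in @ u)
    return W_out @ h

# The recurrent net adds one extra matrix: hidden state feeding back on itself.
W_rec = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))

def recurrent_step(u, h_prev):
    """One time step of a recurrent net: the feedback loop supplies context."""
    h = np.tanh(W_rec @ h_prev + W_in @ u)
    return W_out @ h, h

# Feed the same kind of input stream to both. The feed-forward output depends
# only on the current input; the recurrent output depends on the whole history.
h = np.zeros(n_hidden)
for t in range(5):
    u = rng.normal(size=n_in)
    y_ff = feedforward_step(u)       # a function of u alone
    y_rnn, h = recurrent_step(u, h)  # a function of u AND everything before it
```

The single `W_rec @ h_prev` term is what turns a memoryless mapping into a dynamical system, and it is exactly this term that made training so hard.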
One of the major difficulties in understanding recurrent neural networks is that they are dynamical systems—just like the weather patterns and double pendulums I described earlier—and they can have incredibly complex—even chaotic—dynamics, just like the meteorologist Lorenz and his team discovered in the equations they used to model the weather in the 1960s. Remember, chaotic systems are those whose futures you can’t predict, because of sensitive dependence on initial conditions: to repeat the same behavior exactly, you would have to set the system’s initial state with infinite precision, which is impossible in practice.
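Sensitive dependence is easy to see in Lorenz’s own weather equations. The sketch below (a crude Euler integration, purely for illustration) starts two trajectories that differ by one part in a billion and lets them run; the tiny difference gets amplified until the two trajectories bear no resemblance to each other.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz equations, with the standard chaotic parameters."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])

# Two starting points that are almost, but not exactly, identical...
a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-9, 0.0, 0.0])  # perturbed by one part in a billion

for _ in range(3000):  # integrate both forward in time
    a = lorenz_step(a)
    b = lorenz_step(b)

# ...end up far apart: that's sensitive dependence on initial conditions.
print(np.linalg.norm(a - b))
```

No amount of extra decimal places fixes this; shrinking the perturbation only delays the divergence.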
And this is what Larry and I figured out: how to train chaotic recurrent neural networks to do useful computations. Now, by the time I began my work with Larry in 2006, most AI researchers had abandoned neural networks altogether, and those who still worked with them had completely written off recurrent networks. Most neuroscientists believed that recurrent nets were the right choice for modeling brains via computer simulation, but nobody believed you could train them.
This is where Larry and I made our breakthrough, which we called FORCE learning. Instead of trying to tame chaos directly, we found a way to work with it. Imagine a wild horse—that’s like a chaotic network. It’s got tremendous potential, but it’s running around unpredictably, not doing anything useful. Traditional training methods were like trying to saddle that mustang while it’s bucking and rearing. With FORCE learning, we took a different approach. We exposed the network to a strong, consistent pattern—like having a wild horse follow a lead mare walking a steady path. As the network synchronized with this pattern, we gently guided it toward the behaviors we wanted it to learn, gradually shifting its internal landscape without fighting against its inherent chaos.
What made this approach revolutionary was that it didn’t try to eliminate the chaotic nature of these networks. Rather, we harnessed it, because we knew the chaos provided the computational richness and flexibility that made these networks so powerful. By working with this chaos rather than against it, we found that recurrent networks could learn to generate precise patterns and solve complex problems that had previously seemed impossible. The wild horse could now perform skilled jumps or dressage—still wild at heart, but with the discipline to channel that energy into something purposeful.
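In code, the core of the published FORCE idea is surprisingly compact: the chaotic recurrent weights are left untouched, the network’s own output is fed back in as the steadying "lead mare," and only the readout weights are adjusted, via a recursive least-squares rule that keeps the output error small at every single step. The sketch below is a heavily simplified illustration with arbitrarily chosen sizes and parameters, not the exact implementation from the Sussillo–Abbott paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N, dt, g = 300, 0.1, 1.5   # g > 1 puts the untrained network in the chaotic regime
J = g * rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))  # fixed chaotic recurrent weights
w_fb = rng.uniform(-1.0, 1.0, N)  # fixed feedback weights carrying the output back in
w = np.zeros(N)                   # readout weights: the ONLY thing FORCE trains
alpha = 1.0
P = np.eye(N) / alpha             # running inverse-correlation estimate for the RLS rule

def target(t):
    """The pattern we want the chaotic network to learn to produce (a slow sine)."""
    return np.sin(0.2 * t)

x = rng.normal(scale=0.5, size=N)  # network state
r = np.tanh(x)                     # firing rates
z = 0.0                            # network output
for step in range(3000):
    t = step * dt
    # Network dynamics, with the readout fed back in as the steadying signal.
    x += dt * (-x + J @ r + w_fb * z)
    r = np.tanh(x)
    z = w @ r
    # Recursive least-squares update: nudge w so the error stays small
    # throughout training, rather than letting chaos run away and correcting later.
    Pr = P @ r
    k = Pr / (1.0 + r @ Pr)
    P -= np.outer(k, Pr)
    e = z - target(t)  # error before the weight update
    w -= e * k
```

Note what is not here: no backpropagation through time, and no attempt to suppress the chaotic recurrent weights `J`. The chaos supplies a rich reservoir of patterns; the readout update merely selects a useful combination of them.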
The implications for neuroscience were significant. FORCE learning gave researchers the ability to build artificial networks that performed the same tasks used in neuroscience experiments, such as determining which of two tones is higher in pitch or remembering sequences of lights. Researchers could then analyze how these artificial systems solved these problems, and translate what they found into hypotheses about real brain function. It would turn out that this approach would provide unprecedented neuroscientific insights, allowing scientists to generate testable hypotheses about the neural mechanisms underlying cognition and behavior.