Neurons receive precisely tailored teaching signals as we learn

New work suggests that the brain can deliver neuron-specific feedback during learning—resembling the error signals that drive machine learning.

Man seated on staircase, smiling at camera — McGovern Investigator Mark Harnett. Photo: Adam Glanzman

When we learn a new skill, the brain has to decide—cell by cell—what to change. New research from MIT suggests it can do that with surprising precision, sending targeted feedback to individual neurons so each one can adjust its activity in the right direction.

The finding echoes a key idea from modern artificial intelligence. Many AI systems learn by comparing their output to a target, computing an “error” signal, and using it to fine-tune connections within the network. A longstanding question has been whether the brain also uses that kind of individualized feedback. In a study published in the February 25 issue of the journal Nature, MIT researchers report evidence that it does.

A research team led by Mark Harnett, a McGovern Institute investigator and associate professor in the Department of Brain and Cognitive Sciences at MIT, discovered these instructive signals in mice by training animals to control the activity of specific neurons using a brain-computer interface (BCI). Their approach, the researchers say, can be used to further study the relationships between artificial neural networks and real brains, in ways that are expected to both improve understanding of biological learning and enable better brain-inspired artificial intelligence.

The changing brain

Our brains are constantly changing as we interact with the world, modifying their circuitry as we learn and adapt. “We know a lot from 50 years of studies that there are many ways to change the strength of connections between neurons,” Harnett says. “What the field really lacks is a way of understanding how those changes are orchestrated to actually produce efficient learning.”

Some actions—and the neural connections that enable them—are reinforced with the release of neuromodulators like dopamine or norepinephrine in the brain. But those signals are broadcast to large groups of neurons, without discriminating between cells’ individual contributions to a failure or a success. “Reinforcement learning via neuromodulators works, but it’s inefficient, because all the neurons and all the synapses basically get only one signal,” Harnett says.

Machine learning uses an alternative, and extremely powerful, way to learn from mistakes. Using a method called backpropagation, artificial neural networks compute an error signal and use it to adjust their individual connections. They do this over and over, learning from experience how to fine-tune their networks for success. “It works really well and it’s computationally very effective,” Harnett says.

It seemed likely that brains might use similar error signals for learning. But neuroscientists were skeptical that brains would have the precision to send tailored signals to individual neurons due to the constraints imposed by using living cells and circuits instead of software and equations. A major problem for testing this idea was how to find the signals that provide personalized instructions to neurons, which are called vectorized instructive signals. The challenge, explains Valerio Francioni, first author of the Nature paper and a former postdoctoral researcher in Harnett’s lab, is that scientists don’t know how individual neurons contribute to specific behaviors.

“If I was recording your brain activity while you were learning to play piano,” Francioni explains, “I would learn that there is a correlation between the changes happening in your brain and you learning piano. But if you asked me to make you a better piano player by manipulating your brain activity, I would not be able to do that, because we don’t know how the activity of individual neurons map to that ultimate performance.”

Without knowing which neurons need to become more active and which ones should be reined in, it is impossible to look for signals directing those changes.

Brain-computer interface

To get around this problem, Harnett’s team developed a brain-computer interface task to directly link neural activity and reward outcome – akin to linking the keys of the piano directly to the activity of single neurons. To succeed at the task, certain neurons needed to increase their activity, whereas others were required to decrease their activity.

They set up a BCI to directly link activity in those neurons—just eight to ten of the millions of neurons in a mouse’s brain—to a visual readout, providing sensory feedback to the mice about their performance. Success was accompanied by delivery of a sugary reward.

“Now if you ask me, ‘How does the mouse get more rewards? Which neuron do you have to activate and which neuron do you have to inhibit?’ I know exactly what the answer to that question is,” says Francioni, whose work was supported by a Y. Eva Tan Fellowship from the Yang Tan Collective at MIT.

The scientists didn’t know the exact function of the particular neurons they linked to the BCI, but the cells were active enough that mice received occasional rewards whenever the signals happened to be right. Within a week, mice learned to switch on the right neurons while leaving the other set of neurons inactive, earning themselves more rewards.

Francioni monitored the target neurons daily during this learning process using a powerful microscope to visualize fluorescent indicators of neural activity. He zeroed in on the neurons’ branching dendrites, where the appropriate feedback signals have long been suspected to arrive. At the same time, he tracked activity in the parent cell bodies of those neurons. The team used these data to examine the relationship between signals received at a neuron’s dendrites and its activity, as well as how these changed when mice were rewarded for activating the right neurons or when they failed at their task.

Vectorized neural signals

They concluded that the two groups of neurons whose activity controlled the BCI in opposite ways, also received opposing error signals at their dendrites as the mice learned. Some were told to ramp up their activity during the task, while others were instructed to dial it down. What’s more, when the team manipulated the dendrites to inhibit these instructive signals, mice failed to learn the task. “This is the first biological evidence that vectorized [neuron-specific] signal-based instructive learning is taking place in the cortex,” Harnett says.

The discovery of vectorized signals in the brain—and the team’s ability to find them—should promote more back and forth between neuroscientists and machine learning researchers, says postdoctoral researcher Vincent Tang. “It provides further incentive for the machine learning community to keep developing models and proposing new hypotheses along this direction,” he says. “Then we can come back and test them.”

The researchers say they are just as excited about applying their approach to future experiments as they are about their current discovery.

“Machine learning offers a robust, mathematically tractable way to really study learning. The fact that we can now translate at least some of this directly into the brain is very powerful,” Francioni says.

Harnett says the approach opens new opportunities to investigate possible parallels between the brain and machine learning. “Now we can go after figuring out, how does cortex learn? How do other brain regions learn? How similar or how different is it to this particular algorithm? Can we figure out how to build better, more brain-inspired models from what we learn from the biology?” he says. “This feels like a really big new beginning.”

Paper: "Vectorized instructive signals in cortical dendrites during a brain-computer interface task"