When soundwaves reach the inner ear, neurons there pick up the vibrations and alert the brain. Encoded in their signals is a wealth of information that enables us to follow conversations, recognize familiar voices, appreciate music, and quickly locate a ringing phone or crying baby.
Neurons send signals by emitting spikes, also known as action potentials: brief changes in voltage that propagate along nerve fibers. Remarkably, auditory neurons can fire hundreds of spikes per second and time them with exquisite precision to match the oscillations of incoming soundwaves.
With powerful new models of human hearing, scientists at MIT’s McGovern Institute have determined that this precise timing is vital for some of the most important ways we make sense of auditory information, including recognizing voices and localizing sounds.
The findings, reported December 4, 2024, in the journal Nature Communications, show how machine learning can help neuroscientists understand how the brain uses auditory information in the real world. McGovern Investigator Josh McDermott, who led the research, explains that his team’s models better equip researchers to study the consequences of different types of hearing impairment and devise more effective interventions.
Science of sound
The nervous system’s auditory signals are timed so precisely that researchers have long suspected timing is important to our perception of sound. Soundwaves oscillate at rates that determine their pitch: low-pitched sounds travel in slow waves, whereas high-pitched sound waves oscillate more frequently. The auditory nerve, which relays information from sound-detecting hair cells in the ear to the brain, generates electrical spikes that correspond to the frequency of these oscillations. “The action potentials in an auditory nerve get fired at very particular points in time relative to the peaks in the stimulus waveform,” explains McDermott, who is also an associate professor of brain and cognitive sciences at MIT.
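For intuition only, here is a small Python sketch of what phase-locked firing might look like: spike times clustered near the peaks of a pure tone, offset by sub-millisecond jitter. The tone frequency, jitter level, and one-spike-per-cycle behavior are illustrative assumptions, not parameters from the study.

```python
# Hypothetical illustration of phase-locking: spikes land near the peaks of a
# sine wave, offset by a small amount of timing jitter. All values are
# placeholders chosen for readability, not figures from the paper.
import numpy as np

def phase_locked_spikes(freq_hz=220.0, duration_s=0.05, jitter_s=0.0002, seed=0):
    """Return spike times (s) aligned to the peaks of a sine wave at freq_hz."""
    rng = np.random.default_rng(seed)
    period = 1.0 / freq_hz
    # Peaks of sin(2*pi*f*t) occur at t = (k + 0.25) * period.
    peak_times = (np.arange(int(duration_s * freq_hz)) + 0.25) * period
    # Each spike lands near a peak, shifted by sub-millisecond jitter.
    return peak_times + rng.normal(0.0, jitter_s, size=peak_times.size)

spikes = phase_locked_spikes()
print(f"{spikes.size} spikes; first three at {np.round(spikes[:3] * 1000, 2)} ms")
```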
This relationship, known as phase-locking, requires neurons to time their spikes with sub-millisecond precision. But scientists haven’t really known how informative these temporal patterns are to the brain. Beyond being scientifically intriguing, McDermott says, the question has important clinical implications: “If you want to design a prosthesis that provides electrical signals to the brain to reproduce the function of the ear, it’s arguably pretty important to know what kinds of information in the normal ear actually matter,” he says.
This has been difficult to study experimentally: Animal models can’t offer much insight into how the human brain extracts structure in language or music, and the auditory nerve is inaccessible for study in humans. So McDermott and graduate student Mark Saddler turned to artificial neural networks.
Artificial hearing
Neuroscientists have long used computational models to explore how sensory information might be decoded by the brain, but until recent advances in computing power and machine learning methods, these models were limited to simulating simple tasks. “One of the problems with these prior models is that they’re often way too good,” says Saddler, who is now at the Technical University of Denmark. For example, a computational model tasked with identifying the higher pitch in a pair of simple tones is likely to perform better than people who are asked to do the same thing. “This is not the kind of task that we do every day in hearing,” Saddler points out. “The brain is not optimized to solve this very artificial task.” This mismatch limited the insights that could be drawn from this prior generation of models.
To better understand the brain, Saddler and McDermott wanted to challenge a hearing model to do things that people use their hearing for in the real world, like recognizing words and voices. That meant developing an artificial neural network to simulate the parts of the brain that receive input from the ear. The network was given input from some 32,000 simulated sound-detecting sensory neurons and then optimized for various real-world tasks.
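Conceptually, the approach is a simulated auditory periphery feeding a deep network that is trained on real-world listening tasks. The sketch below shows that pipeline in schematic form; the layer sizes, pooling scheme, vocabulary size, and the stand-in "nerve model" are assumptions for illustration, not the architecture reported in the paper.

```python
# Schematic sketch (assumptions labeled): simulated auditory nerve activity
# over ~32,000 fibers is passed to a deep network optimized for a task such as
# word recognition. Layer shapes and the toy nerve output are placeholders.
import torch
import torch.nn as nn

N_FIBERS = 32000    # roughly the number of simulated sensory neurons in the article
N_TIMEBINS = 500    # assumed temporal resolution of the simulated spike trains
N_WORDS = 800       # hypothetical word-recognition vocabulary size

class AuditoryTaskModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional layers operate over (fibers x time) "neurograms".
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.classifier = nn.Linear(32 * 8 * 8, N_WORDS)

    def forward(self, nerve_activity):
        # nerve_activity: (batch, 1, N_FIBERS, N_TIMEBINS)
        x = self.features(nerve_activity)
        return self.classifier(x.flatten(1))

# Toy stand-in for the auditory nerve model: random spike counts per fiber and time bin.
fake_nerve_output = torch.poisson(torch.full((1, 1, N_FIBERS, N_TIMEBINS), 0.1))
logits = AuditoryTaskModel()(fake_nerve_output)
print(logits.shape)  # torch.Size([1, 800])
```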
The researchers showed that their model replicated human hearing well—better than any previous model of auditory behavior, McDermott says. In one test, the artificial neural network was asked to recognize words and voices within dozens of types of background noise, from the hum of an airplane cabin to enthusiastic applause. Under every condition, the model performed very similarly to humans.
When the team degraded the timing of the spikes in the simulated ear, however, their model could no longer match humans’ ability to recognize voices or identify the locations of sounds. For example, while McDermott’s team had previously shown that people use pitch to help them identify people’s voices, the model revealed that this ability is lost without precisely timed signals. “You need quite precise spike timing in order to both account for human behavior and to perform well on the task,” Saddler says. That suggests that the brain uses precisely timed auditory signals because they aid these practical aspects of hearing.
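The article does not detail how the timing was degraded; purely for intuition, one generic way to blur temporal information is to add random jitter to each spike time, which leaves firing rates intact but destroys the sub-millisecond phase-locking described above. The sketch below (reusing the earlier phase_locked_spikes example) is that generic manipulation, not the method from the paper.

```python
# Illustration only: blur spike timing while preserving the average firing rate.
# A few milliseconds of Gaussian jitter keeps the rate code intact but removes
# sub-millisecond alignment to the stimulus waveform.
import numpy as np

def jitter_spike_times(spike_times_s, jitter_ms=3.0, seed=1):
    """Add Gaussian jitter (std in ms) to each spike time; spike count is unchanged."""
    rng = np.random.default_rng(seed)
    return spike_times_s + rng.normal(0.0, jitter_ms / 1000.0, size=len(spike_times_s))

precise = phase_locked_spikes()          # from the earlier sketch
degraded = jitter_spike_times(precise)   # same number of spikes, blurred timing
print(f"mean |shift| = {np.mean(np.abs(degraded - precise)) * 1000:.2f} ms")
```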
The team’s findings demonstrate how artificial neural networks can help neuroscientists understand how the information extracted by the ear influences our perception of the world, both when hearing is intact and when it is impaired. “The ability to link patterns of firing in the auditory nerve with behavior opens a lot of doors,” McDermott says.
“Now that we have these models that link neural responses in the ear to auditory behavior, we can ask, ‘If we simulate different types of hearing loss, what effect is that going to have on our auditory abilities?’” McDermott says. “That will help us better diagnose hearing loss, and we think there are also extensions of that to help us design better hearing aids or cochlear implants.” For example, he says, “The cochlear implant is limited in various ways—it can do some things and not others. What’s the best way to set up that cochlear implant to enable you to mediate behaviors? You can, in principle, use the models to tell you that.”