Exposure to different kinds of music influences how the brain interprets rhythm

When listening to music, the human brain appears to be biased toward hearing and producing rhythms composed of simple integer ratios — for example, a series of four beats separated by equal time intervals (forming a 1:1:1 ratio).

However, the favored ratios can vary greatly between different societies, according to a large-scale study led by researchers at MIT and the Max Planck Institute for Empirical Aesthetics and carried out in 15 countries. The study included 39 groups of participants, many of whom came from societies whose traditional music contains distinctive patterns of rhythm not found in Western music.

“Our study provides the clearest evidence yet for some degree of universality in music perception and cognition, in the sense that every single group of participants that was tested exhibits biases for integer ratios. It also provides a glimpse of the variation that can occur across cultures, which can be quite substantial,” says Nori Jacoby, the study’s lead author and a former MIT postdoc, who is now a research group leader at the Max Planck Institute for Empirical Aesthetics in Frankfurt, Germany.

The brain’s bias toward simple integer ratios may have evolved as a natural error-correction system that makes it easier to maintain a consistent body of music, which human societies often use to transmit information.

“When people produce music, they often make small mistakes. Our results are consistent with the idea that our mental representation is somewhat robust to those mistakes, but it is robust in a way that pushes us toward our preexisting ideas of the structures that should be found in music,” says Josh McDermott, an associate professor of brain and cognitive sciences at MIT and a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines.

McDermott is the senior author of the study, which appears today in Nature Human Behaviour. The research team also included scientists from more than two dozen institutions around the world.

A global approach

The new study grew out of a smaller analysis that Jacoby and McDermott published in 2017. In that paper, the researchers compared rhythm perception in groups of listeners from the United States and the Tsimane’, an Indigenous society located in the Bolivian Amazon rainforest.

Nori Jacoby, a former MIT postdoc now at the Max Planck Institute for Empirical Aesthetics, runs an experiment with a member of the Tsimane’ tribe, who have had little exposure to Western music. Photo: Josh McDermott

To measure how people perceive rhythm, the researchers devised a task in which they play a randomly generated series of four beats and then ask the listener to tap back what they heard. The rhythm produced by the listener is then played back to the listener, and they tap it back again. Over several iterations, the tapped sequences become dominated by the listener’s internal biases, also known as priors.

“The initial stimulus pattern is random, but at each iteration the pattern is pushed by the listener’s biases, such that it tends to converge to a particular point in the space of possible rhythms,” McDermott says. “That can give you a picture of what we call the prior, which is the set of internal implicit expectations for rhythms that people have in their heads.”
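The convergence McDermott describes can be illustrated with a toy simulation. The sketch below is not the study's code: it assumes a hypothetical listener whose prior has three modes (1:1:1, 1:1:2, and 2:3:3), and the `pull` and `noise` parameters are invented for illustration. Each simulated tap-back drifts toward the nearest mode and adds a little motor noise.

```python
import numpy as np

# Hypothetical prior modes, written as interval fractions that sum to 1.
PRIOR_MODES = np.array([[1, 1, 1], [1, 1, 2], [2, 3, 3]], dtype=float)
PRIOR_MODES /= PRIOR_MODES.sum(axis=1, keepdims=True)


def distance_to_prior(rhythm):
    """Distance from a rhythm to the closest prior mode."""
    return float(np.min(np.linalg.norm(PRIOR_MODES - rhythm, axis=1)))


def reproduce(rhythm, rng, pull=0.3, noise=0.01):
    """One simulated tap-back: drift toward the nearest mode, plus motor noise."""
    nearest = PRIOR_MODES[np.argmin(np.linalg.norm(PRIOR_MODES - rhythm, axis=1))]
    tapped = rhythm + pull * (nearest - rhythm) + rng.normal(0.0, noise, size=3)
    tapped = np.clip(tapped, 0.05, None)  # intervals must stay positive
    return tapped / tapped.sum()          # keep the total duration fixed


def iterate(n_iterations=10, seed=0):
    """Start from a random rhythm and feed each reproduction back as the next stimulus."""
    rng = np.random.default_rng(seed)
    rhythm = rng.dirichlet(np.ones(3))    # random three-interval rhythm
    history = [rhythm]
    for _ in range(n_iterations):
        rhythm = reproduce(rhythm, rng)
        history.append(rhythm)
    return np.array(history)


history = iterate()
print("start:", distance_to_prior(history[0]))
print("end:  ", distance_to_prior(history[-1]))
```

In this toy setup the distance to the nearest mode shrinks over iterations, which is the sense in which the final taps reveal the listener's prior rather than the random starting stimulus.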

When the researchers first did this experiment, with American college students as the test subjects, they found that people tended to produce time intervals that are related by simple integer ratios. Furthermore, most of the rhythms they produced, such as those with ratios of 1:1:2 and 2:3:3, are commonly found in Western music.
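To make "related by simple integer ratios" concrete, here is a minimal sketch that maps three measured intervals onto the closest small-integer ratio. The function name, the cap of 3 on each integer, and the example timings are assumptions made for illustration, not details from the study.

```python
from itertools import product

import numpy as np


def nearest_integer_ratio(intervals, max_int=3):
    """Return the small-integer ratio closest to the measured intervals."""
    observed = np.asarray(intervals, dtype=float)
    observed = observed / observed.sum()          # compare shapes, not tempo
    best, best_dist = None, np.inf
    for candidate in product(range(1, max_int + 1), repeat=len(observed)):
        if np.gcd.reduce(candidate) != 1:         # skip duplicates such as 2:2:2
            continue
        target = np.array(candidate, dtype=float)
        target /= target.sum()
        dist = np.linalg.norm(observed - target)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best, best_dist


# Hypothetical tapped intervals in milliseconds.
print(nearest_integer_ratio([260, 240, 510])[0])
print(nearest_integer_ratio([250, 380, 370])[0])
```

The two made-up examples land on 1:1:2 and 2:3:3, the ratios named above as common in Western music.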

The researchers then went to Bolivia and asked members of the Tsimane’ society to perform the same task. They found that Tsimane’ also produced rhythms with simple integer ratios, but their preferred ratios were different and appeared to be consistent with those that have been documented in the few existing records of Tsimane’ music.

“At that point, it provided some evidence that there might be very widespread tendencies to favor these small integer ratios, and that there might be some degree of cross-cultural variation. But because we had just looked at this one other culture, it really wasn’t clear how this was going to look at a broader scale,” Jacoby says.

To try to get that broader picture, the MIT team began seeking collaborators around the world who could help them gather data on a more diverse set of populations. They ended up studying listeners from 39 groups, representing 15 countries on five continents — North America, South America, Europe, Africa, and Asia.

“This is really the first study of its kind in the sense that we did the same experiment in all these different places, with people who are on the ground in those locations,” McDermott says. “That hasn’t really been done before at anything close to this scale, and it gave us an opportunity to see the degree of variation that might exist around the world.”

A grid of nine different photos showing a researcher working with an individual at a table. The individuals are wearing headphones.
Example testing sites. a, Yaranda, Bolivia. b, Montevideo, Uruguay. c, Sagele, Mali. d, Spitzkoppe, Namibia. e, Pleven, Bulgaria. f, Bamako, Mali. g, D’Kar, Botswana. h, Stockholm, Sweden. i, Guizhou, China. j, Mumbai, India. Verbal informed consent was obtained from the individuals in each photo.

Cultural comparisons

Just as they had in their original 2017 study, the researchers found that in every group they tested, people tended to be biased toward simple integer ratios of rhythm. However, not every group showed the same biases. People from North America and Western Europe, who have likely been exposed to the same kinds of music, were more likely to generate rhythms with the same ratios. Many other groups, for example those in Turkey, Mali, Bulgaria, and Botswana, showed a bias for other rhythms.

“There are certain cultures where there are particular rhythms that are prominent in their music, and those end up showing up in the mental representation of rhythm,” Jacoby says.

The researchers believe their findings reveal a mechanism that the brain uses to aid in the perception and production of music.

“When you hear somebody playing something and they have errors in their performance, you’re going to mentally correct for those by mapping them onto where you implicitly think they ought to be,” McDermott says. “If you didn’t have something like this, and you just faithfully represented what you heard, these errors might propagate and make it much harder to maintain a musical system.”

Among the groups that they studied, the researchers took care to include not only college students, who are easy to study in large numbers, but also people living in traditional societies, who are more difficult to reach. Participants from those more traditional groups showed significant differences from college students living in the same countries, and from people who live in those countries but performed the test online.

“What’s very clear from the paper is that if you just look at the results from undergraduate students around the world, you vastly underestimate the diversity that you see otherwise,” Jacoby says. “And the same was true of experiments where we tested groups of people online in Brazil and India, because you’re dealing with people who have internet access and presumably have more exposure to Western music.”

The researchers now hope to run additional studies of different aspects of music perception, taking this global approach.

“If you’re just testing college students around the world or people online, things look a lot more homogenous. I think it’s very important for the field to realize that you actually need to go out into communities and run experiments there, as opposed to taking the low-hanging fruit of running studies with people in a university or on the internet,” McDermott says.

The research was funded by the James S. McDonnell Foundation, the Canadian National Science and Engineering Research Council, the South African National Research Foundation, the United States National Science Foundation, the Chilean National Research and Development Agency, the Austrian Academy of Sciences, the Japan Society for the Promotion of Science, the Keio Global Research Institute, the United Kingdom Arts and Humanities Research Council, the Swedish Research Council, and the John Fell Fund.

Deep neural networks show promise as models of human hearing

Computational models that mimic the structure and function of the human auditory system could help researchers design better hearing aids, cochlear implants, and brain-machine interfaces. A new study from MIT has found that modern computational models derived from machine learning are moving closer to this goal.

In the largest study yet of deep neural networks that have been trained to perform auditory tasks, the MIT team showed that most of these models generate internal representations that share properties of representations seen in the human brain when people are listening to the same sounds.

The study also offers insight into how to best train this type of model: The researchers found that models trained on auditory input including background noise more closely mimic the activation patterns of the human auditory cortex.

“What sets this study apart is it is the most comprehensive comparison of these kinds of models to the auditory system so far. The study suggests that models that are derived from machine learning are a step in the right direction, and it gives us some clues as to what tends to make them better models of the brain,” says Josh McDermott, an associate professor of brain and cognitive sciences at MIT, a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines, and the senior author of the study.

MIT graduate student Greta Tuckute and Jenelle Feather PhD ’22 are the lead authors of the open-access paper, which appears today in PLOS Biology.

Models of hearing

Deep neural networks are computational models that consist of many layers of information-processing units that can be trained on huge volumes of data to perform specific tasks. This type of model has become widely used in many applications, and neuroscientists have begun to explore the possibility that these systems can also be used to describe how the human brain performs certain tasks.

“These models that are built with machine learning are able to mediate behaviors on a scale that really wasn’t possible with previous types of models, and that has led to interest in whether or not the representations in the models might capture things that are happening in the brain,” Tuckute says.

When a neural network is performing a task, its processing units generate activation patterns in response to each audio input it receives, such as a word or other type of sound. Those model representations of the input can be compared to the activation patterns seen in fMRI brain scans of people listening to the same input.
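One common way to make this kind of comparison is representational similarity analysis, sketched below on synthetic data. This is an illustration under stated assumptions, not the study's pipeline: the article does not say which comparison method was used, and the array shapes, noise level, and names (`rdm`, `rsa_score`) are hypothetical.

```python
import numpy as np


def rdm(responses):
    """Representational dissimilarity matrix for an (n_sounds, n_units) array:
    1 minus the correlation between the response patterns of each pair of sounds."""
    return 1.0 - np.corrcoef(responses)


def rsa_score(model_acts, brain_resps):
    """Correlate the upper triangles of the model and brain dissimilarity matrices."""
    iu = np.triu_indices(model_acts.shape[0], k=1)
    return float(np.corrcoef(rdm(model_acts)[iu], rdm(brain_resps)[iu])[0, 1])


# Synthetic stand-ins: 40 sounds whose responses share a 5-dimensional structure.
rng = np.random.default_rng(0)
n_sounds, n_latent = 40, 5
latent = rng.normal(size=(n_sounds, n_latent))
model_acts = latent @ rng.normal(size=(n_latent, 200))    # fake layer activations
voxels = latent @ rng.normal(size=(n_latent, 300))        # fake fMRI voxel responses
voxels += 0.5 * rng.normal(size=voxels.shape)             # measurement noise
unrelated = rng.normal(size=(n_sounds, 200))              # a model with no shared structure

print("related model:  ", rsa_score(model_acts, voxels))
print("unrelated model:", rsa_score(unrelated, voxels))
```

The score is higher when two systems treat the same pairs of sounds as similar or different, even though the model units and the voxels are never matched one to one.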

In 2018, McDermott and then-graduate student Alexander Kell reported that when they trained a neural network to perform auditory tasks (such as recognizing words from an audio signal), the internal representations generated by the model showed similarity to those seen in fMRI scans of people listening to the same sounds.

Since then, these types of models have become widely used, so McDermott’s research group set out to evaluate a larger set of models, to see if the ability to approximate the neural representations seen in the human brain is a general trait of these models.

For this study, the researchers analyzed nine publicly available deep neural network models that had been trained to perform auditory tasks, and they also created 14 models of their own, based on two different architectures. Most of these models were trained to perform a single task — recognizing words, identifying the speaker, recognizing environmental sounds, and identifying musical genre — while two of them were trained to perform multiple tasks.

When the researchers presented these models with natural sounds that had been used as stimuli in human fMRI experiments, they found that the internal model representations tended to exhibit similarity with those generated by the human brain. The models whose representations were most similar to those seen in the brain were models that had been trained on more than one task and had been trained on auditory input that included background noise.

“If you train models in noise, they give better brain predictions than if you don’t, which is intuitively reasonable because a lot of real-world hearing involves hearing in noise, and that’s plausibly something the auditory system is adapted to,” Feather says.

Hierarchical processing

The new study also supports the idea that the human auditory cortex has some degree of hierarchical organization, in which processing is divided into stages that support distinct computational functions. As in the 2018 study, the researchers found that representations generated in earlier stages of the model most closely resemble those seen in the primary auditory cortex, while representations generated in later model stages more closely resemble those generated in brain regions beyond the primary cortex.

Additionally, the researchers found that models that had been trained on different tasks were better at replicating different aspects of audition. For example, models trained on a speech-related task more closely resembled speech-selective areas.

“Even though the model has seen the exact same training data and the architecture is the same, when you optimize for one particular task, you can see that it selectively explains specific tuning properties in the brain,” Tuckute says.

McDermott’s lab now plans to make use of their findings to try to develop models that are even more successful at reproducing human brain responses. In addition to helping scientists learn more about how the brain may be organized, such models could also be used to help develop better hearing aids, cochlear implants, and brain-machine interfaces.

“A goal of our field is to end up with a computer model that can predict brain responses and behavior. We think that if we are successful in reaching that goal, it will open a lot of doors,” McDermott says.

The research was funded by the National Institutes of Health, an Amazon Fellowship from the Science Hub, an International Doctoral Fellowship from the American Association of University Women, an MIT Friends of McGovern Institute Fellowship, a fellowship from the K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center at MIT, and a Department of Energy Computational Science Graduate Fellowship.

Study: Deep neural networks don’t see the world the way we do

Human sensory systems are very good at recognizing objects that we see or words that we hear, even if the object is upside down or the word is spoken by a voice we’ve never heard.

Computational models known as deep neural networks can be trained to do the same thing, correctly identifying an image of a dog regardless of what color its fur is, or a word regardless of the pitch of the speaker’s voice. However, a new study from MIT neuroscientists has found that these models often also respond the same way to images or words that have no resemblance to the target.

When these neural networks were used to generate an image or a word that they responded to in the same way as a specific natural input, such as a picture of a bear, most of them generated images or sounds that were unrecognizable to human observers. This suggests that these models build up their own idiosyncratic “invariances” — meaning that they respond the same way to stimuli with very different features.

The findings offer a new way for researchers to evaluate how well these models mimic the organization of human sensory perception, says Josh McDermott, an associate professor of brain and cognitive sciences at MIT and a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines.

“This paper shows that you can use these models to derive unnatural signals that end up being very diagnostic of the representations in the model,” says McDermott, who is the senior author of the study. “This test should become part of a battery of tests that we as a field are using to evaluate models.”

Jenelle Feather PhD ’22, who is now a research fellow at the Flatiron Institute Center for Computational Neuroscience, is the lead author of the open-access paper, which appears today in Nature Neuroscience. Guillaume Leclerc, an MIT graduate student, and Aleksander Mądry, the Cadence Design Systems Professor of Computing at MIT, are also authors of the paper.

Different perceptions

In recent years, researchers have trained deep neural networks that can analyze millions of inputs (sounds or images) and learn common features that allow them to classify a target word or object roughly as accurately as humans do. These models are currently regarded as the leading models of biological sensory systems.

It is believed that when the human sensory system performs this kind of classification, it learns to disregard features that aren’t relevant to an object’s core identity, such as how much light is shining on it or what angle it’s being viewed from. This is known as invariance, meaning that objects are perceived to be the same even if they show differences in those less important features.

“Classically, the way that we have thought about sensory systems is that they build up invariances to all those sources of variation that different examples of the same thing can have,” Feather says. “An organism has to recognize that they’re the same thing even though they show up as very different sensory signals.”

The researchers wondered if deep neural networks that are trained to perform classification tasks might develop similar invariances. To try to answer that question, they used these models to generate stimuli that produce the same kind of response within the model as an example stimulus given to the model by the researchers.

They term these stimuli “model metamers,” reviving an idea from classical perception research whereby stimuli that are indistinguishable to a system can be used to diagnose its invariances. The concept of metamers was originally developed in the study of human perception to describe colors that look identical even though they are made up of different wavelengths of light.
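The general recipe can be sketched with a toy model: start from noise and adjust the input by gradient descent until the model's activations match those evoked by a reference stimulus. Everything in the block below is an assumption made for illustration (a single random layer with a leaky ReLU, the sizes, the step size); the networks in the study are far larger, and this is not the authors' code.

```python
import numpy as np


def model_activations(W, x, slope=0.3):
    """Toy one-layer 'model': linear projection followed by a leaky ReLU."""
    z = W @ x
    return np.where(z > 0, z, slope * z), z


def synthesize_metamer(W, reference, rng, steps=2000, lr=0.5, slope=0.3):
    """Gradient descent on the input so its activations match the reference's."""
    target, _ = model_activations(W, reference, slope)
    x = rng.normal(size=reference.shape)           # start from random noise
    for _ in range(steps):
        acts, z = model_activations(W, x, slope)
        dact_dz = np.where(z > 0, 1.0, slope)      # derivative of the leaky ReLU
        grad = W.T @ ((acts - target) * dact_dz)   # gradient of 0.5 * ||acts - target||^2
        x -= lr * grad
    return x


rng = np.random.default_rng(0)
n_in, n_units = 256, 16                            # many more inputs than units
W = rng.normal(size=(n_units, n_in)) / np.sqrt(n_in)
reference = rng.normal(size=n_in)
metamer = synthesize_metamer(W, reference, rng)

act_gap = np.linalg.norm(
    model_activations(W, metamer)[0] - model_activations(W, reference)[0]
)
input_gap = np.linalg.norm(metamer - reference)
print(f"activation mismatch: {act_gap:.2e}, input difference: {input_gap:.2f}")
```

Because this toy model has far fewer units than input dimensions, many very different inputs produce the same activations. The synthesized input matches the reference as far as the model is concerned while remaining far from it in input space, which is the kind of invariance a metamer exposes.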

To their surprise, the researchers found that most of the images and sounds produced in this way looked and sounded nothing like the examples that the models were originally given. Most of the images were a jumble of random-looking pixels, and the sounds resembled unintelligible noise. When researchers showed the images to human observers, in most cases the humans did not classify the images synthesized by the models in the same category as the original target example.

“They’re really not recognizable at all by humans. They don’t look or sound natural and they don’t have interpretable features that a person could use to classify an object or word,” Feather says.

The findings suggest that the models have somehow developed their own invariances that are different from those found in human perceptual systems. This causes the models to perceive pairs of stimuli as being the same despite their being wildly different to a human.

Idiosyncratic invariances

The researchers found the same effect across many different vision and auditory models. However, each of these models appeared to develop its own unique invariances. When metamers from one model were shown to another model, the metamers were just as unrecognizable to the second model as they were to human observers.

“The key inference from that is that these models seem to have what we call idiosyncratic invariances,” McDermott says. “They have learned to be invariant to these particular dimensions in the stimulus space, and it’s model-specific, so other models don’t have those same invariances.”

The researchers also found that they could induce a model’s metamers to be more recognizable to humans by using an approach called adversarial training. This approach was originally developed to combat another limitation of object recognition models, which is that introducing tiny, almost imperceptible changes to an image can cause the model to misrecognize it.

The researchers found that adversarial training, which involves including some of these slightly altered images in the training data, yielded models whose metamers were more recognizable to humans, though they were still not as recognizable as the original stimuli. This improvement appears to be independent of the training’s effect on the models’ ability to resist adversarial attacks, the researchers say.
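The "tiny, almost imperceptible changes" that adversarial training guards against can be illustrated on a toy classifier. The sketch below uses the fast gradient sign method on a logistic-regression model. That choice is an assumption: the article does not say which attack or which models were used, and the weights and input here are random stand-ins.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm(x, y, w, b, eps):
    """Fast gradient sign method for a logistic-regression classifier."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                 # gradient of the cross-entropy loss w.r.t. the input
    return x + eps * np.sign(grad_x)     # move every input value by at most eps


rng = np.random.default_rng(0)
n_pixels = 1000
w = rng.normal(size=n_pixels)            # hypothetical trained weights
b = 0.0
x = rng.normal(size=n_pixels)            # hypothetical input
x += (2.0 - w @ x) * w / (w @ w)         # shift x so its logit is +2 (confidently class 1)

x_adv = fgsm(x, y=1.0, w=w, b=b, eps=0.01)
print("clean prediction:      ", sigmoid(w @ x + b))
print("adversarial prediction:", sigmoid(w @ x_adv + b))
print("largest single change: ", np.max(np.abs(x_adv - x)))
```

No input value moves by more than 0.01, yet the many small shifts add up and flip the toy classifier's decision. Adversarial training, as described above, adds examples like `x_adv` with their correct labels to the training data.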

“This particular form of training has a big effect, but we don’t really know why it has that effect,” Feather says. “That’s an area for future research.”

Analyzing the metamers produced by computational models could be a useful tool to help evaluate how closely a computational model mimics the underlying organization of human sensory perception systems, the researchers say.

“This is a behavioral test that you can run on a given model to see whether the invariances are shared between the model and human observers,” Feather says. “It could also be used to evaluate how idiosyncratic the invariances are within a given model, which could help uncover potential ways to improve our models in the future.”

The research was funded by the National Science Foundation, the National Institutes of Health, a Department of Energy Computational Science Graduate Fellowship, and a Friends of the McGovern Institute Fellowship.

Understanding reality through algorithms

Although Fernanda De La Torre still has several years left in her graduate studies, she’s already dreaming big when it comes to what the future has in store for her.

“I dream of opening up a school one day where I could bring this world of understanding of cognition and perception into places that would never have contact with this,” she says.

It’s that kind of ambitious thinking that’s gotten De La Torre, a doctoral student in MIT’s Department of Brain and Cognitive Sciences, to this point. A recent recipient of the prestigious Paul and Daisy Soros Fellowship for New Americans, De La Torre has found at MIT a supportive, creative research environment that’s allowed her to delve into the cutting-edge science of artificial intelligence. But she’s still driven by an innate curiosity about human imagination and a desire to bring that knowledge to the communities in which she grew up.

An unconventional path to neuroscience

De La Torre’s first exposure to neuroscience wasn’t in the classroom, but in her daily life. As a child, she watched her younger sister struggle with epilepsy. At 12, she crossed into the United States from Mexico illegally to reunite with her mother, a move that exposed her to a whole new language and culture. Once in the States, she had to grapple with her mother’s shifting personality in the midst of an abusive relationship. “All of these different things I was seeing around me drove me to want to better understand how psychology works,” De La Torre says, “to understand how the mind works, and how it is that we can all be in the same environment and feel very different things.”

But finding an outlet for that intellectual curiosity was challenging. As an undocumented immigrant, her access to financial aid was limited. Her high school was also underfunded and lacked elective options. Mentors along the way, though, encouraged the aspiring scientist, and through a program at her school, she was able to take community college courses to fulfill basic educational requirements.

It took an inspiring amount of dedication to her education, but De La Torre made it to Kansas State University for her undergraduate studies, where she majored in computer science and math. At Kansas State, she was able to get her first real taste of research. “I was just fascinated by the questions they were asking and this entire space I hadn’t encountered,” says De La Torre of her experience working in a visual cognition lab and discovering the field of computational neuroscience.

Although Kansas State didn’t have a dedicated neuroscience program, her research experience in cognition led her to a machine learning lab led by William Hsu, a computer science professor. There, De La Torre became enamored by the possibilities of using computation to model the human brain. Hsu’s support also convinced her that a scientific career was a possibility. “He always made me feel like I was capable of tackling big questions,” she says fondly.

With the confidence imparted in her at Kansas State, De La Torre came to MIT in 2019 as a post-baccalaureate student in the lab of Tomaso Poggio, the Eugene McDermott Professor of Brain and Cognitive Sciences and an investigator at the McGovern Institute for Brain Research. With Poggio, also the director of the Center for Brains, Minds and Machines, De La Torre began working on deep-learning theory, an area of machine learning focused on how artificial neural networks modeled on the brain can learn to recognize patterns.

“It’s a very interesting question because we’re starting to use them everywhere,” says De La Torre of neural networks, listing off examples from self-driving cars to medicine. “But, at the same time, we don’t fully understand how these networks can go from knowing nothing and just being a bunch of numbers to outputting things that make sense.”

Her experience as a post-bac was De La Torre’s first real opportunity to apply the technical computer skills she developed as an undergraduate to neuroscience. It was also the first time she could fully focus on research. “That was the first time that I had access to health insurance and a stable salary. That was, in itself, sort of life-changing,” she says. “But on the research side, it was very intimidating at first. I was anxious, and I wasn’t sure that I belonged here.”

Fortunately, De La Torre says she was able to overcome those insecurities, both through a growing unabashed enthusiasm for the field and through the support of Poggio and her other colleagues in MIT’s Department of Brain and Cognitive Sciences. When the opportunity came to apply to the department’s PhD program, she jumped on it. “It was just knowing these kinds of mentors are here and that they cared about their students,” says De La Torre of her decision to stay on at MIT for graduate studies. “That was really meaningful.”

Expanding notions of reality and imagination

In her two years so far in the graduate program, De La Torre’s work has expanded the understanding of neural networks and their applications to the study of the human brain. Working with Guangyu Robert Yang, an associate investigator at the McGovern Institute and an assistant professor in the departments of Brain and Cognitive Sciences and Electrical Engineering and Computer Sciences, she’s engaged in what she describes as more philosophical questions about how one develops a sense of self as an independent being. She’s interested in how that self-consciousness develops and why it might be useful.

De La Torre’s primary advisor, though, is Professor Josh McDermott, who leads the Laboratory for Computational Audition. With McDermott, De La Torre is attempting to understand how the brain integrates vision and sound. While combining sensory inputs may seem like a basic process, there are many unanswered questions about how our brains combine multiple signals into a coherent impression, or percept, of the world. Many of the questions are raised by audiovisual illusions in which what we hear changes what we see. For example, if one sees a video of two discs passing each other, but the clip contains the sound of a collision, the brain will perceive that the discs are bouncing off, rather than passing through each other. Given an ambiguous image, that simple auditory cue is all it takes to create a different perception of reality.

There’s something interesting happening where our brains are receiving two signals telling us different things and, yet, we have to combine them somehow to make sense of the world.

De La Torre is using behavioral experiments to probe how the human brain makes sense of multisensory cues to construct a particular perception. To do so, she’s created various scenes of objects interacting in 3D space over different sounds, asking research participants to describe characteristics of the scene. For example, in one experiment, she combines visuals of a block moving across a surface at different speeds with various scraping sounds, asking participants to estimate how rough the surface is. Eventually she hopes to take the experiment into virtual reality, where participants will physically push blocks in response to how rough they perceive the surface to be, rather than just reporting on what they experience.

Once she’s collected data, she’ll move into the modeling phase of the research, evaluating whether multisensory neural networks perceive illusions the way humans do. “What we want to do is model exactly what’s happening,” says De La Torre. “How is it that we’re receiving these two signals, integrating them and, at the same time, using all of our prior knowledge and inferences of physics to really make sense of the world?”

Although her two strands of research with Yang and McDermott may seem distinct, she sees clear connections between the two. Both projects are about grasping what artificial neural networks are capable of and what they tell us about the brain. At a more fundamental level, she says that how the brain perceives the world from different sensory cues might be part of what gives people a sense of self. Sensory perception is about constructing a cohesive, unitary sense of the world from multiple sources of sensory data. Similarly, she argues, “the sense of self is really a combination of actions, plans, goals, emotions, all of these different things that are components of their own, but somehow create a unitary being.”

It’s a fitting sentiment for De La Torre, who has been working to make sense of and integrate different aspects of her own life. Working in the Computational Audition lab, for example, she’s started experimenting with combining electronic music with folk music from her native Mexico, connecting her “two worlds,” as she says. Having the space to undertake those kinds of intellectual explorations, and colleagues who encourage it, is one of De La Torre’s favorite parts of MIT.

“Beyond professors, there’s also a lot of students whose way of thinking just amazes me,” she says. “I see a lot of goodness and excitement for science and a little bit of — it’s not nerdiness, but a love for very niche things — and I just kind of love that.”

How do illusions trick the brain?

As part of our Ask the Brain series, Jarrod Hicks, a graduate student in Josh McDermott’s lab, and Dana Boebinger, a postdoctoral researcher at the University of Rochester (and former graduate student in Josh McDermott’s lab), answer the question, “How do illusions trick the brain?”


Graduate student Jarrod Hicks studies how the brain processes sound. Photo: M.E. Megan Hicks

Imagine you’re a detective. Your job is to visit a crime scene, observe some evidence, and figure out what happened. However, there are often multiple stories that could have produced the evidence you observe. Thus, to solve the crime, you can’t just rely on the evidence in front of you – you have to use your knowledge about the world to make your best guess about the most likely sequence of events. For example, if you discover cat hair at the crime scene, your prior knowledge about the world tells you it’s unlikely that a cat is the culprit. Instead, a more likely explanation is that the culprit might have a pet cat.

Although it might not seem like it, this kind of detective work is what your brain is doing all the time. As your senses send information to your brain about the world around you, your brain plays the role of detective, piecing together each bit of information to figure out what is happening in the world. The information from your senses usually paints a pretty good picture of things, but sometimes when this information is incomplete or unclear, your brain is left to fill in the missing pieces with its best guess of what should be there. This means that what you experience isn’t actually what’s out there in the world, but rather what your brain thinks is out there. The consequence of this is that your perception of the world can depend on your experience and assumptions.
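The detective's reasoning can be written down as Bayes' rule: multiply how plausible each story was beforehand (the prior) by how well it explains the evidence (the likelihood), then renormalize. The sketch below is only an illustration of that idea. The three hypotheses and every number in it are made-up assumptions, not measurements, and it is not a claim about how the brain actually computes.

```python
def posterior(priors, likelihoods):
    """Combine prior beliefs with evidence using Bayes' rule."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypothetical prior beliefs about who committed the crime (sum to 1).
priors = {
    "a cat is the culprit": 0.001,
    "human culprit with a pet cat": 0.300,
    "human culprit without a cat": 0.699,
}

# Hypothetical probability of finding cat hair under each story.
likelihoods = {
    "a cat is the culprit": 0.90,
    "human culprit with a pet cat": 0.80,
    "human culprit without a cat": 0.05,
}

beliefs = posterior(priors, likelihoods)
best_guess = max(beliefs, key=beliefs.get)

for hypothesis, probability in beliefs.items():
    print(f"{hypothesis}: {probability:.3f}")
print("Best guess:", best_guess)
```

Even though cat hair is very likely if a cat did it, the tiny prior keeps that story improbable, so the best guess is a human culprit who owns a cat.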

Optical illusions

Optical illusions are a great way of showing how our expectations and assumptions affect what we perceive. For example, look at the squares labeled “A” and “B” in the image below.

Checkershadow illusion. Image: Edward H. Adelson

Is one of them lighter than the other? Although most people would agree that the square labeled “B” is much lighter than the one labeled “A,” the two squares are actually the exact same color. You perceive the squares differently because your brain knows, from experience, that shadows tend to make things appear darker than they actually are. So, despite the squares being physically identical, your brain thinks “B” should be lighter.

Auditory illusions

Tricks of perception are not limited to optical illusions. There are also several dramatic examples of how our expectations influence what we hear. For example, listen to the mystery sound below. What do you hear?

Mystery sound

Because you’ve probably never heard a sound quite like this before, your brain has very little idea about what to expect. So, although you clearly hear something, it might be very difficult to make out exactly what that something is. This mystery sound is something called sine-wave speech, and what you’re hearing is essentially a very degraded sound of someone speaking.

Now listen to a “clean” version of this speech in the audio clip below:

Clean speech

You probably hear a person saying, “the floor was quite slippery.” Now listen to the mystery sound above again. After listening to the original audio, your brain has a strong expectation about what you should hear when you listen to the mystery sound again. Even though you’re hearing the exact same mystery sound as before, you experience it completely differently. (Audio clips courtesy of the University of Sussex.)


Dana Boebinger describes the science of illusions in this McGovern Minute.

Subjective perceptions

These illusions have been specifically designed by scientists to fool your brain and reveal principles of perception. However, there are plenty of real-life situations in which your perceptions strongly depend on expectations and assumptions. For example, imagine you’re watching TV when someone begins to speak to you from another room. Because the noise from the TV makes it difficult to hear the person speaking, your brain might have to fill in the gaps to understand what’s being said. In this case, different expectations about what is being said could cause you to hear completely different things.

Which phrase do you hear?

Listen to the clip below to hear a repeating loop of speech. As the sound plays, try to listen for one of the phrases listed in teal below.

Because the audio is somewhat ambiguous, the phrase you perceive depends on which phrase you listen for. So even though it’s the exact same audio each time, you can perceive something totally different! (Note: the original audio recording is from a football game in which the fans were chanting, “that is embarrassing!”)

Illusions like the ones above are great reminders of how subjective our perceptions can be. In order to make sense of the messy information coming in from our senses, our brains are constantly trying to fill in the blanks with their best guess of what’s out there. Because of this guesswork, our perceptions depend on our experiences, leading each of us to perceive and interact with the world in a way that’s uniquely ours.

Jarrod Hicks is a PhD candidate in the Department of Brain and Cognitive Sciences at MIT working with Josh McDermott in the Laboratory for Computational Audition. He studies sound segregation, a key aspect of real-world hearing in which a sound source of interest is estimated amid a mixture of competing sources. He is broadly interested in teaching/outreach, psychophysics, computational approaches to represent stimulus spaces, and neural coding of high-level sensory representations.



Three from MIT awarded 2022 Paul and Daisy Soros Fellowships for New Americans

MIT graduate student Fernanda De La Torre, alumna Trang Luu ’18, SM ’20, and senior Syamantak Payra are recipients of the 2022 Paul and Daisy Soros Fellowships for New Americans.

De La Torre, Luu, and Payra are among 30 New Americans selected from a pool of over 1,800 applicants. The fellowship honors the contributions of immigrants and children of immigrants by providing $90,000 in funding for graduate school.

Students interested in applying to the P.D. Soros Fellowship for future years may contact Kim Benard, associate dean of distinguished fellowships in Career Advising and Professional Development.

Fernanda De La Torre

Fernanda De La Torre is a PhD student in the Department of Brain and Cognitive Sciences. With Professor Josh McDermott, she studies how we integrate vision and sound, and with Professor Robert Yang, she develops computational models of imagination.

De La Torre spent her early childhood with her younger sister and grandmother in Guadalajara, Mexico. At age 12, she crossed the Mexican border to reunite with her mother in Kansas City, Missouri. Shortly after, an abusive home environment forced De La Torre to leave her family and support herself throughout her early teens.

Despite her difficult circumstances, De La Torre excelled academically in high school. By winning various scholarships that would discreetly take applications from undocumented students, she was able to continue her studies in computer science and mathematics at Kansas State University. There, she became intrigued by the mysteries of the human mind. During college, De La Torre received invaluable mentorship from her former high school principal, Thomas Herrera, who helped her become documented through the Violence Against Women Act. Her college professor, William Hsu, supported her interests in artificial intelligence and encouraged her to pursue a scientific career.

After her undergraduate studies, De La Torre won a post-baccalaureate fellowship from the Department of Brain and Cognitive Sciences at MIT, where she worked with Professor Tomaso Poggio on the theory of deep learning. She then transitioned into the department’s PhD program. Beyond contributing to scientific knowledge, De La Torre plans to use science to create spaces where all people, including those from backgrounds like her own, can innovate and thrive.

She says: “Immigrants face many obstacles, but overcoming them gives us a unique strength: We learn to become resilient, while relying on friends and mentors. These experiences foster both the desire and the ability to pay it forward to our community.”

Trang Luu

Trang Luu graduated from MIT with a BS in mechanical engineering in 2018, and a master of engineering degree in 2020. Her Soros award will support her graduate studies at Harvard University in the MBA/MS engineering sciences program.

Born in Saigon, Vietnam, Luu was 3 when her family immigrated to Houston, Texas. Watching her parents’ efforts to make a living in a land where they did not understand the culture or speak the language well, Luu wanted to alleviate hardship for her family. She took full responsibility for her education and found mentors to help her navigate the American education system. At home, she assisted her family in making and repairing household items, which fueled her excitement for engineering.

As an MIT undergraduate, Luu focused on assistive technology projects, applying her engineering background to solve problems impeding daily living. These projects included a new adaptive socket liner for below-the-knee amputees in Kenya, Ethiopia, and Thailand; a walking stick adapter for wheelchairs; a computer head pointer for patients with limited arm mobility; a safer makeshift cook stove design for street vendors in South Africa; and a quicker method to test new drip irrigation designs. As a graduate student in MIT D-Lab under the direction of Professor Daniel Frey, Luu was awarded a National Science Foundation Graduate Research Fellowship. In her graduate studies, Luu researched methods to improve evaporative cooling devices for off-grid farmers to reduce rapid fruit and vegetable deterioration.

These projects strengthened Luu’s commitment to innovating new technology and devices for people struggling with basic daily tasks. During her senior year, Luu collaborated on developing a working prototype of a wearable device that noninvasively reduces hand tremors associated with Parkinson’s disease or essential tremor. Observing patients’ joy after their tremors stopped compelled Luu and three co-founders to continue developing the device after college. Four years later, Encora Therapeutics has accomplished major milestones, including Breakthrough Device designation by the U.S. Food and Drug Administration.

Syamantak Payra

Hailing from Houston, Texas, Syamantak Payra is a senior majoring in electrical engineering and computer science, with minors in public policy and entrepreneurship and innovation. He will be pursuing a PhD in engineering at Stanford University, with the goal of creating new biomedical devices that can help improve daily life for patients worldwide and enhance health care outcomes for decades to come.

Payra’s parents had emigrated from India, and he grew up immersed in his grandparents’ rich Bengali culture. As a high school student, he conducted projects with NASA engineers at Johnson Space Center, experimented at home with his scientist parents, and competed in spelling bees and science fairs across the United States. Through these avenues and activities, Syamantak not only gained perspectives on bridging gaps between people, but also found passions for language, scientific discovery, and teaching others.

After watching his grandmother struggle with asthma and chronic obstructive pulmonary disease and losing his baby brother to brain cancer, Payra devoted himself to trying to use technology to solve health-care challenges. Payra’s proudest accomplishments include building a robotic leg brace for his paralyzed teacher and conducting free literacy workshops and STEM outreach programs that reached nearly a thousand underprivileged students across the Greater Houston Area.

At MIT, Payra has worked in Professor Yoel Fink’s research laboratory, creating digital sensor fibers that have been woven into intelligent garments that can assist in diagnosing illnesses, and in Professor Joseph Paradiso’s research laboratory, where he contributed to next-generation spacesuit prototypes that better protect astronauts on spacewalks. Payra’s research has been published by multiple scientific journals, and he was inducted into the National Gallery of America’s Young Inventors.

Where did that sound come from?

The human brain is finely tuned not only to recognize particular sounds, but also to determine which direction they came from. By comparing differences in sounds that reach the right and left ear, the brain can estimate the location of a barking dog, wailing fire engine, or approaching car.

MIT neuroscientists have now developed a computer model that can also perform that complex task. The model, which consists of several convolutional neural networks, not only performs the task as well as humans do, it also struggles in the same ways that humans do.

“We now have a model that can actually localize sounds in the real world,” says Josh McDermott, an associate professor of brain and cognitive sciences and a member of MIT’s McGovern Institute for Brain Research. “And when we treated the model like a human experimental participant and simulated this large set of experiments that people had tested humans on in the past, what we found over and over again is that the model recapitulates the results that you see in humans.”

Findings from the new study also suggest that humans’ ability to perceive location is adapted to the specific challenges of our environment, says McDermott, who is also a member of MIT’s Center for Brains, Minds, and Machines.

McDermott is the senior author of the paper, which appears today in Nature Human Behaviour. The paper’s lead author is MIT graduate student Andrew Francl.

Modeling localization

When we hear a sound such as a train whistle, the sound waves reach our right and left ears at slightly different times and intensities, depending on what direction the sound is coming from. Parts of the midbrain are specialized to compare these slight differences to help estimate what direction the sound came from, a task also known as localization.
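To make the two cues concrete, here is a minimal sketch of how a timing difference and an intensity difference between the ears could be measured from a pair of recordings. It uses a plain cross-correlation search, which is a textbook signal-processing technique and not the neural network model described in this article. The sample rate, the 13-sample delay, the halved amplitude, and the function names are all assumptions chosen for illustration.

```python
import math

def estimate_itd(left, right, sample_rate, max_lag):
    """Delay of `right` relative to `left`, in seconds, found by
    picking the lag with the largest cross-correlation.
    A positive value means the sound reached the left ear first."""
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

def rms(signal):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def estimate_ild_db(left, right):
    """Level difference (left relative to right) in decibels."""
    return 20 * math.log10(rms(left) / rms(right))

# Hypothetical test signal: a short 500 Hz tone burst.
SAMPLE_RATE = 44100
DELAY_SAMPLES = 13  # about 0.3 milliseconds
source = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    * math.exp(-(((t - 400) / 100) ** 2))
    for t in range(1000)
]

# The right ear hears the burst later and at half the amplitude.
left = source
right = [0.0] * DELAY_SAMPLES + [0.5 * v for v in source[:-DELAY_SAMPLES]]

itd = estimate_itd(left, right, SAMPLE_RATE, max_lag=40)
ild = estimate_ild_db(left, right)
print(f"Timing difference: {itd * 1000:.3f} ms")
print(f"Level difference: {ild:.2f} dB")
```

In this toy case both cues point the same way: the sound arrives earlier and louder at the left ear, so the source is on the left.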

This task becomes markedly more difficult under real-world conditions — where the environment produces echoes and many sounds are heard at once.

Scientists have long sought to build computer models that can perform the same kind of calculations that the brain uses to localize sounds. These models sometimes work well in idealized settings with no background noise, but never in real-world environments, with their noises and echoes.

To develop a more sophisticated model of localization, the MIT team turned to convolutional neural networks. This kind of computer modeling has been used extensively to model the human visual system, and more recently, McDermott and other scientists have begun applying it to audition as well.

Convolutional neural networks can be designed with many different architectures, so to help them find the ones that would work best for localization, the MIT team used a supercomputer that allowed them to train and test about 1,500 different models. That search identified 10 that seemed the best-suited for localization, which the researchers further trained and used for all of their subsequent studies.

To train the models, the researchers created a virtual world in which they can control the size of the room and the reflection properties of the walls of the room. All of the sounds fed to the models originated from somewhere in one of these virtual rooms. The set of more than 400 training sounds included human voices, animal sounds, machine sounds such as car engines, and natural sounds such as thunder.

The researchers also ensured the model started with the same information provided by human ears. The outer ear, or pinna, has many folds that reflect sound, altering the frequencies that enter the ear, and these reflections vary depending on where the sound comes from. The researchers simulated this effect by running each sound through a specialized mathematical function before it went into the computer model.

“This allows us to give the model the same kind of information that a person would have,” Francl says.

After training the models, the researchers tested them in a real-world environment. They placed a mannequin with microphones in its ears in an actual room and played sounds from different directions, then fed those recordings into the models. The models performed very similarly to humans when asked to localize these sounds.

“Although the model was trained in a virtual world, when we evaluated it, it could localize sounds in the real world,” Francl says.

Similar patterns

The researchers then subjected the models to a series of tests that scientists have used in the past to study humans’ localization abilities.

In addition to analyzing the difference in arrival time at the right and left ears, the human brain also bases its location judgments on differences in the intensity of sound that reaches each ear. Previous studies have shown that the success of both of these strategies varies depending on the frequency of the incoming sound. In the new study, the MIT team found that the models showed this same pattern of sensitivity to frequency.

“The model seems to use timing and level differences between the two ears in the same way that people do, in a way that’s frequency-dependent,” McDermott says.

The researchers also showed that when they made localization tasks more difficult, by adding multiple sound sources played at the same time, the computer models’ performance declined in a way that closely mimicked human failure patterns under the same circumstances.

“As you add more and more sources, you get a specific pattern of decline in humans’ ability to accurately judge the number of sources present, and their ability to localize those sources,” Francl says. “Humans seem to be limited to localizing about three sources at once, and when we ran the same test on the model, we saw a really similar pattern of behavior.”

Because the researchers used a virtual world to train their models, they were also able to explore what happens when their model learned to localize in different types of unnatural conditions. The researchers trained one set of models in a virtual world with no echoes, and another in a world where there was never more than one sound heard at a time. In a third, the models were only exposed to sounds with narrow frequency ranges, instead of naturally occurring sounds.

When the models trained in these unnatural worlds were evaluated on the same battery of behavioral tests, the models deviated from human behavior, and the ways in which they failed varied depending on the type of environment they had been trained in. These results support the idea that the localization abilities of the human brain are adapted to the environments in which humans evolved, the researchers say.

The researchers are now applying this type of modeling to other aspects of audition, such as pitch perception and speech recognition, and believe it could also be used to understand other cognitive phenomena, such as the limits on what a person can pay attention to or remember, McDermott says.

The research was funded by the National Science Foundation and the National Institute on Deafness and Other Communication Disorders.

Perfecting pitch perception

New research from MIT neuroscientists suggests that natural soundscapes have shaped our sense of hearing, optimizing it for the kinds of sounds we most often encounter.

Mark Saddler, graduate fellow of the K. Lisa Yang Integrative Computational Neuroscience Center. Photo: Caitlin Cunningham

In a study reported December 14 in the journal Nature Communications, researchers led by McGovern Institute Associate Investigator Josh McDermott used computational modeling to explore factors that influence how humans hear pitch. Their model’s pitch perception closely resembled that of humans—but only when it was trained using music, voices, or other naturalistic sounds.

Humans’ ability to recognize pitch (essentially, the rate at which a sound repeats) gives melody to music and nuance to spoken language. Although this is arguably the best-studied aspect of human hearing, researchers are still debating which factors determine the properties of pitch perception, and why it is more acute for some types of sounds than others. McDermott, who is also an associate professor in MIT’s Department of Brain and Cognitive Sciences and an investigator with the Center for Brains, Minds, and Machines (CBMM), is particularly interested in understanding how our nervous system perceives pitch because cochlear implants, which send electrical signals about sound to the brain in people with profound deafness, don’t replicate this aspect of human hearing very well.

“Cochlear implants can do a pretty good job of helping people understand speech, especially if they’re in a quiet environment. But they really don’t reproduce the percept of pitch very well,” says Mark Saddler, a CBMM graduate student who co-led the project and an inaugural graduate fellow of the K. Lisa Yang Integrative Computational Neuroscience Center. “One of the reasons it’s important to understand the detailed basis of pitch perception in people with normal hearing is to try to get better insights into how we would reproduce that artificially in a prosthesis.”

Artificial hearing

Pitch perception begins in the cochlea, the snail-shaped structure in the inner ear where vibrations from sounds are transformed into electrical signals and relayed to the brain via the auditory nerve. The cochlea’s structure and function help determine how and what we hear. And although it hasn’t been possible to test this idea experimentally, McDermott’s team suspected our “auditory diet” might shape our hearing as well.

To explore how both our ears and our environment influence pitch perception, McDermott, Saddler and research assistant Ray Gonzalez built a computer model called a deep neural network. Neural networks are a type of machine learning model widely used in automatic speech recognition and other artificial intelligence applications. Although the structure of an artificial neural network coarsely resembles the connectivity of neurons in the brain, the models used in engineering applications don’t actually hear the same way humans do—so the team developed a new model to reproduce human pitch perception. Their approach combined an artificial neural network with an existing model of the mammalian ear, uniting the power of machine learning with insights from biology. “These new machine learning models are really the first that can be trained to do complex auditory tasks and actually do them well, at human levels of performance,” Saddler explains.

The researchers trained the neural network to estimate pitch by asking it to identify the repetition rate of sounds in a training set. This gave them the flexibility to change the parameters under which pitch perception developed. They could manipulate the types of sound they presented to the model, as well as the properties of the ear that processed those sounds before passing them on to the neural network.
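For a sense of what "identifying the repetition rate" means, the sketch below estimates it with a simple autocorrelation search: it looks for the time shift at which a sound best matches a copy of itself. This is a classic hand-built method shown only for illustration, not the trained neural network from the study. The sample rate, search range, and test tone (harmonics at 400, 600, and 800 Hz) are assumptions.

```python
import math

def estimate_repetition_rate(signal, sample_rate, f_min=80.0, f_max=500.0):
    """Estimate how often a sound repeats (in Hz) by finding the lag
    at which the signal is most similar to a shifted copy of itself."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            signal[i] * signal[i + lag] for i in range(len(signal) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Hypothetical test tone: 0.1 seconds containing only the 2nd, 3rd,
# and 4th harmonics of 200 Hz. There is no energy at 200 Hz itself.
SAMPLE_RATE = 16000
HARMONICS_HZ = [400.0, 600.0, 800.0]
tone = [
    sum(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f in HARMONICS_HZ)
    for t in range(1600)
]

estimated_rate = estimate_repetition_rate(tone, SAMPLE_RATE)
print(f"Estimated repetition rate: {estimated_rate:.1f} Hz")
```

The waveform repeats 200 times per second even though none of its components is at 200 Hz, so the estimate reflects the repetition rate of the whole sound and not its lowest frequency component.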

When the model was trained using sounds that are important to humans, like speech and music, it learned to estimate pitch much as humans do. “We very nicely replicated many characteristics of human perception…suggesting that it’s using similar cues from the sounds and the cochlear representation to do the task,” Saddler says.

But when the model was trained using more artificial sounds or in the absence of any background noise, its behavior was very different. For example, Saddler says, “If you optimize for this idealized world where there’s never any competing sources of noise, you can learn a pitch strategy that seems to be very different from that of humans, which suggests that perhaps the human pitch system was really optimized to deal with cases where sometimes noise is obscuring parts of the sound.”

The team also found the timing of nerve signals initiated in the cochlea to be critical to pitch perception. In a healthy cochlea, McDermott explains, nerve cells fire precisely in time with the sound vibrations that reach the inner ear. When the researchers skewed this relationship in their model, so that the timing of nerve signals was less tightly correlated to vibrations produced by incoming sounds, pitch perception deviated from normal human hearing. 

McDermott says it will be important to take this into account as researchers work to develop better cochlear implants. “It does very much suggest that for cochlear implants to produce normal pitch perception, there needs to be a way to reproduce the fine-grained timing information in the auditory nerve,” he says. “Right now, they don’t do that, and there are technical challenges to making that happen—but the modeling results really pretty clearly suggest that’s what you’ve got to do.”

Data transformed

With the tools of modern neuroscience, data accumulates quickly. Recording devices listen in on the electrical conversations between neurons, picking up the voices of hundreds of cells at a time. Microscopes zoom in to illuminate the brain’s circuitry, capturing thousands of images of cells’ elaborately branched paths. Functional MRIs detect changes in blood flow to map activity within a person’s brain, generating a complete picture by compiling hundreds of scans.

“When I entered neuroscience about 20 years ago, data were extremely precious, and ideas, as the expression went, were cheap. That’s no longer true,” says McGovern Associate Investigator Ila Fiete. “We have an embarrassment of wealth in the data but lack sufficient conceptual and mathematical scaffolds to understand it.”

Fiete will lead the McGovern Institute’s new K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center, whose scientists will create mathematical models and other computational tools to confront the current deluge of data and advance our understanding of the brain and mental health. The center, funded by a $24 million donation from philanthropist Lisa Yang, will take a uniquely collaborative approach to computational neuroscience, integrating data from MIT labs to explain brain function at every level, from the molecular to the behavioral.

“Driven by technologies that generate massive amounts of data, we are entering a new era of translational neuroscience research,” says Yang, whose philanthropic investment in MIT research now exceeds $130 million. “I am confident that the multidisciplinary expertise convened by this center will revolutionize how we synthesize this data and ultimately understand the brain in health and disease.”

Data integration

Fiete says computation is particularly crucial to neuroscience because the brain is so staggeringly complex. Its billions of neurons, which are themselves complicated and diverse, interact with one another through trillions of connections.

“Conceptually, it’s clear that all these interactions are going to lead to pretty complex things. And these are not going to be things that we can explain in stories that we tell,” Fiete says. “We really will need mathematical models. They will allow us to ask about what changes when we perturb one or several components — greatly accelerating the rate of discovery relative to doing those experiments in real brains.”

By representing the interactions between the components of a neural circuit, a model gives researchers the power to explore those interactions, manipulate them, and predict the circuit’s behavior under different conditions.
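As a toy illustration of perturbing a component in a model, the sketch below simulates a two-unit circuit, one excitatory and one inhibitory, and then repeats the simulation with the inhibitory unit disconnected. It is a generic firing-rate model written for this example, not one of the center's models. The connection weights, input, time step, and function name are all assumptions.

```python
def simulate_circuit(w_exc_to_inh, w_inh_to_exc=1.0, drive=1.0,
                     dt=0.01, steps=5000):
    """Simulate a two-unit firing-rate circuit with simple Euler steps.
    Returns the final (excitatory, inhibitory) rates."""
    r_exc, r_inh = 0.0, 0.0
    for _ in range(steps):
        # Each unit relaxes toward its rectified total input.
        input_exc = max(0.0, drive - w_inh_to_exc * r_inh)
        input_inh = max(0.0, w_exc_to_inh * r_exc)
        r_exc += dt * (-r_exc + input_exc)
        r_inh += dt * (-r_inh + input_inh)
    return r_exc, r_inh

# Intact circuit: the excitatory unit drives the inhibitory unit,
# which in turn suppresses the excitatory unit.
baseline = simulate_circuit(w_exc_to_inh=1.0)

# Perturbation: cut the connection that activates inhibition.
perturbed = simulate_circuit(w_exc_to_inh=0.0)

print(f"Excitatory rate, intact circuit: {baseline[0]:.3f}")
print(f"Excitatory rate, inhibition removed: {perturbed[0]:.3f}")
```

With these assumed settings, removing inhibition doubles the excitatory unit's steady activity, the kind of before-and-after comparison that is quick to run in a model.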

“You can observe these neurons in the same way that you would observe real neurons. But you can do even more, because you have access to all the neurons and you have access to all the connections and everything in the network,” explains computational neuroscientist and McGovern Associate Investigator Guangyu Robert Yang (no relation to Lisa Yang), who joined MIT as a junior faculty member in July 2021.

Many neuroscience models represent specific functions or parts of the brain. But with advances in computation and machine learning, along with the widespread availability of experimental data with which to test and refine models, “there’s no reason that we should be limited to that,” he says.

Robert Yang’s team at the McGovern Institute is working to develop models that integrate multiple brain areas and functions. “The brain is not just about vision, just about cognition, just about motor control,” he says. “It’s about all of these things. And all these areas, they talk to one another.” Likewise, he notes, it’s impossible to separate the molecules in the brain from their effects on behavior – although those aspects of neuroscience have traditionally been studied independently, by researchers with vastly different expertise.

The ICoN Center will eliminate the divides, bringing together neuroscientists and software engineers to deal with all types of data about the brain. To foster interdisciplinary collaboration, every postdoctoral fellow and engineer at the center will work with multiple faculty mentors. Working in three closely interacting scientific cores, fellows will develop computational technologies for analyzing molecular data, neural circuits, and behavior, such as tools to identify patterns in neural recordings or automate the analysis of human behavior to aid psychiatric diagnoses. These technologies will also help researchers model neural circuits, ultimately transforming data into knowledge and understanding.

“Lisa is focused on helping the scientific community realize its goals in translational research,” says Nergis Mavalvala, dean of the School of Science and the Curtis and Kathleen Marble Professor of Astrophysics. “With her generous support, we can accelerate the pace of research by connecting the data to the delivery of tangible results.”

Computational modeling

In its first five years, the ICoN Center will prioritize four areas of investigation: episodic memory and exploration, including functions like navigation and spatial memory; complex or stereotypical behavior, such as the perseverative behaviors associated with autism and obsessive-compulsive disorder; cognition and attention; and sleep. The goal, Fiete says, is to model the neuronal interactions that underlie these functions so that researchers can predict what will happen when something changes — when certain neurons become more active or when a genetic mutation is introduced, for example. When paired with experimental data from MIT labs, the center’s models will help explain not just how these circuits work, but also how they are altered by genes, the environment, aging, and disease.

These focus areas encompass circuits and behaviors often affected by psychiatric disorders and neurodegeneration, and models will give researchers new opportunities to explore their origins and potential treatment strategies. “I really think that the future of treating disorders of the mind is going to run through computational modeling,” says McGovern Associate Investigator Josh McDermott.

In McDermott’s lab, researchers are modeling the brain’s auditory circuits. “If we had a perfect model of the auditory system, we would be able to understand why when somebody loses their hearing, auditory abilities degrade in the very particular ways in which they degrade,” he says. Then, he says, that model could be used to optimize hearing aids by predicting how the brain would interpret sound altered in various ways by the device.

Similar opportunities will arise as researchers model other brain systems, McDermott says, noting that computational models help researchers grapple with a dauntingly vast realm of possibilities. “There’s lots of different ways the brain can be set up, and lots of different potential treatments, but there is a limit to the number of neuroscience or behavioral experiments you can run,” he says. “Doing experiments on a computational system is cheap, so you can explore the dynamics of the system in a very thorough way.”

The ICoN Center will speed the development of the computational tools that neuroscientists need, both for basic understanding of the brain and clinical advances. But Fiete hopes for a culture shift within neuroscience, as well. “There are a lot of brilliant students and postdocs who have skills that are mathematics and computational and modeling based,” she says. “I think once they know that there are these possibilities to collaborate to solve problems related to psychiatric disorders and how we think, they will see that this is an exciting place to apply their skills, and we can bring them in.”

Josh McDermott seeks to replicate the human auditory system

The human auditory system is a marvel of biology. It can follow a conversation in a noisy restaurant, learn to recognize words from languages we’ve never heard before, and identify a familiar colleague by their footsteps as they walk by our office.

So far, even the most sophisticated computational models cannot perform such tasks as well as the human auditory system, but MIT neuroscientist Josh McDermott hopes to change that. Achieving this goal would be a major step toward developing new ways to help people with hearing loss, says McDermott, who recently earned tenure in MIT’s Department of Brain and Cognitive Sciences.

“Our long-term goal is to build good predictive models of the auditory system,” McDermott says.

“If we were successful in that goal, then it would really transform our ability to make people hear better, because we could design a computer program to figure out what to do to incoming sound to make it easier to recognize what somebody said or where a sound is coming from.”

McDermott’s lab also explores how exposure to different types of music affects people’s music preferences and even how they perceive music. Such studies can help to reveal elements of sound perception that are “hardwired” into our brains, and other elements that are influenced by exposure to different kinds of sounds.

“We have found that there is cross-cultural variation in things that people had widely supposed were universal and possibly even innate,” McDermott says.

Sound perception

As an undergraduate at Harvard University, McDermott originally planned to study math and physics, but “I was very quickly seduced by the brain,” he says. At the time, Harvard did not offer a major in neuroscience, so McDermott created his own, with a focus on vision.

After earning a master’s degree from University College London, he came to MIT to do a PhD in brain and cognitive sciences. His focus was still on vision, which he studied with Ted Adelson, the John and Dorothy Wilson Professor of Vision Science, but he found himself increasingly interested in audition. He had always loved music, and around this time, he started working as a radio and club DJ. “I was spending a lot of time thinking about sound and why things sound the way they do,” he recalls.

To pursue his new interest, he served as a postdoc at the University of Minnesota, where he worked in a lab devoted to psychoacoustics — the study of how humans perceive sound. There, he studied auditory phenomena such as the “cocktail party effect,” or the ability to focus on a particular person’s voice while tuning out background noise. During another postdoc at New York University, he started working on computational models of the auditory system. That interest in computation is part of what drew him back to MIT as a faculty member, in 2013.

“The culture here surrounding brain and cognitive science really prioritizes and values computation, and that was a perspective that was important to me,” says McDermott, who is also a member of MIT’s McGovern Institute for Brain Research and the Center for Brains, Minds and Machines. “I knew that was the kind of work I really wanted to do in my lab, so it just felt like a natural environment for doing that work.”

One aspect of audition that McDermott’s lab focuses on is “auditory scene analysis,” which includes tasks such as inferring what events in the environment caused a particular sound, and determining where a particular sound came from. This requires the ability to disentangle sounds produced by different events or objects, and the ability to tease out the effects of the environment. For instance, a basketball bouncing on a hardwood floor in a gym makes a different sound than a basketball bouncing on an outdoor paved court.

“Sounds in the world have very particular properties, due to physics and the way that the world works,” McDermott says. “We believe that the brain internalizes those regularities, and you have models in your head of the way that sound is generated. When you hear something, you are performing an inference in that model to figure out what is likely to have happened that caused the sound.”

A better understanding of how the brain does this may eventually lead to new strategies to enhance human hearing, McDermott says.

“Hearing impairment is the most common sensory disorder. It affects almost everybody as they get older, and the treatments are OK, but they’re not great,” he says. “We’re eventually going to all have personalized hearing aids that we walk around with, and we just need to develop the right algorithms in order to tell them what to do. That’s something we’re actively working on.”

Music in the brain

About 10 years ago, when McDermott was a postdoc, he started working on cross-cultural studies of how the human brain perceives music. Richard Godoy, an anthropologist at Brandeis University, asked McDermott to join him for some studies of the Tsimane’ people, who live in the Amazon rainforest. Since then, McDermott and some of his students have gone to Bolivia most summers to study sound perception among the Tsimane’. The Tsimane’ have had very little exposure to Western music, making them ideal subjects to study how listening to certain kinds of music influences human sound perception.

These studies have revealed both differences and similarities between Westerners and the Tsimane’ people. McDermott, who counts soul, disco, and jazz-funk among his favorite types of music, has found that Westerners and the Tsimane’ differ in their perceptions of dissonance. To Western ears, for example, the combination of C and F# sounds very unpleasant, but not to the Tsimane’.

He has also shown that people in Western society perceive sounds that are separated by an octave to be similar, but the Tsimane’ do not. However, there are also some similarities between the two groups. For example, the upper limit of frequencies that can be perceived appears to be the same regardless of music exposure.
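
For readers who want the numbers behind the two intervals mentioned above, here is a minimal sketch. It assumes standard 12-tone equal temperament, which is an illustrative assumption and not something the studies are stated to have used, and the function name is hypothetical. C and F# are six semitones apart, while notes an octave apart are twelve semitones apart.

```python
# Illustrative sketch only. Assumes 12-tone equal temperament, where each
# semitone multiplies frequency by 2 ** (1 / 12). This tuning is an assumption
# made for the example, not a detail taken from the research described here.

def interval_ratio(semitones: int) -> float:
    """Return the frequency ratio between two notes `semitones` apart."""
    return 2.0 ** (semitones / 12)

tritone_ratio = interval_ratio(6)   # C up to F#
octave_ratio = interval_ratio(12)   # C up to the next C

print(f"C to F# (tritone): {tritone_ratio:.4f}")
print(f"C to C (octave):   {octave_ratio:.4f}")
```

Under this assumption, the octave is an exact doubling of frequency, while the C-to-F# ratio equals the square root of 2, which cannot be written as a ratio of whole numbers.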

“We’re finding both striking variation in some perceptual traits that many people presumed were common across cultures and listeners, and striking similarities in others,” McDermott says. “The similarities and differences across cultures dissociate aspects of perception that are tightly coupled in Westerners, helping us to parcellate perceptual systems into their underlying components.”