Josh McDermott Archives - Page 2 of 3 - MIT McGovern Institute

Josh McDermott seeks to replicate the human auditory system

Anne Trafton | May 2, 2021May 24, 2023

The human auditory system is a marvel of biology. It can follow a conversation in a noisy restaurant, learn to recognize words from languages we’ve never heard before, and identify a familiar colleague by their footsteps as they walk by our office.

So far, even the most sophisticated computational models cannot perform such tasks as well as the human auditory system, but MIT neuroscientist Josh McDermott hopes to change that. Achieving this goal would be a major step toward developing new ways to help people with hearing loss, says McDermott, who recently earned tenure in MIT’s Department of Brain and Cognitive Sciences.

“Our long-term goal is to build good predictive models of the auditory system,” McDermott says.

“If we were successful in that goal, then it would really transform our ability to make people hear better, because we could design a computer program to figure out what to do to incoming sound to make it easier to recognize what somebody said or where a sound is coming from.”

McDermott’s lab also explores how exposure to different types of music affects people’s music preferences and even how they perceive music. Such studies can help to reveal elements of sound perception that are “hardwired” into our brains, and other elements that are influenced by exposure to different kinds of sounds.

“We have found that there is cross-cultural variation in things that people had widely supposed were universal and possibly even innate,” McDermott says.

Sound perception

As an undergraduate at Harvard University, McDermott originally planned to study math and physics, but “I was very quickly seduced by the brain,” he says. At the time, Harvard did not offer a major in neuroscience, so McDermott created his own, with a focus on vision.

After earning a master’s degree from University College London, he came to MIT to do a PhD in brain and cognitive sciences. His focus was still on vision, which he studied with Ted Adelson, the John and Dorothy Wilson Professor of Vision Science, but he found himself increasingly interested in audition. He had always loved music, and around this time, he started working as a radio and club DJ. “I was spending a lot of time thinking about sound and why things sound the way they do,” he recalls.

To pursue his new interest, he served as a postdoc at the University of Minnesota, where he worked in a lab devoted to psychoacoustics — the study of how humans perceive sound. There, he studied auditory phenomena such as the “cocktail party effect,” or the ability to focus on a particular person’s voice while tuning out background noise. During another postdoc at New York University, he started working on computational models of the auditory system. That interest in computation is part of what drew him back to MIT as a faculty member, in 2013.

“The culture here surrounding brain and cognitive science really prioritizes and values computation, and that was a perspective that was important to me,” says McDermott, who is also a member of MIT’s McGovern Institute for Brain Research and the Center for Brains, Minds and Machines. “I knew that was the kind of work I really wanted to do in my lab, so it just felt like a natural environment for doing that work.”

One aspect of audition that McDermott’s lab focuses on is “auditory scene analysis,” which includes tasks such as inferring what events in the environment caused a particular sound, and determining where a particular sound came from. This requires the ability to disentangle sounds produced by different events or objects, and the ability to tease out the effects of the environment. For instance, a basketball bouncing on a hardwood floor in a gym makes a different sound than a basketball bouncing on an outdoor paved court.

“Sounds in the world have very particular properties, due to physics and the way that the world works,” McDermott says. “We believe that the brain internalizes those regularities, and you have models in your head of the way that sound is generated. When you hear something, you are performing an inference in that model to figure out what is likely to have happened that caused the sound.”

A better understanding of how the brain does this may eventually lead to new strategies to enhance human hearing, McDermott says.

“Hearing impairment is the most common sensory disorder. It affects almost everybody as they get older, and the treatments are OK, but they’re not great,” he says. “We’re eventually going to all have personalized hearing aids that we walk around with, and we just need to develop the right algorithms in order to tell them what to do. That’s something we’re actively working on.”

Music in the brain

About 10 years ago, when McDermott was a postdoc, he started working on cross-cultural studies of how the human brain perceives music. Richard Godoy, an anthropologist at Brandeis University, asked McDermott to join him for some studies of the Tsimane’ people, who live in the Amazon rainforest. Since then, McDermott and some of his students have gone to Bolivia most summers to study sound perception among the Tsimane’. The Tsimane’ have had very little exposure to Western music, making them ideal subjects to study how listening to certain kinds of music influences human sound perception.

These studies have revealed both differences and similarities between Westerners and the Tsimane’ people. McDermott, who counts soul, disco, and jazz-funk among his favorite types of music, has found that Westerners and the Tsimane’ differ in their perceptions of dissonance. To Western ears, for example, the chord of C and F# sounds very unpleasant, but not to the Tsimane’.

He has also shown that that people in Western society perceive sounds that are separated by an octave to be similar, but the Tsimane’ do not. However, there are also some similarities between the two groups. For example, the upper limit of frequencies that can be perceived appears to be the same regardless of music exposure.

“We’re finding both striking variation in some perceptual traits that many people presumed were common across cultures and listeners, and striking similarities in others,” McDermott says. “The similarities and differences across cultures dissociate aspects of perception that are tightly coupled in Westerners, helping us to parcellate perceptual systems into their underlying components.”

Nine MIT School of Science professors receive tenure for 2020

by School of Science | July 7, 2020May 24, 2023

Beginning July 1, nine faculty members in the MIT School of Science have been granted tenure by MIT. They are appointed in the departments of Brain and Cognitive Sciences, Chemistry, Mathematics, and Physics.

Physicist Ibrahim Cisse investigates living cells to reveal and study collective behaviors and biomolecular phase transitions at the resolution of single molecules. The results of his work help determine how disruptions in genes can cause diseases like cancer. Cisse joined the Department of Physics in 2014 and now holds a joint appointment with the Department of Biology. His education includes a bachelor’s degree in physics from North Carolina Central University, concluded in 2004, and a doctoral degree in physics from the University of Illinois at Urbana-Champaign, achieved in 2009. He followed his PhD with a postdoc at the École Normale Supérieure of Paris and a research specialist appointment at the Howard Hughes Medical Institute’s Janelia Research Campus.

Jörn Dunkel is a physical applied mathematician. His research focuses on the mathematical description of complex nonlinear phenomena in a variety of fields, especially biophysics. The models he develops help predict dynamical behaviors and structure formation processes in developmental biology, fluid dynamics, and even knot strengths for sailing, rock climbing and construction. He joined the Department of Mathematics in 2013 after completing postdoctoral appointments at Oxford University and Cambridge University. He received diplomas in physics and mathematics from Humboldt University of Berlin in 2004 and 2005, respectively. The University of Augsburg awarded Dunkel a PhD in statistical physics in 2008.

A cognitive neuroscientist, Mehrdad Jazayeri studies the neurobiological underpinnings of mental functions such as planning, inference, and learning by analyzing brain signals in the lab and using theoretical and computational models, including artificial neural networks. He joined the Department of Brain and Cognitive Sciences in 2013. He achieved a BS in electrical engineering from the Sharif University of Technology in 1994, an MS in physiology at the University of Toronto in 2001, and a PhD in neuroscience from New York University in 2007. Prior to joining MIT, he was a postdoc at the University of Washington. Jazayeri is also an investigator at the McGovern Institute for Brain Research.

Yen-Jie Lee is an experimental particle physicist in the field of proton-proton and heavy-ion physics. Utilizing the Large Hadron Colliders, Lee explores matter in extreme conditions, providing new insight into strong interactions and what might have existed and occurred at the beginning of the universe and in distant star cores. His work on jets and heavy flavor particle production in nuclei collisions improves understanding of the quark-gluon plasma, predicted by quantum chromodynamics (QCD) calculations, and the structure of heavy nuclei. He also pioneered studies of high-density QCD with electron-position annihilation data. Lee joined the Department of Physics in 2013 after a fellowship at CERN and postdoc research at the Laboratory for Nuclear Science at MIT. His bachelor’s and master’s degrees were awarded by the National Taiwan University in 2002 and 2004, respectively, and his doctoral degree by MIT in 2011. Lee is a member of the Laboratory for Nuclear Science.

Josh McDermott investigates the sense of hearing. His research addresses both human and machine audition using tools from experimental psychology, engineering, and neuroscience. McDermott hopes to better understand the neural computation underlying human hearing, to improve devices to assist hearing impaired, and to enhance machine interpretation of sounds. Prior to joining MIT’s Department of Brain and Cognitive Sciences, he was awarded a BA in 1998 in brain and cognitive sciences by Harvard University, a master’s degree in computational neuroscience in 2000 by University College London, and a PhD in brain and cognitive sciences in 2006 by MIT. Between his doctoral time at MIT and returning as a faculty member, he was a postdoc at the University of Minnesota and New York University, and a visiting scientist at Oxford University. McDermott is also an associate investigator at the McGovern Institute for Brain Research and an investigator in the Center for Brains, Minds and Machines.

Solving environmental challenges by studying and manipulating chemical reactions is the focus of Yogesh Surendranath’s research. Using chemistry, he works at the molecular level to understand how to efficiently interconvert chemical and electrical energy. His fundamental studies aim to improve energy storage technologies, such as batteries, fuel cells, and electrolyzers, that can be used to meet future energy demand with reduced carbon emissions. Surendranath joined the Department of Chemistry in 2013 after a postdoc at the University of California at Berkeley. His PhD was completed in 2011 at MIT, and BS in 2006 at the University of Virginia. Suendranath is also a collaborator in the MIT Energy Initiative.

A theoretical astrophysicist, Mark Vogelsberger is interested in large-scale structures of the universe, such as galaxy formation. He combines observational data, theoretical models, and simulations that require high-performance supercomputers to improve and develop detailed models that simulate galaxy diversity, clustering, and their properties, including a plethora of physical effects like magnetic fields, cosmic dust, and thermal conduction. Vogelsberger also uses simulations to generate scenarios involving alternative forms of dark matter. He joined the Department of Physics in 2014 after a postdoc at the Harvard-Smithsonian Center for Astrophysics. Vogelsberger is a 2006 graduate of the University of Mainz undergraduate program in physics, and a 2010 doctoral graduate of the University of Munich and the Max Plank Institute for Astrophysics. He is also a principal investigator in the MIT Kavli Institute for Astrophysics and Space Research.

Adam Willard is a theoretical chemist with research interests that fall across molecular biology, renewable energy, and material science. He uses theory, modeling, and molecular simulation to study the disorder that is inherent to systems over nanometer-length scales. His recent work has highlighted the fundamental and unexpected role that such disorder plays in phenomena such as microscopic energy transport in semiconducting plastics, ion transport in batteries, and protein hydration. Joining the Department of Chemistry in 2013, Willard was formerly a postdoc at Lawrence Berkeley National Laboratory and then the University of Texas at Austin. He holds a PhD in chemistry from the University of California at Berkeley, achieved in 2009, and a BS in chemistry and mathematics from the University of Puget Sound, granted in 2003.

Lindley Winslow seeks to understand the fundamental particles shaped the evolution of our universe. As an experimental particle and nuclear physicist, she develops novel detection technology to search for axion dark matter and a proposed nuclear decay that makes more matter than antimatter. She started her faculty position in the Department of Physics in 2015 following a postdoc at MIT and a subsequent faculty position at the University of California at Los Angeles. Winslow achieved her BA in physics and astronomy in 2001 and PhD in physics in 2008, both at the University of California at Berkeley. She is also a member of the Laboratory for Nuclear Science.

Universal musical harmony

by Sabbi Lall | June 16, 2020May 24, 2023

Many forms of Western music make use of harmony, or the sound created by certain pairs of notes. A longstanding question is why some combinations of notes are perceived as pleasant while others sound jarring to the ear. Are the combinations we favor a universal phenomenon? Or are they specific to Western culture?

Through intrepid research trips to the remote Bolivian rainforest, the McDermott lab at the McGovern Institute has found that aspects of the perception of note combinations may be universal, even though the aesthetic evaluation of note combination as pleasant or unpleasant is culture-specific.

“Our work has suggested some universal features of perception that may shape musical behavior around the world,” says McGovern Associate Investigator Josh McDermott, senior author of the Nature Communications study. “But it also indicates the rich interplay with cultural influences that give rise to the experience of music.”

Remote learning

Questions about the universality of musical perception are difficult to answer, in part because of the challenge in finding people with little exposure to Western music. McDermott, who is also an associate professor in MIT’s Department of Brain and Cognitive Sciences and an investigator in the Center for Brains Minds and Machines, has found a way to address this problem. His lab has performed a series of studies with the participation of an indigenous population, the Tsimane’, who live in relative isolation from Western culture and have had little exposure to Western music. Accessing the Tsimane’ villages is challenging, as they are scattered throughout the rainforest and only reachable during the dry part of the year.

Left to right Josh McDermott (in vehicle), Alex Durango, Sophie Dolan and Malinda McPherson experiencing a travel delay en route to a Tsimane’ village after a heavy rainfall. Photo: Malinda McPherson

“When we enter a village there is always a crowd of curious children to greet us,” says Malinda McPherson, a graduate student in the lab and lead author of the study. “Tsimane’ are friendly and welcoming, and we have visited some villages several times, so now many people recognize us.”

In a study published in 2019, McDermott’s team found evidence that the brain’s ability to detect musical octaves is not universal, but is gained through cultural experience. And in 2016 they published findings suggesting that the preference for consonance over dissonance is culture-specific. In their new study, the team decided to explore whether aspects of the perception of consonance and dissonance might nonetheless be universally present across cultures.

Music lessons

In Western music, harmony is the sound of two or more notes heard simultaneously. Think of the Leonard Cohen song, Hallelujah, where he sings about harmony (“the fourth, the fifth, the minor fall and the major lift”). A combination of two notes is called an interval, and intervals that are perceived to be the most pleasant (or consonant, like the fourth and the fifth, for example) to the Western ear are generally represented by smaller integer ratios.

Intervals that are related by low integer ratios have fascinated scientists for centuries.

“Such intervals are central to Western music, but are also believed to be a common feature of many musical systems around the world,” McPherson explains. “So intervals are a natural target for cross-cultural research, which can help identify aspects of perception that are and aren’t independent of cultural experience.”

Scientists have been drawn to low integer ratios in music in part because they relate to the frequencies in voices and many instruments, known as ‘overtones’. Overtones from sounds like voices form a particular pattern known as the harmonic series. As it happens, the combination of two concurrent notes related by a low integer ratio partially reproduces this pattern. Because the brain presumably evolved to represent natural sounds, such as voices, it has seemed plausible that intervals with low integer ratios might have special perceptual status across cultures.

Since the Tsimane’ do not generally sing or play music together, meaning they have not been trained to hear or sing in harmony, McPherson and her colleagues were presented with a unique opportunity to explore whether there is anything universal about the perception of musical intervals.

Taking notes

In order to probe the perception of musical intervals, McDermott and colleagues took advantage of the fact that ears accustomed to Western musical harmony often have difficulty picking apart two “consonant” notes when they are played at the same time. This auditory confusion is known as “fusion” in the field. By contrast, two “dissonant” notes are easier to hear as separate.

The tendency of “consonant” notes to be heard by Westerners as fused could reflect their common occurrence in Western music. But it could also be driven by the resemblance of low-integer-ratio note combinations to the harmonic series. This similarity of consonant intervals to the acoustic structure of typical natural sounds raises the possibility that the human brain is biologically tuned to “fuse” consonant notes.

Graduate student and lead author, Malinda McPherson, works with a participant and translator in the field. Photo: Malinda McPherson

To explore this question, the team ran identical sets of experiments on two participant groups: US non-musicians residing in the Boston metropolitan area and Tsimane’ residing in villages in the Amazon rain forest. Listeners heard two concurrent notes separated by a particular musical interval (consonant or dissonant), and were asked to judge whether they heard one or two sounds. The experiment was performed with both synthetic and natural sounds.

They found that like the Boston cohort, the Tsimane’ were more likely to mistake two notes as a single sound if they were consonant than if they were dissonant.

“I was surprised by how similar some of the results in Tsimane’ participants were to those in US participants,” says McPherson, “particularly given the striking differences that we consistently see in preferences for musical intervals.”

When it came to whether consonant intervals were more pleasant than dissonant intervals, the results told a very different story. While the US study participants found consonant intervals more pleasant than dissonant intervals, the Tsimane’ showed no preference, implying that our sense of what is pleasant is shaped by culture.

“The fusion results provide an example of a perceptual effect that could influence musical systems, for instance by creating a natural perceptual contrast to exploit,” explains McDermott. “Hopefully our work helps to show how one can conduct rigorous perceptual experiments in the field and learn things that would be hidden if we didn’t consider populations in other parts of the world.”

Differences between deep neural networks and human perception

by Kenneth Blum | December 12, 2019May 24, 2023

When your mother calls your name, you know it’s her voice — no matter the volume, even over a poor cell phone connection. And when you see her face, you know it’s hers — if she is far away, if the lighting is poor, or if you are on a bad FaceTime call. This robustness to variation is a hallmark of human perception. On the other hand, we are susceptible to illusions: We might fail to distinguish between sounds or images that are, in fact, different. Scientists have explained many of these illusions, but we lack a full understanding of the invariances in our auditory and visual systems.

Deep neural networks also have performed speech recognition and image classification tasks with impressive robustness to variations in the auditory or visual stimuli. But are the invariances learned by these models similar to the invariances learned by human perceptual systems? A group of MIT researchers has discovered that they are different. They presented their findings yesterday at the 2019 Conference on Neural Information Processing Systems.

The researchers made a novel generalization of a classical concept: “metamers” — physically distinct stimuli that generate the same perceptual effect. The most famous examples of metamer stimuli arise because most people have three different types of cones in their retinae, which are responsible for color vision. The perceived color of any single wavelength of light can be matched exactly by a particular combination of three lights of different colors — for example, red, green, and blue lights. Nineteenth-century scientists inferred from this observation that humans have three different types of bright-light detectors in our eyes. This is the basis for electronic color displays on all of the screens we stare at every day. Another example in the visual system is that when we fix our gaze on an object, we may perceive surrounding visual scenes that differ at the periphery as identical. In the auditory domain, something analogous can be observed. For example, the “textural” sound of two swarms of insects might be indistinguishable, despite differing in the acoustic details that compose them, because they have similar aggregate statistical properties. In each case, the metamers provide insight into the mechanisms of perception, and constrain models of the human visual or auditory systems.

In the current work, the researchers randomly chose natural images and sound clips of spoken words from standard databases, and then synthesized sounds and images so that deep neural networks would sort them into the same classes as their natural counterparts. That is, they generated physically distinct stimuli that are classified identically by models, rather than by humans. This is a new way to think about metamers, generalizing the concept to swap the role of computer models for human perceivers. They therefore called these synthesized stimuli “model metamers” of the paired natural stimuli. The researchers then tested whether humans could identify the words and images.

“Participants heard a short segment of speech and had to identify from a list of words which word was in the middle of the clip. For the natural audio this task is easy, but for many of the model metamers humans had a hard time recognizing the sound,” explains first-author Jenelle Feather, a graduate student in the MIT Department of Brain and Cognitive Sciences (BCS) and a member of the Center for Brains, Minds, and Machines (CBMM). That is, humans would not put the synthetic stimuli in the same class as the spoken word “bird” or the image of a bird. In fact, model metamers generated to match the responses of the deepest layers of the model were generally unrecognizable as words or images by human subjects.

Josh McDermott, associate professor in BCS and investigator in CBMM, makes the following case: “The basic logic is that if we have a good model of human perception, say of speech recognition, then if we pick two sounds that the model says are the same and present these two sounds to a human listener, that human should also say that the two sounds are the same. If the human listener instead perceives the stimuli to be different, this is a clear indication that the representations in our model do not match those of human perception.”

Joining Feather and McDermott on the paper are Alex Durango, a post-baccalaureate student, and Ray Gonzalez, a research assistant, both in BCS.

There is another type of failure of deep networks that has received a lot of attention in the media: adversarial examples (see, for example, “Why did my classifier just mistake a turtle for a rifle?“). These are stimuli that appear similar to humans but are misclassified by a model network (by design — they are constructed to be misclassified). They are complementary to the stimuli generated by Feather’s group, which sound or appear different to humans but are designed to be co-classified by the model network. The vulnerabilities of model networks exposed to adversarial attacks are well-known — face-recognition software might mistake identities; automated vehicles might not recognize pedestrians.

The importance of this work lies in improving models of perception beyond deep networks. Although the standard adversarial examples indicate differences between deep networks and human perceptual systems, the new stimuli generated by the McDermott group arguably represent a more fundamental model failure — they show that generic examples of stimuli classified as the same by a deep network produce wildly different percepts for humans.

The team also figured out ways to modify the model networks to yield metamers that were more plausible sounds and images to humans. As McDermott says, “This gives us hope that we may be able to eventually develop models that pass the metamer test and better capture human invariances.”

“Model metamers demonstrate a significant failure of present-day neural networks to match the invariances in the human visual and auditory systems,” says Feather, “We hope that this work will provide a useful behavioral measuring stick to improve model representations and create better models of human sensory systems.”

Brain science in the Bolivian rainforest

by Julie Pryor | December 10, 2019May 24, 2023

Graduate student Malinda McPherson. Photo: Caitlin Cunningham

Malinda McPherson is a graduate student in Josh McDermott‘s lab, studying how people hear pitch (how high or low a sound is) in both speech and music.

To test the extent to which human audition varies across cultures, McPherson travels with the McDermott lab to Bolivia to study the Tsimane’ — a native Amazonian society with minimal exposure to Western culture.

Their most recent study, published in the journal Current Biology, found a striking variation in perception of musical pitch across cultures.

In this Q&A, we ask McPherson what motivates her research and to describe some of the challenges she has experienced working in the Bolivian rainforest.

What are you working on now?

Right now, I’m particularly excited about a project that involves working with children; we are trying to better understand how the ability to hear pitch develops with age and experience. Difficulty hearing pitch is one of the first issues that most people with poor or corrected hearing find discouraging, so in addition to simply being an interesting basic component of audition, understanding how pitch perception develops may be useful in engineering assistive hearing devices.

How has your personal background inspired your research?

I’ve been an avid violist for over twenty years and still perform with the Chamber Music Society at MIT. When I was an undergraduate and deciding between a career as a professional musician and a career in science, I found a way to merge the two by working as a research assistant in a lab studying musical creativity. I worked in that lab for three years and was completely hooked. My musical training has definitely helped me design a few experiments!

What was your most challenging experience in Bolivia? Most rewarding?

The most challenging aspect of our fieldwork in Bolivia is sustaining our intensity over a period of 4-5 weeks. Every moment is precious, and the pace of work is both exhilarating and exhausting. Despite the long hours of work and travel (by canoe or by truck over very bumpy roads), it is an incredible privilege to meet with and to learn from the Tsimane’. I’ve been picking up some Tsimane’ phrases from the translators with whom we work, and can now have basic conversations with participants and make kids laugh, so that’s a lot of fun. A few children I met my first year greeted me by name when we went back this past year. That was a very special moment!

Translator Manuel Roca Moye (left) with Malinda McPherson and Josh McDermott in a fully loaded canoe. Photo: McDermott lab

What single scientific question do you hope to answer?

I’d be curious to figure out the overlaps and distinctions between how we perceive music versus speech, but I think one of the best aspects of science is that many of the important future questions haven’t been thought of yet!

Perception of musical pitch varies across cultures

Anne Trafton | September 19, 2019May 24, 2023

Press Mentions

Pitch perfect? How culture shapes the way you hear music

Christian Science Monitor

Remote Bolivian tribe does not hear pitch the same way we do

Forbes

People who are accustomed to listening to Western music, which is based on a system of notes organized in octaves, can usually perceive the similarity between notes that are same but played in different registers — say, high C and middle C. However, a longstanding question is whether this a universal phenomenon or one that has been ingrained by musical exposure.

This question has been hard to answer, in part because of the difficulty in finding people who have not been exposed to Western music. Now, a new study led by researchers from MIT and the Max Planck Institute for Empirical Aesthetics has found that unlike residents of the United States, people living in a remote area of the Bolivian rainforest usually do not perceive the similarities between two versions of the same note played at different registers (high or low).

“We’re finding that … there seems to be really striking variation in things that a lot of people would have presumed would be common across cultures and listeners,” says McDermott.

The findings suggest that although there is a natural mathematical relationship between the frequencies of every “C,” no matter what octave it’s played in, the brain only becomes attuned to those similarities after hearing music based on octaves, says Josh McDermott, an associate professor in MIT’s Department of Brain and Cognitive Sciences.

“It may well be that there is a biological predisposition to favor octave relationships, but it doesn’t seem to be realized unless you are exposed to music in an octave-based system,” says McDermott, who is also a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds and Machines.

The study also found that members of the Bolivian tribe, known as the Tsimane’, and Westerners do have a very similar upper limit on the frequency of notes that they can accurately distinguish, suggesting that that aspect of pitch perception may be independent of musical experience and biologically determined.

McDermott is the senior author of the study, which appears in the journal Current Biology on Sept. 19. Nori Jacoby, a former MIT postdoc who is now a group leader at the Max Planck Institute for Empirical Aesthetics, is the paper’s lead author. Other authors are Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile; Malinda McPherson, a graduate student in the Harvard/MIT Program in Speech and Hearing Bioscience and Technology; Joaquin Valdes, a graduate student at the Pontifical Catholic University of Chile; and Tomas Ossandon, an assistant professor at the Pontifical Catholic University of Chile.

Octaves apart

Cross-cultural studies of how music is perceived can shed light on the interplay between biological constraints and cultural influences that shape human perception. McDermott’s lab has performed several such studies with the participation of Tsimane’ tribe members, who live in relative isolation from Western culture and have had little exposure to Western music.

In a study published in 2016, McDermott and his colleagues found that Westerners and Tsimane’ had different aesthetic reactions to chords, or combinations of notes. To Western ears, the combination of C and F# is very grating, but Tsimane’ listeners rated this chord just as likeable as other chords that Westerners would interpret as more pleasant, such as C and G.

Later, Jacoby and McDermott found that both Westerners and Tsimane’ are drawn to musical rhythms composed of simple integer ratios, but the ratios they favor are different, based on which rhythms are more common in the music they listen to.

In their new study, the researchers studied pitch perception using an experimental design in which they play a very simple tune, only two or three notes, and then ask the listener to sing it back. The notes that were played could come from any octave within the range of human hearing, but listeners sang their responses within their vocal range, usually restricted to a single octave.

Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile, runs a musical pitch perception experiment with a member of the Tsimane’ tribe of the Bolivian rainforest. Photo: Josh McDermott

Western listeners, especially those who were trained musicians, tended to reproduce the tune an exact number of octaves above or below what they heard, though they were not specifically instructed to do so. In Western music, the pitch of the same note doubles with each ascending octave, so tones with frequencies of 27.5 hertz, 55 hertz, 110 hertz, 220 hertz, and so on, are all heard as the note A.

Western listeners in the study, all of whom lived in New York or Boston, accurately reproduced sequences such as A-C-A, but in a different register, as though they hear the similarity of notes separated by octaves. However, the Tsimane’ did not.

“The relative pitch was preserved (between notes in the series), but the absolute pitch produced by the Tsimane’ didn’t have any relationship to the absolute pitch of the stimulus,” Jacoby says. “That’s consistent with the idea that perceptual similarity is something that we acquire from exposure to Western music, where the octave is structurally very important.”

The ability to reproduce the same note in different octaves may be honed by singing along with others whose natural registers are different, or singing along with an instrument being played in a different pitch range, Jacoby says.

Limits of perception

The study findings also shed light on the upper limits of pitch perception for humans. It has been known for a long time that Western listeners cannot accurately distinguish pitches above about 4,000 hertz, although they can still hear frequencies up to nearly 20,000 hertz. In a traditional 88-key piano, the highest note is about 4,100 hertz.

People have speculated that the piano was designed to go only that high because of a fundamental limit on pitch perception, but McDermott thought it could be possible that the opposite was true: That is, the limit was culturally influenced by the fact that few musical instruments produce frequencies higher than 4,000 hertz.

The researchers found that although Tsimane’ musical instruments usually have upper limits much lower than 4,000 hertz, Tsimane’ listeners could distinguish pitches very well up to about 4,000 hertz, as evidenced by accurate sung reproductions of those pitch intervals. Above that threshold, their perceptions broke down, very similarly to Western listeners.

“It looks almost exactly the same across groups, so we have some evidence for biological constraints on the limits of pitch,” Jacoby says.

One possible explanation for this limit is that once frequencies reach about 4,000 hertz, the firing rates of the neurons of our inner ear can’t keep up and we lose a critical cue with which to distinguish different frequencies.

“The new study contributes to the age-long debate about the interplays between culture and biological constraints in music,” says Daniel Pressnitzer, a senior research scientist at Paris Descartes University, who was not involved in the research. “This unique, precious, and extensive dataset demonstrates both striking similarities and unexpected differences in how Tsimane’ and Western listeners perceive or conceive musical pitch.”

Jacoby and McDermott now hope to expand their cross-cultural studies to other groups who have had little exposure to Western music, and to perform more detailed studies of pitch perception among the Tsimane’.

Such studies have already shown the value of including research participants other than the Western-educated, relatively wealthy college undergraduates who are the subjects of most academic studies on perception, McDermott says. These broader studies allow researchers to tease out different elements of perception that cannot be seen when examining only a single, homogenous group.

“We’re finding that there are some cross-cultural similarities, but there also seems to be really striking variation in things that a lot of people would have presumed would be common across cultures and listeners,” McDermott says. “These differences in experience can lead to dissociations of different aspects of perception, giving you clues to what the parts of the perceptual system are.”

The research was funded by the James S. McDonnell Foundation, the National Institutes of Health, and the Presidential Scholar in Society and Neuroscience Program at Columbia University.

Hearing through the clatter

by Sabbi Lall | September 2, 2019May 24, 2023

In a busy coffee shop, our eardrums are inundated with sound waves – people chatting, the clatter of cups, music playing – yet our brains somehow manage to untangle relevant sounds, like a barista announcing that our “coffee is ready,” from insignificant noise. A new McGovern Institute study sheds light on how the brain accomplishes the task of extracting meaningful sounds from background noise – findings that could one day help to build artificial hearing systems and aid development of targeted hearing prosthetics.

“These findings reveal a neural correlate of our ability to listen in noise, and at the same time demonstrate functional differentiation between different stages of auditory processing in the cortex,” explains Josh McDermott, an associate professor of brain and cognitive sciences at MIT, a member of the McGovern Institute and the Center for Brains, Minds and Machines, and the senior author of the study.

The auditory cortex, a part of the brain that responds to sound, has long been known to have distinct anatomical subregions, but the role these areas play in auditory processing has remained a mystery. In their study published today in Nature Communications, McDermott and former graduate student Alex Kell, discovered that these subregions respond differently to the presence of background noise, suggesting that auditory processing occurs in steps that progressively hone in on and isolate a sound of interest.

Background check

Previous studies have shown that the primary and non-primary subregions of the auditory cortex respond to sound with different dynamics, but these studies were largely based on brain activity in response to speech or simple synthetic sounds (such as tones and clicks). Little was known about how these regions might work to subserve everyday auditory behavior.

To test these subregions under more realistic conditions, McDermott and Kell, who is now a postdoctoral researcher at Columbia University, assessed changes in human brain activity while subjects listened to natural sounds with and without background noise.

While lying in an MRI scanner, subjects listened to 30 different natural sounds, ranging from meowing cats to ringing phones, that were presented alone or embedded in real-world background noise such as heavy rain.

“When I started studying audition,” explains Kell, “I started just sitting around in my day-to-day life, just listening, and was astonished at the constant background noise that seemed to usually be filtered out by default. Most of these noises tended to be pretty stable over time, suggesting we could experimentally separate them. The project flowed from there.”

To their surprise, Kell and McDermott found that the primary and non-primary regions of the auditory cortex responded differently to natural sound depending upon whether background noise was present.

brain regions responding to sound — Primary auditory cortex (outlined in white) responses change (blue) when background noise is present, whereas non-primary activity is robust to background noise (yellow). Image: Alex Kell

They found that activity of the primary auditory cortex was altered when background noise is present, suggesting that this region has not yet differentiated between meaningful sounds and background noise. Non-primary regions, however, respond similarly to natural sounds irrespective of whether noise is present, suggesting that cortical signals generated by sound are transformed or “cleaned up” to remove background noise by the time they reach the non-primary auditory cortex.

“We were surprised by how big the difference was between primary and non-primary areas,” explained Kell, “so we ran a bunch more subjects but kept seeing the same thing. We had a ton of questions about what might be responsible for this difference, and that’s why we ended up running all these follow-up experiments.”

A general principle

Kell and McDermott went on to test whether these responses were specific to particular sounds, and discovered that the above effect remained stable no matter the source or type of sound activity. Music, speech, or a squeaky toy, all activated the non-primary cortex region similarly, whether or not background noise was present.

The authors also tested whether attention is relevant. Even when the researchers sneakily distracted subjects with a visual task in the scanner, the cortical subregions responded to meaningful sound and background noise in the same way, showing that attention is not driving this aspect of sound processing. In other words, even when we are focused on reading a book, our brain is diligently sorting the sound of our meowing cat from the patter of heavy rain outside.

Future directions

The McDermott lab is now building computational models of the so-called “noise robustness” found in the Nature Communications study and Kell is pursuing a finer-grained understanding of sound processing in his postdoctoral work at Columbia, by exploring the neural circuit mechanisms underlying this phenomenon.

By gaining a deeper understanding of how the brain processes sound, the researchers hope their work will contribute to improve diagnoses and treatment of hearing dysfunction. Such research could help to reveal the origins of listening difficulties that accompany developmental disorders or age-related hearing loss. For instance, if hearing loss results from dysfunction in sensory processing, this could be caused by abnormal noise robustness in the auditory cortex. Normal noise robustness might instead suggest that there are impairments elsewhere in the brain, for example a break down in higher executive function.

“In the future,” McDermott says, “we hope these noninvasive measures of auditory function may become valuable tools for clinical assessment.”

Our brains appear uniquely tuned for musical pitch

by National Institutes of Health | June 11, 2019May 24, 2023

In the eternal search for understanding what makes us human, scientists found that our brains are more sensitive to pitch, the harmonic sounds we hear when listening to music, than our evolutionary relative the macaque monkey. The study, funded in part by the National Institutes of Health, highlights the promise of Sound Health, a joint project between the NIH and the John F. Kennedy Center for the Performing Arts, in association with the National Endowment for the Arts, that aims to understand the role of music in health.

“We found that a certain region of our brains has a stronger preference for sounds with pitch than macaque monkey brains,” said Bevil Conway, Ph.D., investigator in the NIH’s Intramural Research Program and a senior author of the study published in Nature Neuroscience. “The results raise the possibility that these sounds, which are embedded in speech and music, may have shaped the basic organization of the human brain.”

The study started with a friendly bet between Dr. Conway and Sam Norman-Haignere, Ph.D., a post-doctoral fellow at Columbia University’s Zuckerman Institute for Mind, Brain, and Behavior and the first author of the paper.

At the time, both were working at the Massachusetts Institute of Technology (MIT). Dr. Conway’s team had been searching for differences between how human and monkey brains control vision only to discover that there are very few. Their brain mapping studies suggested that humans and monkeys see the world in very similar ways. But then, Dr. Conway heard about some studies on hearing being done by Dr. Norman-Haignere, who, at the time, was a post-doctoral fellow in the laboratory of Josh H. McDermott, Ph.D., associate professor at MIT.

“I told Bevil that we had a method for reliably identifying a region in the human brain that selectively responds to sounds with pitch,” said Dr. Norman-Haignere, That is when they got the idea to compare humans with monkeys. Based on his studies, Dr. Conway bet that they would see no differences.

To test this, the researchers played a series of harmonic sounds, or tones, to healthy volunteers and monkeys. Meanwhile, functional magnetic resonance imaging (fMRI) was used to monitor brain activity in response to the sounds. The researchers also monitored brain activity in response to sounds of toneless noises that were designed to match the frequency levels of each tone played.

At first glance, the scans looked similar and confirmed previous studies. Maps of the auditory cortex of human and monkey brains had similar hot spots of activity regardless of whether the sounds contained tones.

However, when the researchers looked more closely at the data, they found evidence suggesting the human brain was highly sensitive to tones. The human auditory cortex was much more responsive than the monkey cortex when they looked at the relative activity between tones and equivalent noisy sounds.

“We found that human and monkey brains had very similar responses to sounds in any given frequency range. It’s when we added tonal structure to the sounds that some of these same regions of the human brain became more responsive,” said Dr. Conway. “These results suggest the macaque monkey may experience music and other sounds differently. In contrast, the macaque’s experience of the visual world is probably very similar to our own. It makes one wonder what kind of sounds our evolutionary ancestors experienced.”

Further experiments supported these results. Slightly raising the volume of the tonal sounds had little effect on the tone sensitivity observed in the brains of two monkeys.

Finally, the researchers saw similar results when they used sounds that contained more natural harmonies for monkeys by playing recordings of macaque calls. Brain scans showed that the human auditory cortex was much more responsive than the monkey cortex when they compared relative activity between the calls and toneless, noisy versions of the calls.

“This finding suggests that speech and music may have fundamentally changed the way our brain processes pitch,” said Dr. Conway. “It may also help explain why it has been so hard for scientists to train monkeys to perform auditory tasks that humans find relatively effortless.”

Earlier this year, other scientists from around the U.S. applied for the first round of NIH Sound Health research grants. Some of these grants may eventually support scientists who plan to explore how music turns on the circuitry of the auditory cortex that make our brains sensitive to musical pitch.

This study was supported by the NINDS, NEI, NIMH, and NIA Intramural Research Programs and grants from the NIH (EY13455; EY023322; EB015896; RR021110), the National Science Foundation (Grant 1353571; CCF-1231216), the McDonnell Foundation, the Howard Hughes Medical Institute.

Josh McDermott

The Science of Hearing

Hearing enables us to make sense of our whereabouts, understand the emotional state of others, and enjoy musical experiences. Acoustic information relays vital cues about the world—yet much of the sophisticated brain system that decodes this information is poorly understood.

Josh McDermott’s research is in search of foundational principles of sound perception. Groundbreaking discoveries from the McDermott lab have clarified how people hear and process sounds. His research informs new treatments for those with hearing loss, and paves the way for machine systems that emulate the human ability to recognize and interpret sound. McDermott’s lab has also pioneered new approaches for understanding music perception. His lab deconstructs the neural ensembles that allow us to appreciate music, while also studying the often striking variation that can occur across cultures.

Biography

Josh McDermott started college intending to study physics and math, but was soon seduced by the mysteries of the brain – in particular, its stunning ability to solve ill-posed perceptual problems. His early training was in vision, but after earning a BA in brain and cognitive sciences at Harvard University, he moved to London to study at the newly formed Gatsby Unit, where he completed an MPhil in Computational Neuroscience.

McDermott returned to the US for a PhD in brain and cognitive sciences at MIT, where he became interested in sound and hearing. He eventually transitioned into auditory research, with postdoctoral training in psychoacoustics at the University of Minnesota and in computational neuroscience at NYU.

In 2013, he joined the Department of Brain and Cognitive Sciences at MIT as an assistant professor and now serves as associate department head. McDermott joined the McGovern Institute as an associate investigator in 2018. He is the recipient of a Marshall Scholarship, a James S. McDonnell Foundation Scholar Award, and an NSF CAREER Award.

Honors and Awards

Awards

2023 – Award for Diversity, Equity, Inclusion and Justice, Department of Brain and Cognitive Sciences, MIT
2018 – Excellence in Undergraduate Advising, Department of Brain and Cognitive Sciences, MIT
2018 – Troland Research Award, National Academy of Sciences
2017 – APAN Young Investigator Award
2015 – CAREER Award, National Science Foundation
2015 – Fred & Carole Middleton Career Development Professorship, MIT
2014 – Excellence in Undergraduate Advising, Department of Brain and Cognitive Sciences, MIT
2012 – Scholar Award in Understanding Human Cognition, James S. McDonnell Foundation

Virtual Tour of McDermott Lab

Yanny or Laurel?

Anne Trafton | May 17, 2018May 24, 2023

“Yanny” or “Laurel?” Discussion around this auditory version of “The Dress” has divided the internet this week.

In this video, brain and cognitive science PhD students Dana Boebinger and Kevin Sitek, both members of the McGovern Institute, unpack the science — and settle the debate. The upshot? Our brain is faced with a myriad of sensory cues that it must process and make sense of simultaneously. Hearing is no exception, and two brains can sometimes “translate” soundwaves in very different ways.

Faculty

Principal Research Scientists

Focus Areas

Disorder Areas

Researcher: Josh McDermott