A visual pathway in the brain may do more than recognize objects

When visual information enters the brain, it travels through two pathways that process different aspects of the input. For decades, scientists have hypothesized that one of these pathways, the ventral visual stream, is responsible for recognizing objects, and that it might have been optimized by evolution to do just that.

Consistent with this, in the past decade, MIT scientists have found that when computational models of the anatomy of the ventral stream are optimized to solve the task of object recognition, they are remarkably good predictors of the neural activities in the ventral stream.

However, in a new study, MIT researchers have shown that when they train these types of models on spatial tasks instead, the resulting models are also quite good predictors of the ventral stream’s neural activities. This suggests that the ventral stream may not be exclusively optimized for object recognition.

“This leaves wide open the question about what the ventral stream is being optimized for. I think the dominant perspective a lot of people in our field believe is that the ventral stream is optimized for object recognition, but this study provides a new perspective that the ventral stream could be optimized for spatial tasks as well,” says MIT graduate student Yudi Xie.

Xie is the lead author of the study, which will be presented at the International Conference on Learning Representations. Other authors of the paper include Weichen Huang, a visiting student through MIT’s Research Science Institute program; Esther Alter, a software engineer at the MIT Quest for Intelligence; Jeremy Schwartz, a sponsored research technical staff member; Joshua Tenenbaum, a professor of brain and cognitive sciences; and James DiCarlo, the Peter de Florez Professor of Brain and Cognitive Sciences, director of the Quest for Intelligence, and a member of the McGovern Institute for Brain Research at MIT.

Beyond object recognition

When we look at an object, our visual system can not only identify the object, but also determine other features such as its location, its distance from us, and its orientation in space. Since the early 1980s, neuroscientists have hypothesized that the primate visual system is divided into two pathways: the ventral stream, which performs object-recognition tasks, and the dorsal stream, which processes features related to spatial location.

Over the past decade, researchers have worked to model the ventral stream using a type of deep-learning model known as a convolutional neural network (CNN). Researchers can train these models to perform object-recognition tasks by feeding them datasets containing thousands of images along with category labels describing the images.

The state-of-the-art versions of these CNNs have high success rates at categorizing images. Additionally, researchers have found that the internal activations of the models are very similar to the activities of neurons that process visual information in the ventral stream. Furthermore, the more similar these models are to the ventral stream, the better they perform at object-recognition tasks. This has led many researchers to hypothesize that the dominant function of the ventral stream is recognizing objects.

However, experimental studies, especially a study from the DiCarlo lab in 2016, have found that the ventral stream appears to encode spatial features as well. These features include the object’s size, its orientation (how much it is rotated), and its location within the field of view. Based on these studies, the MIT team aimed to investigate whether the ventral stream might serve additional functions beyond object recognition.

“Our central question in this project was, is it possible that we can think about the ventral stream as being optimized for doing these spatial tasks instead of just categorization tasks?” Xie says.

To test this hypothesis, the researchers set out to train a CNN to identify one or more spatial features of an object, including rotation, location, and distance. To train the models, they created a new dataset of synthetic images. These images show objects such as tea kettles or calculators superimposed on different backgrounds, in locations and orientations that are labeled to help the model learn them.
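
For readers who want a concrete picture, here is a minimal sketch of how a CNN might be set up for such spatial tasks: a standard backbone with separate regression heads for rotation, location, and distance. The backbone choice, tensor shapes, and loss are illustrative assumptions, not the authors' actual training code.

```python
# Minimal sketch (not the paper's code): a CNN backbone with regression heads
# for spatial targets. The synthetic labels and field names here are hypothetical.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SpatialTaskCNN(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()              # keep the 512-d feature vector
        self.backbone = backbone
        self.rotation_head = nn.Linear(512, 1)   # predicted rotation angle
        self.location_head = nn.Linear(512, 2)   # predicted (x, y) position
        self.distance_head = nn.Linear(512, 1)   # predicted distance to the object

    def forward(self, images):
        features = self.backbone(images)
        return {
            "rotation": self.rotation_head(features),
            "location": self.location_head(features),
            "distance": self.distance_head(features),
        }

model = SpatialTaskCNN()
images = torch.randn(8, 3, 224, 224)             # stand-in for the synthetic images
targets = {"rotation": torch.randn(8, 1),
           "location": torch.randn(8, 2),
           "distance": torch.randn(8, 1)}
preds = model(images)
loss = sum(nn.functional.mse_loss(preds[k], targets[k]) for k in preds)
loss.backward()                                   # one step of multi-task training
```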

The researchers found that CNNs that were trained on just one of these spatial tasks showed a high level of “neuro-alignment” with the ventral stream — very similar to the levels seen in CNN models trained on object recognition.

The researchers measured neuro-alignment using a technique that DiCarlo’s lab has developed, which involves asking the models, once trained, to predict the neural activity that a particular image would generate in the brain. They found that the better the models performed on the spatial task they had been trained on, the more neuro-alignment they showed.
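
The general recipe behind metrics of this kind is to fit a regularized linear mapping from a model layer's activations to recorded neural responses and then score how well held-out responses are predicted. The sketch below illustrates that idea with random stand-in data; the array shapes, the ridge regression, and the correlation score are generic assumptions, not the lab's exact pipeline.

```python
# Toy sketch of neural predictivity: regress recorded neural responses onto
# model-layer activations and score predictions on held-out images.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_features, n_neurons = 500, 256, 100
model_activations = rng.normal(size=(n_images, n_features))   # model layer outputs
neural_responses = rng.normal(size=(n_images, n_neurons))      # recorded responses

X_train, X_test, y_train, y_test = train_test_split(
    model_activations, neural_responses, test_size=0.2, random_state=0)

reg = Ridge(alpha=1.0).fit(X_train, y_train)
predicted = reg.predict(X_test)

# Per-neuron correlation between predicted and held-out responses;
# the mean across neurons serves as a simple "neuro-alignment" score.
scores = [np.corrcoef(predicted[:, i], y_test[:, i])[0, 1]
          for i in range(n_neurons)]
print("mean predictivity:", np.mean(scores))
```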

“I think we cannot assume that the ventral stream is just doing object categorization, because many of these other functions, such as spatial tasks, also can lead to this strong correlation between models’ neuro-alignment and their performance,” Xie says. “Our conclusion is that you can optimize either through categorization or doing these spatial tasks, and they both give you a ventral-stream-like model, based on our current metrics to evaluate neuro-alignment.”

Comparing models

The researchers then investigated why these two approaches — training for object recognition and training for spatial features — led to similar degrees of neuro-alignment. To do that, they performed an analysis known as centered kernel alignment (CKA), which allows them to measure the degree of similarity between representations in different CNNs. This analysis showed that in the early to middle layers of the models, the representations that the models learn are nearly indistinguishable.
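
Linear CKA has a simple closed form: after centering each feature, it is the squared Frobenius norm of the cross-product of the two activation matrices, normalized by the norms of each matrix's self-product. The sketch below is a generic implementation of that formula with stand-in data, not the authors' analysis code.

```python
# Linear centered kernel alignment (CKA) between activation matrices
# X (n_images x d1) and Y (n_images x d2); rows must correspond to the same images.
import numpy as np

def linear_cka(X, Y):
    X = X - X.mean(axis=0, keepdims=True)   # center each feature
    Y = Y - Y.mean(axis=0, keepdims=True)
    numerator = np.linalg.norm(X.T @ Y, "fro") ** 2
    denominator = (np.linalg.norm(X.T @ X, "fro") *
                   np.linalg.norm(Y.T @ Y, "fro"))
    return numerator / denominator

rng = np.random.default_rng(1)
acts_recognition = rng.normal(size=(200, 512))       # a layer from one model
acts_spatial = acts_recognition + rng.normal(size=(200, 512))  # a related layer
print("CKA:", linear_cka(acts_recognition, acts_spatial))      # 1.0 = identical geometry
```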

“In these early layers, essentially you cannot tell these models apart by just looking at their representations,” Xie says. “It seems like they learn some very similar or unified representation in the early to middle layers, and in the later stages they diverge to support different tasks.”

The researchers hypothesize that even when models are trained to analyze just one feature, they also take into account “non-target” features — those that they are not trained on. When objects have greater variability in non-target features, the models tend to learn representations more similar to those learned by models trained on other tasks. This suggests that the models are using all of the information available to them, which may result in different models coming up with similar representations, the researchers say.

“More non-target variability actually helps the model learn a better representation, instead of learning a representation that’s ignorant of them,” Xie says. “It’s possible that the models, although they’re trained on one target, are simultaneously learning other things due to the variability of these non-target features.”

In future work, the researchers hope to develop new ways to compare different models, in hopes of learning more about how each one develops internal representations of objects based on differences in training tasks and training data.

“There could be still slight differences between these models, even though our current way of measuring how similar these models are to the brain tells us they’re on a very similar level. That suggests maybe there’s still some work to be done to improve upon how we can compare the model to the brain, so that we can better understand what exactly the ventral stream is optimized for,” Xie says.

The research was funded by the Semiconductor Research Corporation and the U.S. Defense Advanced Research Projects Agency.

Looking under the hood at the brain’s language system

As a young girl growing up in the former Soviet Union, Evelina Fedorenko PhD ’07 studied several languages, including English, as her mother hoped that it would give her the chance to eventually move abroad for better opportunities.

Her language studies not only helped her establish a new life in the United States as an adult, but also led to a lifelong interest in linguistics and how the brain processes language. Now an associate professor of brain and cognitive sciences at MIT, Fedorenko studies the brain’s language-processing regions: how they arise, whether they are shared with other mental functions, and how each region contributes to language comprehension and production.

Fedorenko’s early work helped to identify the precise locations of the brain’s language-processing regions, and she has been building on that work to generate insight into how different neuronal populations in those regions implement linguistic computations.

“It took a while to develop the approach and figure out how to quickly and reliably find these regions in individual brains, given this standard problem of the brain being a little different across people,” she says. “Then we just kept going, asking questions like: Does language overlap with other functions that are similar to it? How is the system organized internally? Do different parts of this network do different things? There are dozens and dozens of questions you can ask, and many directions that we have pushed on.”

Among some of the more recent directions, she is exploring how the brain’s language-processing regions develop early in life, through studies of very young children, people with unusual brain architecture, and computational models known as large language models.

From Russia to MIT

Fedorenko grew up in the Russian city of Volgograd, which was then part of the Soviet Union. When the Soviet Union broke up in 1991, her mother, a mechanical engineer, lost her job, and the family struggled to make ends meet.

“It was a really intense and painful time,” Fedorenko recalls. “But one thing that was always very stable for me is that I always had a lot of love, from my parents, my grandparents, and my aunt and uncle. That was really important and gave me the confidence that if I worked hard and had a goal, that I could achieve whatever I dreamed about.”

Fedorenko did work hard in school, studying English, French, German, Polish, and Spanish, and she also participated in math competitions. As a 15-year-old, she spent a year attending high school in Alabama, as part of a program that placed students from the former Soviet Union with American families. She had been thinking about applying to universities in Europe but changed her plans when she realized the American higher education system offered more academic flexibility.

After being admitted to Harvard University with a full scholarship, she returned to the United States in 1998 and earned her bachelor’s degree in psychology and linguistics, while also working multiple jobs to send money home to help her family.

While at Harvard, she also took classes at MIT and ended up deciding to apply to the Institute for graduate school. For her PhD research at MIT, she worked with Ted Gibson, a professor of brain and cognitive sciences, and later, Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience. She began by using functional magnetic resonance imaging (fMRI) to study brain regions that appeared to respond preferentially to music, but she soon switched to studying brain responses to language.

She found that working with Kanwisher, who studies the functional organization of the human brain but hadn’t worked much on language before, helped her build a research program free of potential biases baked into some of the early work on language processing in the brain.

“We really kind of started from scratch,” Fedorenko says, “combining the knowledge of language processing I have gained by working with Gibson and the rigorous neuroscience approaches that Kanwisher had developed when studying the visual system.”

After finishing her PhD in 2007, Fedorenko stayed at MIT for a few years as a postdoc funded by the National Institutes of Health, continuing her research with Kanwisher. During that time, she and Kanwisher developed techniques to identify language-processing regions in different people, and discovered new evidence that certain parts of the brain respond selectively to language. Fedorenko then spent five years as a research faculty member at Massachusetts General Hospital, before receiving an offer to join the faculty at MIT in 2019.

How the brain processes language

Since starting her lab at MIT’s McGovern Institute for Brain Research, Fedorenko and her trainees have made several discoveries that have helped to refine neuroscientists’ understanding of the brain’s language-processing regions, which are spread across the left frontal and temporal lobes of the brain.

In a series of studies, her lab showed that these regions are highly selective for language and are not engaged by activities such as listening to music, reading computer code, or interpreting facial expressions, all of which have been argued to share similarities with language processing.

“We’ve separated the language-processing machinery from various other systems, including the system for general fluid thinking, and the systems for social perception and reasoning, which support the processing of communicative signals, like facial expressions and gestures, and reasoning about others’ beliefs and desires,” Fedorenko says. “So that was a significant finding, that this system really is its own thing.”

More recently, Fedorenko has turned her attention to figuring out, in more detail, the functions of different parts of the language processing network. In one recent study, she identified distinct neuronal populations within these regions that appear to have different temporal windows for processing linguistic content, ranging from just one word up to six words.

She is also studying how language-processing circuits arise in the brain, with ongoing studies in which she and a postdoc in her lab are using fMRI to scan the brains of young children, observing how their language regions behave even before the children have fully learned to speak and understand language.

Large language models (similar to ChatGPT) can help with these types of developmental questions, as the researchers can better control the language inputs to the model and have continuous access to its abilities and representations at different stages of learning.

“You can train models in different ways, on different kinds of language, in different kinds of regimens. For example, training on simpler language first and then more complex language, or on language combined with some visual inputs. Then you can look at the performance of these language models on different tasks, and also examine changes in their internal representations across the training trajectory, to test which model best captures the trajectory of human language learning,” Fedorenko says.

To gain another window into how the brain develops language ability, Fedorenko launched the Interesting Brains Project several years ago. Through this project, she is studying people who experienced some type of brain damage early in life, such as a prenatal stroke or brain deformation caused by a congenital cyst. In some of these individuals, the damage destroyed or significantly deformed the brain’s typical language-processing areas, yet they are cognitively indistinguishable from people with typical brains: They still learned to speak and understand language normally, and in some cases they didn’t even realize their brains were in some way atypical until they were adults.

“That study is all about plasticity and redundancy in the brain, trying to figure out what brains can cope with, and how,” Fedorenko says. “Are there many solutions to build a human mind, even when the neural infrastructure is so different-looking?”

To the brain, Esperanto and Klingon appear the same as English or Mandarin

Within the human brain, a network of regions has evolved to process language. These regions are consistently activated whenever people listen to their native language or any language in which they are proficient.

A new study by MIT researchers finds that this network also responds to languages that are completely invented, such as Esperanto, which was created in the late 1800s as a way to promote international communication, and even to languages made up for television shows such as “Star Trek” and “Game of Thrones.”

To study how the brain responds to these artificial languages, MIT neuroscientists convened nearly 50 speakers of these languages over a single weekend. Using functional magnetic resonance imaging (fMRI), the researchers found that when participants listened to a constructed language in which they were proficient, the same brain regions lit up as those activated when they processed their native language.

“We find that constructed languages very much recruit the same system as natural languages, which suggests that the key feature that is necessary to engage the system may have to do with the kinds of meanings that both kinds of languages can express,” says Evelina Fedorenko, an associate professor of neuroscience at MIT, a member of MIT’s McGovern Institute for Brain Research and the senior author of the study.

The findings help to define some of the key properties of language, the researchers say, and suggest that a language does not need to have evolved naturally over a long period of time, or to have a large number of speakers, in order to engage the brain’s language network.

“It helps us narrow down this question of what a language is, and do it empirically, by testing how our brain responds to stimuli that might or might not be language-like,” says Saima Malik-Moraleda, an MIT postdoc and the lead author of the paper, which appears this week in the Proceedings of the National Academy of Sciences.

Convening the conlang community

Unlike natural languages, which evolve within communities and are shaped over time, constructed languages, or “conlangs,” are typically created by one person who decides what sounds will be used, how to label different concepts, and what the grammatical rules are.

Esperanto, the most widely spoken conlang, was created in 1887 by L.L. Zamenhof, who intended it to be used as a universal language for international communication. Currently, it is estimated that around 60,000 people worldwide are proficient in Esperanto.

In previous work, Fedorenko and her students have found that computer programming languages, such as Python — another type of invented language — do not activate the brain network that is used to process natural language. Instead, people who read computer code rely on the so-called multiple demand network, a brain system that is often recruited for difficult cognitive tasks.

Fedorenko and others have also investigated how the brain responds to other stimuli that share features with language, including music and nonverbal communication such as gestures and facial expressions.

“We spent a lot of time looking at all these various kinds of stimuli, finding again and again that none of them engage the language-processing mechanisms,” Fedorenko says. “So then the question becomes, what is it that natural languages have that none of those other systems do?”

That led the researchers to wonder if artificial languages like Esperanto would be processed more like programming languages or more like natural languages. Similar to programming languages, constructed languages are created by an individual for a specific purpose, without natural evolution within a community. However, unlike programming languages, both conlangs and natural languages can be used to convey meanings about the state of the external world or the speaker’s internal state.

To explore how the brain processes conlangs, the researchers invited speakers of Esperanto and several other constructed languages to MIT for a weekend conference in November 2022. The other languages included Klingon (from “Star Trek”), Na’vi (from “Avatar”), and two languages from “Game of Thrones” (High Valyrian and Dothraki). For all of these languages, there are texts available for people who want to learn the language, and for Esperanto, Klingon, and High Valyrian, there is even a Duolingo app available.

“It was a really fun event where all the communities came to participate, and over a weekend, we collected all the data,” says Malik-Moraleda, who co-led the data collection effort with former MIT postbac Maya Taliaferro, now a PhD student at New York University.

During that event, which also featured talks from several of the conlang creators, the researchers used fMRI to scan 44 conlang speakers as they listened to sentences from the constructed language in which they were proficient. The creators of these languages — who are co-authors on the paper — helped construct the sentences that were presented to the participants.

While in the scanner, the participants also either listened to or read sentences in their native language, and performed some nonlinguistic tasks for comparison. The researchers found that when people listened to a conlang, the same language regions in the brain were activated as when they listened to their native language.

Common features

The findings help to identify some of the key features that are necessary to recruit the brain’s language processing areas, the researchers say. One of the main characteristics driving language responses seems to be the ability to convey meanings about the interior and exterior world — a trait that is shared by natural and constructed languages, but not programming languages.

“All of the languages, both natural and constructed, express meanings related to inner and outer worlds. They refer to objects in the world, to properties of objects, to events,” Fedorenko says. “Whereas programming languages are much more similar to math. A programming language is a symbolic generative system that allows you to express complex meanings, but it’s a self-contained system: The meanings are highly abstract and mostly relational, and not connected to the real world that we experience.”

Some other characteristics of natural languages, which are not shared by constructed languages, don’t seem to be necessary to generate a response in the language network.

“It doesn’t matter whether the language is created and shaped over time by a community of speakers, because these constructed languages are not,” Malik-Moraleda says. “It doesn’t matter how old they are, because conlangs that are just a decade old engage the same brain regions as natural languages that have been around for many hundreds of years.”

To further refine the features of language that activate the brain’s language network, Fedorenko’s lab is now planning to study how the brain responds to a conlang called Lojban, which was created by the Logical Language Group in the 1990s and was designed to prevent ambiguity of meanings and promote more efficient communication.

The research was funded by MIT’s McGovern Institute for Brain Research, Brain and Cognitive Sciences Department, the Simons Center for the Social Brain, the Frederick A. and Carole J. Middleton Career Development Professorship, and the U.S. National Institutes of Health.

How nature organizes itself, from brain cells to ecosystems

McGovern Associate Investigator Ila Fiete. Photo: Caitlin Cunningham

Look around, and you’ll see it everywhere: the way trees form branches, the way cities divide into neighborhoods, the way the brain organizes into regions. Nature loves modularity—a limited number of self-contained units that combine in different ways to perform many functions. But how does this organization arise? Does it follow a detailed genetic blueprint, or can these structures emerge on their own?

A new study from McGovern Associate Investigator Ila Fiete suggests a surprising answer.

In findings published today in Nature, Fiete, a professor of brain and cognitive sciences and director of the K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center at MIT, reports that a mathematical model called peak selection can explain how modules emerge without strict genetic instructions. Her team’s findings, which apply to brain systems and ecosystems, help explain how modularity occurs across nature, no matter the scale.

Joining two big ideas

“Scientists have debated how modular structures form. One hypothesis suggests that various genes are turned on at different locations to begin or end a structure. This explains how insect embryos develop body segments, with genes turning on or off at specific concentrations of a smooth chemical gradient in the insect egg,” says Fiete, who is the senior author of the paper. Mikail Khona, a former graduate student and K. Lisa Yang ICoN Center Graduate Fellow, and postdoctoral associate Sarthak Chandra also led the study.

Another idea, inspired by mathematician Alan Turing, suggests that a structure could emerge from competition—small-scale interactions can create repeating patterns, like the spots on a cheetah or the ripples in sand dunes.

Both ideas work well in some cases, but fail in others. The new research suggests that nature need not pick one approach over the other. The authors propose a simple mathematical principle called peak selection, showing that when a smooth gradient is paired with local interactions that are competitive, modular structures emerge naturally. “In this way, biological systems can organize themselves into sharp modules without detailed top-down instruction,” says Chandra.
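
As a rough intuition for how a smooth gradient plus local interactions can produce sharp modules, consider the toy simulation below: units along a line receive a smoothly graded, noisy drive toward a handful of discrete options, and local cooperation among neighbors combined with competition between options snaps the population into contiguous blocks with sharp boundaries. This is only an illustrative cartoon with arbitrary parameters, not the peak-selection model from the paper.

```python
# Toy illustration (not the paper's model): a smooth, noisy gradient plus local
# cooperation/competition yields contiguous modules with sharp boundaries.
import numpy as np

rng = np.random.default_rng(0)
n_units, n_options = 200, 5
gradient = np.linspace(0.0, 1.0, n_units) + 0.1 * rng.normal(size=n_units)
options = np.linspace(0.1, 0.9, n_options)        # discrete candidate "peaks"

# Feedforward drive: how well each unit's graded value matches each option.
drive = np.exp(-(gradient[:, None] - options[None, :]) ** 2 / (2 * 0.2 ** 2))

choice = drive.argmax(axis=1)                      # initial, noisy assignment
print("boundaries before local interactions:", np.count_nonzero(np.diff(choice)))

k = 15                                             # neighborhood size
for _ in range(50):
    votes = np.zeros((n_units, n_options))
    votes[np.arange(n_units), choice] = 1.0
    # Local cooperation: each unit pools its neighbors' current votes.
    pooled = np.column_stack(
        [np.convolve(votes[:, j], np.ones(k) / k, mode="same")
         for j in range(n_options)])
    # Competition: the option with the strongest combined support wins.
    choice = (drive + 2.0 * pooled).argmax(axis=1)

# The ragged assignment collapses into a few contiguous blocks (modules).
print("boundaries after local interactions:", np.count_nonzero(np.diff(choice)))
```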

Modular systems in the brain

The researchers tested their idea on grid cells, which play a critical role in spatial navigation as well as the storage of episodic memories. Grid cells fire in a repeating triangular pattern as animals move through space, but they don’t all work at the same scale—they are organized into distinct modules, each responsible for mapping space at slightly different resolutions.

A visual depiction of two different modules in grid cells, used to map space at slightly different resolutions. Image: Fiete Lab

No one knows how these modules form, but Fiete’s model shows that gradual variations in cellular properties along one dimension in the brain, combined with local neural interactions, could explain the entire structure. The grid cells naturally sort themselves into distinct groups with clear boundaries, without external maps or genetic programs telling them where to go. “Our work explains how grid cell modules could emerge. The explanation tips the balance toward the possibility of self-organization. It predicts that there might be no gene or intrinsic cell property that jumps when the grid cell scale jumps to another module,” notes Khona.

Modular systems in nature

The same principle applies beyond neuroscience. Imagine a landscape where temperature and rainfall vary gradually across space. You might expect species to be spread out and to vary just as smoothly across the region. But in reality, ecosystems often form species clusters with sharp boundaries—distinct ecological “neighborhoods” that don’t overlap.

Fiete’s study suggests why: Local competition, cooperation, and predation between species interact with the global environmental gradients to create natural separations, even when the underlying conditions change gradually. This phenomenon can be explained using peak selection—and suggests that the same principle that shapes brain circuits could also be at play in forests and oceans.

A self-organizing world

One of the researchers’ most striking findings is that modularity in these systems is remarkably robust. Change the size of the system, and the number of modules stays the same; the modules simply scale up or down. That means a mouse brain and a human brain could use the same fundamental rules to form their navigation circuits, just at different sizes.

The model also makes testable predictions. If it’s correct, grid cell modules should follow simple spacing ratios. In ecosystems, species distributions should form distinct clusters even without sharp environmental shifts.

Fiete notes that their work adds another conceptual framework to biology. “Peak selection can inform future experiments, not only in grid cell research but across developmental biology.”

Evelina Fedorenko receives Troland Award from National Academy of Sciences

The National Academy of Sciences (NAS) announced today that McGovern Investigator Evelina Fedorenko will receive a 2025 Troland Research Award for her groundbreaking contributions towards understanding the language network in the human brain.

The Troland Research Award is given annually to recognize unusual achievement by early-career researchers within the broad spectrum of experimental psychology.

McGovern Investigator Ev Fedorenko (center) looks at a young subject’s brain scan in the Martinos Imaging Center at MIT. Photo: Alexandra Sokhina

Fedorenko, who is an associate professor of brain and cognitive sciences at MIT, is interested in how minds and brains create language. Her lab is unpacking the internal architecture of the brain’s language system and exploring the relationship between language and various cognitive, perceptual, and motor systems. Her novel methods combine precise measures of an individual’s brain organization with innovative computational modeling to make fundamental discoveries about the computations that underlie the uniquely human ability for language.

Fedorenko has shown that the language network is selective for language processing over diverse non-linguistic processes that have been argued to share computational demands with language, such as math, music, and social reasoning. Her work has also demonstrated that syntactic processing is not localized to a particular region within the language network, and that every brain region that responds to syntactic processing is at least as sensitive to word meanings.

She has also shown that representations from neural network language models, such as ChatGPT, are similar to those in the brain’s language areas. Fedorenko also highlighted that although language models can master linguistic rules and patterns, they are less effective at using language in real-world situations. In the human brain, that kind of functional competence is distinct from formal language competence, she says, requiring not just language-processing circuits but also brain areas that store knowledge of the world, reason, and interpret social interactions. Contrary to a prominent view that language is essential for thinking, Fedorenko argues that language is not the medium of thought and is primarily a tool for communication.

A probabilistic atlas of the human language network based on >800 individuals (center) and sample individual language networks, which illustrate inter-individual variability in the precise locations and shapes of the language areas. Image: Ev Fedorenko

Ultimately, Fedorenko’s cutting-edge work is uncovering the computations and representations that fuel language processing in the brain. She will receive the Troland Award this April, during the annual meeting of the NAS in Washington DC.

How one brain circuit encodes memories of both places and events

Nearly 50 years ago, neuroscientists discovered cells within the brain’s hippocampus that store memories of specific locations. These cells also play an important role in storing memories of events, known as episodic memories. While the mechanism of how place cells encode spatial memory has been well-characterized, it has remained a puzzle how they encode episodic memories.

A new model developed by MIT researchers explains how those place cells can be recruited to form episodic memories, even when there’s no spatial component. According to this model, place cells, along with grid cells found in the entorhinal cortex, act as a scaffold that can be used to anchor memories as a linked series.

“This model is a first-draft model of the entorhinal-hippocampal episodic memory circuit. It’s a foundation to build on to understand the nature of episodic memory. That’s the thing I’m really excited about,” says Ila Fiete, a professor of brain and cognitive sciences at MIT, a member of MIT’s McGovern Institute for Brain Research, and the senior author of the new study.

The model accurately replicates several features of biological memory systems, including the large storage capacity, gradual degradation of older memories, and the ability of people who compete in memory competitions to store enormous amounts of information in “memory palaces.”

MIT Research Scientist Sarthak Chandra and Sugandha Sharma PhD ’24 are the lead authors of the study, which appears today in Nature. Rishidev Chaudhuri, an assistant professor at the University of California at Davis, is also an author of the paper.

An index of memories

To encode spatial memory, place cells in the hippocampus work closely with grid cells — a special type of neuron that fires at many different locations, arranged geometrically in a regular pattern of repeating triangles. Together, a population of grid cells forms a lattice of triangles representing a physical space.

In addition to helping us recall places where we’ve been, these hippocampal-entorhinal circuits also help us navigate new locations. From human patients, it’s known that these circuits are also critical for forming episodic memories, which might have a spatial component but mainly consist of events, such as how you celebrated your last birthday or what you had for lunch yesterday.

“The same hippocampal and entorhinal circuits are used not just for spatial memory, but also for general episodic memory,” says Fiete, who is also the director of the K. Lisa Yang ICoN Center at MIT. “The question you can ask is what is the connection between spatial and episodic memory that makes them live in the same circuit?”

Two hypotheses have been proposed to account for this overlap in function. One is that the circuit is specialized to store spatial memories because those types of memories — remembering where food was located or where predators were seen — are important to survival. Under this hypothesis, this circuit encodes episodic memories as a byproduct of spatial memory.

An alternative hypothesis suggests that the circuit is specialized to store episodic memories, but also encodes spatial memory because location is one aspect of many episodic memories.

In this work, Fiete and her colleagues proposed a third option: that the peculiar tiling structure of grid cells and their interactions with hippocampus are equally important for both types of memory — episodic and spatial. To develop their new model, they built on computational models that her lab has been developing over the past decade, which mimic how grid cells encode spatial information.

“We reached the point where I felt like we understood on some level the mechanisms of the grid cell circuit, so it felt like the time to try to understand the interactions between the grid cells and the larger circuit that includes the hippocampus,” Fiete says.

In the new model, the researchers hypothesized that grid cells interacting with hippocampal cells can act as a scaffold for storing either spatial or episodic memory. Each activation pattern within the grid defines a “well,” and these wells are spaced out at regular intervals. The wells don’t store the content of a specific memory, but each one acts as a pointer to a specific memory, which is stored in the synapses between the hippocampus and the sensory cortex.

When the memory is triggered later from fragmentary pieces, grid and hippocampal cell interactions drive the circuit state into the nearest well, and the state at the bottom of the well connects to the appropriate part of the sensory cortex to fill in the details of the memory. The sensory cortex is much larger than the hippocampus and can store vast amounts of memory.

“Conceptually, we can think about the hippocampus as a pointer network. It’s like an index that can be pattern-completed from a partial input, and that index then points toward sensory cortex, where those inputs were experienced in the first place,” Fiete says. “The scaffold doesn’t contain the content, it only contains this index of abstract scaffold states.”

Furthermore, events that occur in sequence can be linked together: Each well in the grid cell-hippocampal network efficiently stores the information that is needed to activate the next well, allowing memories to be recalled in the right order.
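
A toy version of this scaffold-and-pointer idea can be written down in a few lines: fixed random scaffold states act as an index, the memory content lives only in a readout from the scaffold to a larger "sensory" layer, and a stored transition links each scaffold state to the next so the episode can be replayed from a noisy cue. The sizes and the simple sign-based dynamics below are illustrative assumptions, not the published model.

```python
# Toy sketch of the scaffold-as-index idea (not the authors' model).
import numpy as np

rng = np.random.default_rng(0)
n_scaffold, n_sensory, n_events = 256, 1024, 10

scaffold = np.sign(rng.normal(size=(n_events, n_scaffold)))   # fixed "index" states
content = np.sign(rng.normal(size=(n_events, n_sensory)))     # sensory patterns to store

# Hetero-associative storage: scaffold state -> content, and scaffold -> next scaffold.
W_readout = content.T @ scaffold / n_scaffold
W_next = scaffold[1:].T @ scaffold[:-1] / n_scaffold

def nearest_well(state):
    """Pattern completion: snap a (possibly noisy) state to the closest scaffold state."""
    return scaffold[np.argmax(scaffold @ state)]

cue = scaffold[0].copy()
cue[: n_scaffold // 4] *= -1                 # corrupt a quarter of the cue's bits
state = nearest_well(cue)

accuracies = []
for step in range(n_events):
    recalled = np.sign(W_readout @ state)    # fill in the stored details
    accuracies.append(np.mean(recalled == content[step]))
    if step < n_events - 1:
        state = nearest_well(W_next @ state) # hop to the next well in the sequence

# For these sizes the mean bit accuracy should be at or very near 1.0.
print("mean bit accuracy across the replayed episode:", np.mean(accuracies))
```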

Modeling memory cliffs and palaces

The researchers’ new model replicates several memory-related phenomena much more accurately than existing models that are based on Hopfield networks — a type of neural network that can store and recall patterns.

While Hopfield networks offer insight into how memories can be formed by strengthening connections between neurons, they don’t perfectly model how biological memory works. In Hopfield models, every memory is recalled in perfect detail until capacity is reached. At that point, no new memories can form, and worse, attempting to add more memories erases all prior ones. This “memory cliff” doesn’t accurately mimic what happens in the biological brain, which tends to gradually forget the details of older memories while new ones are continually added.
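
The memory cliff is easy to reproduce in a textbook Hopfield network: store random patterns with a Hebbian rule, and recall quality stays high until the number of patterns approaches roughly 0.14 times the number of neurons, after which it degrades sharply. The sketch below is a standard demonstration of that behavior, not the networks used in the new study.

```python
# Classic Hopfield "memory cliff": recall stays near-perfect until the number of
# stored random patterns approaches ~0.14 * N, then degrades sharply.
import numpy as np

rng = np.random.default_rng(0)
N = 500                                          # number of neurons

def recall_overlap(num_patterns, flip_frac=0.1, steps=20):
    patterns = np.sign(rng.normal(size=(num_patterns, N)))
    W = patterns.T @ patterns / N                # Hebbian weight matrix
    np.fill_diagonal(W, 0.0)
    probe = patterns[0].copy()
    flipped = rng.choice(N, int(flip_frac * N), replace=False)
    probe[flipped] *= -1                         # corrupt 10% of the bits
    for _ in range(steps):                       # simple synchronous updates
        probe = np.sign(W @ probe)
        probe[probe == 0] = 1
    return float(np.mean(probe == patterns[0]))  # 1.0 means perfect recall

for P in (25, 50, 70, 100, 150):                 # 5% to 30% of N
    print(f"{P:3d} stored patterns -> recall overlap {recall_overlap(P):.2f}")
```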

The new MIT model captures findings from decades of recordings of grid and hippocampal cells in rodents made as the animals explore and forage in various environments. It also helps to explain the underlying mechanisms for a memorization strategy known as a memory palace. One of the tasks in memory competitions is to memorize the shuffled sequence of cards in one or several card decks. Competitors usually do this by assigning each card to a particular spot in a memory palace — a memory of a childhood home or other environment they know well. When they need to recall the cards, they mentally stroll through the house, visualizing each card in its spot as they go along. Counterintuitively, adding the memory burden of associating cards with locations makes recall stronger and more reliable.

The MIT team’s computational model was able to perform such tasks very well, suggesting that memory palaces take advantage of the memory circuit’s own strategy of associating inputs with a scaffold in the hippocampus, but one level down: Long-acquired memories reconstructed in the larger sensory cortex can now be pressed into service as a scaffold for new memories. This allows for the storage and recall of many more items in a sequence than would otherwise be possible.

The researchers now plan to build on their model to explore how episodic memories could become converted to cortical “semantic” memory, or the memory of facts dissociated from the specific context in which they were acquired (for example, Paris is the capital of France), how episodes are defined, and how brain-like memory models could be integrated into modern machine learning.

The research was funded by the U.S. Office of Naval Research, the National Science Foundation under the Robust Intelligence program, the ARO-MURI award, the Simons Foundation, and the K. Lisa Yang ICoN Center.

How the brain prevents us from falling

This post is adapted from an MIT research news story.

***

As we navigate the world, we adapt our movement in response to changes in the environment. From rocky terrain to moving escalators, we seamlessly modify our movements to maximize energy efficiency and reduce our risk of falling. The computational principles underlying this phenomenon, however, are not well understood.

In a recent paper published in the journal Nature Communications, MIT researchers proposed a model that explains how humans continuously adapt yet remain stable during complex tasks like walking.

“Much of our prior theoretical understanding of adaptation has been limited to episodic tasks, such as reaching for an object in a novel environment,” says senior author Nidhi Seethapathi, the Frederick A. (1971) and Carole J. Middleton Career Development Assistant Professor of Brain and Cognitive Sciences at MIT. “This new theoretical model captures adaptation phenomena in continuous long-horizon tasks in multiple locomotor settings.”

Barrett Clark, a robotics software engineer at Bright Minds Inc, and Manoj Srinivasan, an associate professor in the Department of Mechanical and Aerospace Engineering at Ohio State University, are also authors on the paper.

Principles of locomotor adaptation

In episodic tasks, like reaching for an object, errors during one episode do not affect the next episode. In tasks like locomotion, errors can have a cascade of short-term and long-term consequences for stability unless they are controlled. This makes the challenge of adapting locomotion in a new environment more complex.

To build the model, the researchers identified general principles of locomotor adaptation across a variety of task settings, and developed a unified modular and hierarchical model of locomotor adaptation, with each component having its own unique mathematical structure.

The resulting model successfully encapsulates how humans adapt their walking in novel settings such as on a split-belt treadmill with each foot at a different speed, wearing asymmetric leg weights, and wearing an exoskeleton. The authors report that the model successfully reproduced human locomotor adaptation phenomena across novel settings in 10 prior studies and correctly predicted the adaptation behavior observed in two new experiments conducted as part of the study.

The model has potential applications in sensorimotor learning, rehabilitation, and wearable robotics.

“Having a model that can predict how a person will adapt to a new environment has immense utility for engineering better rehabilitation paradigms and wearable robot control,” says Seethapathi, who is also an associate investigator at MIT’s McGovern Institute. “You can think of a wearable robot itself as a new environment for the person to move in, and our model can be used to predict how a person will adapt for different robot settings. Understanding such human-robot adaptation is currently an experimentally intensive process, and our model could help speed up the process by narrowing the search space.”

For healthy hearing, timing matters

When soundwaves reach the inner ear, neurons there pick up the vibrations and alert the brain. Encoded in their signals is a wealth of information that enables us to follow conversations, recognize familiar voices, appreciate music, and quickly locate a ringing phone or crying baby.

McGovern Institute Associate Investigator Josh McDermott. Photo: Justin Knight

Neurons send signals by emitting spikes, also known as action potentials—brief changes in voltage that propagate along nerve fibers. Remarkably, auditory neurons can fire hundreds of spikes per second, and time their spikes with exquisite precision to match the oscillations of incoming soundwaves.

With powerful new models of human hearing, scientists at MIT’s McGovern Institute have determined that this precise timing is vital for some of the most important ways we make sense of auditory information, including recognizing voices and localizing sounds.

The findings, reported December 4, 2024, in the journal Nature Communications, show how machine learning can help neuroscientists understand how the brain uses auditory information in the real world. McGovern Investigator Josh McDermott, who led the research, explains that his team’s models better equip researchers to study the consequences of different types of hearing impairment and devise more effective interventions.

Science of sound

The nervous system’s auditory signals are timed so precisely that researchers have long suspected timing is important to our perception of sound. Soundwaves oscillate at rates that determine their pitch: low-pitched sounds travel in slow waves, whereas high-pitched sound waves oscillate more frequently. The auditory nerve, which relays information from sound-detecting hair cells in the ear to the brain, generates electrical spikes that correspond to the frequency of these oscillations. “The action potentials in an auditory nerve get fired at very particular points in time relative to the peaks in the stimulus waveform,” explains McDermott, who is also an associate professor of brain and cognitive sciences at MIT.

This relationship, known as phase-locking, requires neurons to time their spikes with sub-millisecond precision. But scientists haven’t really known how informative these temporal patterns are to the brain. Beyond being scientifically intriguing, McDermott says, the question has important clinical implications: “If you want to design a prosthesis that provides electrical signals to the brain to reproduce the function of the ear, it’s arguably pretty important to know what kinds of information in the normal ear actually matter,” he says.
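
Phase-locking precision is commonly summarized with a quantity called vector strength, which is 1 when every spike lands at the same phase of the stimulus cycle and near 0 when spike times are unrelated to it. The sketch below, a generic illustration rather than the study's analysis, shows how quickly sub-millisecond jitter erodes phase-locking to a 1 kHz tone.

```python
# Vector strength quantifies phase-locking; sub-millisecond jitter erodes it
# quickly at high frequencies.
import numpy as np

rng = np.random.default_rng(0)
freq = 1000.0                                    # 1 kHz tone, period = 1 ms
n_spikes = 2000
spike_times = rng.integers(0, 500, size=n_spikes) / freq   # perfectly phase-locked (s)

def vector_strength(times, f):
    phases = 2 * np.pi * f * times
    return float(np.abs(np.mean(np.exp(1j * phases))))

for jitter_ms in (0.0, 0.1, 0.25, 0.5, 1.0):
    jittered = spike_times + rng.normal(0.0, jitter_ms / 1000.0, size=n_spikes)
    print(f"jitter = {jitter_ms:.2f} ms -> "
          f"vector strength {vector_strength(jittered, freq):.2f}")
```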

This has been difficult to study experimentally: Animal models can’t offer much insight into how the human brain extracts structure in language or music, and the auditory nerve is inaccessible for study in humans. So McDermott and graduate student Mark Saddler turned to artificial neural networks.

Artificial hearing

Neuroscientists have long used computational models to explore how sensory information might be decoded by the brain, but until recent advances in computing power and machine learning methods, these models were limited to simulating simple tasks. “One of the problems with these prior models is that they’re often way too good,” says Saddler, who is now at the Technical University of Denmark. For example, a computational model tasked with identifying the higher pitch in a pair of simple tones is likely to perform better than people who are asked to do the same thing. “This is not the kind of task that we do every day in hearing,” Saddler points out. “The brain is not optimized to solve this very artificial task.” This mismatch limited the insights that could be drawn from this prior generation of models.

To better understand the brain, Saddler and McDermott wanted to challenge a hearing model to do things that people use their hearing for in the real world, like recognizing words and voices. That meant developing an artificial neural network to simulate the parts of the brain that receive input from the ear. The network was given input from some 32,000 simulated sound-detecting sensory neurons and then optimized for various real-world tasks.

The researchers showed that their model replicated human hearing well—better than any previous model of auditory behavior, McDermott says. In one test, the artificial neural network was asked to recognize words and voices within dozens of types of background noise, from the hum of an airplane cabin to enthusiastic applause. Under every condition, the model performed very similarly to humans.

“The ability to link patterns of firing in the auditory nerve with behavior opens a lot of doors.” – Josh McDermott

When the team degraded the timing of the spikes in the simulated ear, however, their model could no longer match humans’ ability to recognize voices or identify the locations of sounds. For example, while McDermott’s team had previously shown that people use pitch to help them identify people’s voices, the model revealed that this ability is lost without precisely timed signals. “You need quite precise spike timing in order to both account for human behavior and to perform well on the task,” Saddler says. That suggests that the brain uses precisely timed auditory signals because they aid these practical aspects of hearing.

The team’s findings demonstrate how artificial neural networks can help neuroscientists understand how the information extracted by the ear influences our perception of the world, both when hearing is intact and when it is impaired. “The ability to link patterns of firing in the auditory nerve with behavior opens a lot of doors,” McDermott says.

“Now that we have these models that link neural responses in the ear to auditory behavior, we can ask, ‘If we simulate different types of hearing loss, what effect is that going to have on our auditory abilities?’” McDermott says. “That will help us better diagnose hearing loss, and we think there are also extensions of that to help us design better hearing aids or cochlear implants.” For example, he says, “The cochlear implant is limited in various ways—it can do some things and not others. What’s the best way to set up that cochlear implant to enable you to mediate behaviors? You can, in principle, use the models to tell you that.”

Finding some stability in adaptable brains

One of the brain’s most celebrated qualities is its adaptability. Changes to neural circuits, whose connections are continually adjusted as we experience and interact with the world, are key to how we learn. But to keep knowledge and memories intact, some parts of the circuitry must be resistant to this constant change.

“Brains have figured out how to navigate this landscape of balancing between stability and flexibility, so that you can have new learning and you can have lifelong memory,” says neuroscientist Mark Harnett, an investigator at MIT’s McGovern Institute.

In the August 27, 2024, issue of the journal Cell Reports, Harnett and his team show how individual neurons can contribute to both parts of this vital duality. By studying the synapses through which pyramidal neurons in the brain’s sensory cortex communicate, they have learned how the cells preserve their understanding of some of the world’s most fundamental features, while also maintaining the flexibility they need to adapt to a changing world.

McGovern Institute Investigator Mark Harnett. Photo: Adam Glanzman

Visual connections

Pyramidal neurons receive input from other neurons via thousands of connection points. Early in life, these synapses are extremely malleable; their strength can shift as a young animal takes in visual information and learns to interpret it. Most remain adaptable into adulthood, but Harnett’s team discovered that some of the cells’ synapses lose their flexibility when the animals are less than a month old. Having both stable and flexible synapses means these neurons can combine input from different sources to use visual information in flexible ways.

A confocal image of a mouse brain showing dLGN neurons in pink. Image: Courtney Yaeger, Mark Harnett.

Postdoctoral fellow Courtney Yaeger took a close look at these unusually stable synapses, which cluster together along a narrow region of the elaborately branched pyramidal cells. She was interested in the connections through which the cells receive primary visual information, so she traced their connections with neurons in a vision-processing center of the brain’s thalamus called the dorsal lateral geniculate nucleus (dLGN).

The long extensions through which a neuron receives signals from other cells are called dendrites, and they branch off from the main body of the cell into a tree-like structure. Spiny protrusions along the dendrites form the synapses that connect pyramidal neurons to other cells. Yaeger’s experiments showed that connections from the dLGN all led to a defined region of the pyramidal cells—a tight band within what she describes as the trunk of the dendritic tree.

Yaeger found several ways in which synapses in this region—formally known as the apical oblique dendrite domain—differ from other synapses on the same cells. “They’re not actually that far away from each other, but they have completely different properties,” she says.

Stable synapses

In one set of experiments, Yaeger activated synapses on the pyramidal neurons and measured the effect on the cells’ electrical potential. Changes to a neuron’s electrical potential generate the impulses the cells use to communicate with one another. It is common for a synapse’s electrical effects to amplify when synapses nearby are also activated. But when signals were delivered to the apical oblique dendrite domain, each one had the same effect, no matter how many synapses were stimulated. Synapses there don’t interact with one another at all, Harnett says. “They just do what they do. No matter what their neighbors are doing, they all just do kind of the same thing.”

Representative oblique (top) and basal (bottom) dendrites from the same Layer 5 pyramidal neuron imaged across 7 days. Transient spines are labeled with yellow arrowheads the day before disappearance. Image: Courtney Yaeger, Mark Harnett.

The team was also able to visualize the molecular contents of individual synapses. This revealed a surprising lack of a certain kind of neurotransmitter receptor, the NMDA receptor, in the apical oblique dendrites. That was notable because of NMDA receptors’ role in mediating changes in the brain. “Generally when we think about any kind of learning and memory and plasticity, it’s NMDA receptors that do it,” Harnett says. “That is by far the most common substrate of learning and memory in all brains.”

When Yaeger stimulated the apical oblique synapses with electricity, generating patterns of activity that would strengthen most synapses, the team discovered a consequence of the limited presence of NMDA receptors. The synapses’ strength did not change. “There’s no activity-dependent plasticity going on there, as far as we have tested,” Yaeger says.

That makes sense, the researchers say, because the cells’ connections from the thalamus relay primary visual information detected by the eyes. It is through these connections that the brain learns to recognize basic visual features like shapes and lines.

“These synapses are basically a robust, high fidelity readout of this visual information,” Harnett explains. “That’s what they’re conveying, and it’s not context sensitive. So it doesn’t matter how many other synapses are active, they just do exactly what they’re going to do, and you can’t modify them up and down based on activity. So they’re very, very stable.”

“You actually don’t want those to be plastic,” adds Yaeger.

“Can you imagine going to sleep and then forgetting what a vertical line looks like? That would be disastrous.” – Courtney Yaeger

By conducting the same experiments in mice of different ages, the researchers determined that the synapses that connect pyramidal neurons to the thalamus become stable a few weeks after young mice first open their eyes. By that point, Harnett says, they have learned everything they need to learn. On the other hand, if mice spend the first weeks of their lives in the dark, the synapses never stabilize—further evidence that the transition depends on visual experience.

The team’s findings not only help explain how the brain balances flexibility and stability, they could help researchers teach artificial intelligence how to do the same thing. Harnett says artificial neural networks are notoriously bad at this: When an artificial neural network that does something well is trained to do something new, it almost always experiences “catastrophic forgetting” and can no longer perform its original task. Harnett’s team is exploring how they can use what they’ve learned about real brains to overcome this problem in artificial networks.

Finding the way

This story also appears in the Fall 2024 issue of BrainScan.

___

When you arrive in a new city, every outing can be an exploration. You may know your way to a few places, but only if you follow a specific route. As you wander around a bit, get lost a few times, and familiarize yourself with some landmarks and where they are relative to each other, your brain develops a cognitive map of the space. You learn how things are laid out, and navigating gets easier.

It takes a lot to generate a useful mental map. “You have to understand the structure of relationships in the world,” says McGovern Investigator Mehrdad Jazayeri. “You need learning and experience to construct clever representations. The advantage is that when you have them, the world is an easier place to deal with.”

Indeed, Jazayeri says, internal models like these are the core of intelligent behavior.

Mehrdad Jazayeri (right) and graduate student Jack Gabel sit inside a rig designed to probe the brain’s ability to solve real-world problems with internal models. Photo: Steph Stevens

Many McGovern scientists see these cognitive maps as windows into their biggest questions about the brain: how it represents the external world, how it lets us learn and adapt, and how it forms and reconstructs memories. Researchers are learning that the cells and strategies the brain uses to understand the layout of a space also help it track other kinds of structure in the world — from variations in sound to sequences of events. By studying how neurons behave as animals navigate their environments, McGovern researchers expect to deepen their understanding of other important cognitive functions as well.

Decoding spatial maps

McGovern Investigator Ila Fiete builds theoretical models that help explain how spatial maps form in the brain. Previous research has shown that the firing patterns of place-sensitive neurons, “place cells” in the hippocampus and “grid cells” in the entorhinal cortex, help an animal map out a space. As an animal becomes familiar with its environment, subsets of these cells become tied to specific locations, firing only when the animal is in them.
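
To make the idea of a place-sensitive cell concrete, here is a minimal Python sketch, not a model from the Fiete lab: a simulated “place cell” fires at a rate that falls off with distance from its preferred location, so it is active only when the animal is near that spot. The field center, width, and peak rate are illustrative assumptions.

```python
# A toy "place cell" (illustrative assumptions, not a model from the Fiete lab):
# its firing rate is a Gaussian bump around a preferred location, so the cell is
# only active when the animal is near that spot in the arena.
import numpy as np

def place_cell_rate(position, field_center, field_width=0.2, peak_rate=20.0):
    """Firing rate (spikes/s) as a function of the animal's position."""
    d2 = np.sum((np.asarray(position) - np.asarray(field_center)) ** 2)
    return peak_rate * np.exp(-d2 / (2 * field_width ** 2))

cell_center = (0.5, 0.5)  # this cell's preferred spot in a 1 m x 1 m arena
for pos in [(0.5, 0.5), (0.6, 0.5), (0.9, 0.9)]:
    print(pos, f"{place_cell_rate(pos, cell_center):.1f} spikes/s")
```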

Microscopic image of the mouse hippocampus.
The brain’s ability to navigate the world is made possible by a circuit that includes the hippocampus (above), entorhinal cortex, and retrosplenial cortex. The firing patterns of “grid cells” and “place cells” in this circuit help form mental representations, or cognitive maps, of the external world. These brain regions are also among the first affected in people with Alzheimer’s, who often have trouble navigating. Image: Qian Chen, Guoping Feng

Fiete’s models have shown how these circuits can integrate information about movement, such as signals from the muscles and vestibular system that change as an animal moves around, to calculate and continually update an estimate of the animal’s position in space. Fiete suspects the cells that do this can use the same strategy to keep track of other kinds of movement or change.
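
That core computation, often called path integration, can be sketched in a few lines of Python. This is an illustrative toy rather than Fiete’s published model: a position estimate is updated by integrating noisy self-motion signals, and without corrections from landmarks the estimate slowly drifts from the true path. The trajectory, noise level, and time step are all assumptions.

```python
# A toy path-integration loop (illustrative, not Fiete's published model):
# the estimated position is updated by integrating noisy self-motion signals,
# and without landmark corrections the estimate slowly drifts from the truth.
import numpy as np

rng = np.random.default_rng(1)
dt, steps = 0.1, 200                 # time step (s) and number of steps, both assumptions

true_pos = np.zeros(2)
est_pos = np.zeros(2)

for t in range(steps):
    velocity = np.array([np.cos(0.05 * t), np.sin(0.05 * t)])   # the animal's true self-motion
    true_pos = true_pos + velocity * dt
    # The circuit only sees a noisy copy of the motion signal (muscles, vestibular system).
    sensed_velocity = velocity + rng.normal(scale=0.1, size=2)
    est_pos = est_pos + sensed_velocity * dt                     # integrate to update the estimate

print(f"position error after {steps} steps: {np.linalg.norm(est_pos - true_pos):.3f} (arbitrary units)")
```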

Mapping a space is about understanding where things are in relationship to one another, says Jazayeri, and tracking relationships is useful for modeling many kinds of structure in the world. For example, the hippocampus and entorhinal cortex are also closely linked to episodic memory, which keeps track of the connections between events and experiences.

“These brain areas are thought to be critical for learning relationships,” Jazayeri says.

Navigating virtual worlds

A key feature of cognitive maps is that they enable us to make predictions and respond to new situations without relying on immediate sensory cues. In a study published in Nature this June, Jazayeri and Fiete saw evidence of the brain’s ability to call up an internal model of an abstract domain: they watched neurons in the entorhinal cortex register a learned sequence of images even when the images were hidden from view.

Two scientists write equations on a glass wall with a marker.
Ila Fiete and postdoc Sarthak Chandra (right) develop theoretical models to study the brain. Photo: Steph Stevens

We can remember the layout of our home from far away or plan a walk through the neighborhood without stepping outside, so it may come as no surprise that the brain can call up its internal model in the absence of movement or sensory inputs. Indeed, previous research has shown that the circuits that encode physical space also encode abstract spaces, such as sequences of sounds. But those experiments were performed in the presence of the stimuli, and Jazayeri and his team wanted to know whether simply imagining movement through an abstract domain would evoke the same cognitive maps.

To test the entorhinal cortex’s ability to do this, Jazayeri and his team designed an experiment in which animals had to “mentally” navigate through a previously explored, but now invisible, sequence of images. Working with Fiete, they found that neurons that had become responsive to particular images in the visible sequence also fired as the animal mentally navigated the sequence with the images hidden from view, suggesting the animal was conjuring a representation of each image in its mind.

Colored dots in the shape of a ring.
Ila Fiete has shown that the brain generates a one-dimensional ring of neural activity that acts as a compass. Here, head direction is indicated by color. Image: Ila Fiete

“You see these neurons in the entorhinal cortex undergo very clear dynamic patterns that are in correspondence with what we think the animal might be thinking at the time,” Jazayeri says. “They are updating themselves without any change out there in the world.”

The team then incorporated their data into a computational model to explore how neural circuits might form a mental model of abstract sequences. Their artificial circuit showed that external inputs (e.g., image sequences) become associated with internal models through a simple associative learning rule in which neurons that fire together, wire together. This model suggests that imagined movement could update the internal representations, and that the learned association between these internal representations and external inputs might enable recall of the corresponding inputs even when they are absent.
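
A rough sense of that associative rule can be given in Python. The sketch below is not the authors’ published circuit model; it simply uses a Hebbian outer-product rule to bind internal state patterns to external input patterns, so that presenting only an internal state (as if imagining a step of the sequence) recalls the input it was paired with. All dimensions and patterns are illustrative assumptions.

```python
# A toy Hebbian association (illustrative assumptions, not the authors' circuit model):
# weights between co-active units are strengthened, binding each internal state to the
# external input it occurred with; presenting the internal state alone then recalls the input.
import numpy as np

rng = np.random.default_rng(2)
n_internal, n_input, n_items = 200, 30, 5

internal = rng.choice([-1.0, 1.0], size=(n_items, n_internal))  # internal state for each sequence step
inputs = rng.choice([-1.0, 1.0], size=(n_items, n_input))       # external input (e.g., an image) per step

# Hebbian rule: neurons that fire together, wire together.
W = np.zeros((n_input, n_internal))
for state, image in zip(internal, inputs):
    W += np.outer(image, state) / n_internal

# "Imagine" step 2 of the sequence: present only its internal state and read out
# the associated external input through the learned weights.
recalled = np.sign(W @ internal[2])
print("fraction of recalled bits that match the stored input:", np.mean(recalled == inputs[2]))
```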

More broadly, Fiete’s research on cognitive mapping in the hippocampus is leading to some interesting predictions: “One of the conclusions we’re coming to in my group is that when you reconstruct a memory, the area that’s driving that reconstruction is the entorhinal cortex and hippocampus but the reconstruction may happen in the sensory periphery, using the representations that played a role in experiencing that stimulus in the first place,” Fiete explains. “So when I reconstruct an image, I’m likely using my visual cortex to do that reconstruction, driven by the hippocampal complex.” Signals from the entorhinal cortex to the visual cortex during navigation could help an animal visualize landmarks and find its way, even when those landmarks are not visible in the external world.

Landmark coding

Near the entorhinal cortex is the retrosplenial cortex, another brain area that seems to be important for navigation. It is positioned to integrate visual signals with information about the body’s position and movement through space. Both the retrosplenial cortex and the entorhinal cortex are among the first areas impacted by Alzheimer’s disease; spatial disorientation and navigation difficulties may be consequences of their degeneration.

Researchers suspect the retrosplenial cortex may be key to letting an animal know not just where something is, but also how to get there. McGovern Investigator Mark Harnett explains that to generate a cognitive map that can be used to navigate, an animal must understand not just where objects or other cues are in relationship to itself, but also where they are in relationship to each other.

In a study reported in eLife in 2020, Harnett and colleagues may have glimpsed both of these kinds of spatial representations in the brain. They watched neurons in the retrosplenial cortex light up as mice ran on a treadmill and made their way through a virtual environment. As the mice became familiar with the landscape and learned where they were likely to find a reward, activity in the retrosplenial cortex changed.

A scientist looks at a computer monitor and adjusts a small wheel.
Lukas Fischer, a Harnett lab postdoc, operates a rig designed to study how mice navigate a virtual environment. Photo: Justin Knight

“What we found was this representation started off sort of crude and mostly about what the animal was doing. And then eventually it became more about the task, the landscape, and the reward,” Harnett says.

Harnett’s team has since begun investigating how the retrosplenial cortex enables more complex spatial reasoning. They designed an experiment in which mice must understand many spatial relationships to access a treat. The experimental setup requires mice to consider the location of reward ports, the center of their environment, and their own viewing angle. Most of the time, they succeed. “They have to really do some triangulation, and the retrosplenial cortex seems to be critical for that,” Harnett says.

When the team monitored neural activity during the task, they found evidence that when an animal wasn’t quite sure where to go, its brain held on to multiple spatial hypotheses at the same time, until new information ruled one out.

Fiete, who has worked with Harnett to explore how neural circuits can execute this kind of spatial reasoning, points out that Jazayeri’s team has observed similar reasoning in animals that must make decisions based on temporarily ambiguous auditory cues. “In both cases, animals are able to hold multiple hypotheses in mind and do the inference,” she says. “Mark’s found that the retrosplenial cortex contains all the signals necessary to do that reasoning.”
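
One simple way to picture “holding multiple hypotheses in mind” is as a probability distribution over candidates that is sharpened as evidence arrives. The Python sketch below is only an illustration of that idea, not the labs’ actual analyses; the candidate ports, likelihoods, and cue sequence are invented for the example.

```python
# A toy example of holding several hypotheses at once (invented numbers, not the labs' analyses):
# a belief over candidate goal locations is kept in parallel and sharpened by each ambiguous cue.
import numpy as np

hypotheses = ["port A", "port B", "port C"]
belief = np.array([1 / 3, 1 / 3, 1 / 3])      # start fully uncertain

cue_likelihoods = [                            # each cue is consistent with more than one hypothesis
    np.array([0.5, 0.4, 0.1]),
    np.array([0.5, 0.4, 0.1]),
    np.array([0.7, 0.2, 0.1]),                 # a later, more informative cue
]

for likelihood in cue_likelihoods:
    belief = belief * likelihood               # Bayes' rule: prior times likelihood,
    belief = belief / belief.sum()             # then renormalize
    print({h: round(float(p), 2) for h, p in zip(hypotheses, belief)})
```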

Beyond spatial reasoning

As his team learns more about how the brain creates and uses cognitive maps, Harnett hopes activity in the retrosplenial cortex will shed light on a fundamental aspect of the brain’s organization. The retrosplenial cortex doesn’t just receive information from the brain’s vision-processing center; it also sends signals back. He suspects these may direct the visual cortex to relay information that is particularly pertinent to forming or using a meaningful cognitive map.

This kind of connectivity, where parts of the brain that carry out complex cognitive processing send signals back to regions that handle simpler functions, is common in the brain. Figuring out why is a key pursuit in Harnett’s lab. “I want to use that as a model for thinking about the larger cortical computations, because you see this kind of motif repeated in a lot of ways, and it’s likely key for understanding how learning works,” he says.

Fiete is particularly interested in unpacking the common set of principles that allow cell circuits to generate maps of both our physical environment and our abstract experiences. What is it about this set of brain areas and circuits that, on the one hand, permits specific map-building computations, and, on the other hand, generalizes across physical space and abstract experience?

“The brain’s navigation system is a beautiful playground,” she says, “and an amazing system in which to investigate all of these questions.”