Researchers present bold ideas for AI at MIT Generative AI Impact Consortium kickoff event

Launched in February of this year, the MIT Generative AI Impact Consortium (MGAIC), a presidential initiative led by MIT’s Office of Innovation and Strategy and administered by the MIT Stephen A. Schwarzman College of Computing, issued a call for proposals, inviting researchers from across MIT to submit ideas for innovative projects studying high-impact uses of generative AI models.

The call received 180 submissions from nearly 250 faculty members, spanning all of MIT’s five schools and the college. The overwhelming response across the Institute exemplifies the growing interest in AI and follows in the wake of MIT’s Generative AI Week and call for impact papers. Fifty-five proposals were selected for MGAIC’s inaugural seed grants, with several more selected to be funded by the consortium’s founding company members.

Over 30 funding recipients presented their proposals to the greater MIT community at a kickoff event on May 13. Anantha P. Chandrakasan, chief innovation and strategy officer, dean of the School of Engineering, and head of the consortium, welcomed the attendees and thanked the consortium’s founding industry members.

“The amazing response to our call for proposals is an incredible testament to the energy and creativity that MGAIC has sparked at MIT. We are especially grateful to our founding members, whose support and vision helped bring this endeavor to life,” adds Chandrakasan. “One of the things that has been most remarkable about MGAIC is that this is a truly cross-Institute initiative. Deans from all five schools and the college collaborated in shaping and implementing it.”

Vivek F. Farias, the Patrick J. McGovern (1959) Professor at the MIT Sloan School of Management and co-faculty director of the consortium with Tim Kraska, associate professor of electrical engineering and computer science in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), emceed the afternoon of five-minute lightning presentations.

Presentation highlights include:

“AI-Driven Tutors and Open Datasets for Early Literacy Education,” presented by Ola Ozernov-Palchik, a research scientist at the McGovern Institute for Brain Research, proposes refining AI tutors for pK-7 students to help reduce literacy disparities.

“Developing jam_bots: Real-Time Collaborative Agents for Live Human-AI Musical Improvisation,” presented by Anna Huang, assistant professor of music and assistant professor of electrical engineering and computer science, and Joe Paradiso, the Alexander W. Dreyfoos (1954) Professor in Media Arts and Sciences at the MIT Media Lab, aims to enhance human-AI musical collaboration in real-time for live concert improvisation.

“GENIUS: GENerative Intelligence for Urban Sustainability,” presented by Norhan Bayomi, a postdoc at the MIT Environmental Solutions Initiative and a research assistant in the Urban Metabolism Group, aims to address the lack of a standardized approach for evaluating and benchmarking cities’ climate policies.

Georgia Perakis, the John C Head III Dean (Interim) of the MIT Sloan School of Management and professor of operations management, operations research, and statistics, who serves as co-chair of the GenAI Dean’s oversight group with Dan Huttenlocher, dean of the MIT Schwarzman College of Computing, ended the event with closing remarks that emphasized “the readiness and eagerness of our community to lead in this space.”

“This is only the beginning,” she continued. “We are at the front edge of a historic moment — one where MIT has the opportunity, and the responsibility, to shape the future of generative AI with purpose, with excellence, and with care.”

How the brain solves complicated problems

The human brain is very good at solving complicated problems. One reason for that is that humans can break problems apart into manageable subtasks that are easy to solve one at a time.

This allows us to complete a daily task like going out for coffee by breaking it into steps: getting out of our office building, navigating to the coffee shop, and once there, obtaining the coffee. This strategy helps us to handle obstacles easily. For example, if the elevator is broken, we can revise how we get out of the building without changing the other steps.

While there is a great deal of behavioral evidence demonstrating humans’ skill at these complicated tasks, it has been difficult to devise experimental scenarios that allow precise characterization of the computational strategies we use to solve problems.

In a new study, MIT researchers have successfully modeled how people deploy different decision-making strategies to solve a complicated task — in this case, predicting how a ball will travel through a maze when the ball is hidden from view. The human brain cannot perform this task perfectly because it is impossible to track all of the possible trajectories in parallel, but the researchers found that people can perform reasonably well by flexibly adopting two strategies known as hierarchical reasoning and counterfactual reasoning.

The researchers were also able to determine the circumstances under which people choose each of those strategies.

“What humans are capable of doing is to break down the maze into subsections, and then solve each step using relatively simple algorithms. Effectively, when we don’t have the means to solve a complex problem, we manage by using simpler heuristics that get the job done,” says Mehrdad Jazayeri, a professor of brain and cognitive sciences, a member of MIT’s McGovern Institute for Brain Research, an investigator at the Howard Hughes Medical Institute, and the senior author of the study.

Mahdi Ramadan PhD ’24 and graduate student Cheng Tang are the lead authors of the paper, which appears today in Nature Human Behaviour. Nicholas Watters PhD ’25 is also a co-author.

Rational strategies

When humans perform simple tasks that have a clear correct answer, such as categorizing objects, they perform extremely well. When tasks become more complex, such as planning a trip to your favorite cafe, there may no longer be one clearly superior answer. And, at each step, there are many things that could go wrong. In these cases, humans are very good at working out a solution that will get the task done, even though it may not be the optimal solution.

Those solutions often involve problem-solving shortcuts, or heuristics. Two prominent heuristics humans commonly rely on are hierarchical and counterfactual reasoning. Hierarchical reasoning is the process of breaking down a problem into layers, starting from the general and proceeding toward specifics. Counterfactual reasoning involves imagining what would have happened if you had made a different choice. While these strategies are well-known, scientists don’t know much about how the brain decides which one to use in a given situation.

“This is really a big question in cognitive science: How do we problem-solve in a suboptimal way, by coming up with clever heuristics that we chain together in a way that ends up getting us closer and closer until we solve the problem?” Jazayeri says.

To address this difficulty, Jazayeri and his colleagues devised a task that is just complex enough to require these strategies, yet simple enough that the outcomes and the calculations that go into them can be measured.

The task requires participants to predict the path of a ball as it moves through a maze with four possible trajectories. Once the ball enters the maze, people cannot see which path it travels. At two junctions in the maze, they hear an auditory cue when the ball reaches that point. Predicting the ball’s path is a task that is impossible for humans to solve with perfect accuracy.

“It requires four parallel simulations in your mind, and no human can do that. It’s analogous to having four conversations at a time,” Jazayeri says. “The task allows us to tap into this set of algorithms that the humans use, because you just can’t solve it optimally.”

The researchers recruited about 150 human volunteers to participate in the study. Before each subject began the ball-tracking task, the researchers evaluated how accurately they could estimate timespans of several hundred milliseconds, about the length of time it takes the ball to travel along one arm of the maze.

For each participant, the researchers created computational models that could predict the patterns of errors that would be seen for that participant (based on their timing skill) if they were running parallel simulations, using hierarchical reasoning alone, counterfactual reasoning alone, or combinations of the two reasoning strategies.

The researchers compared the subjects’ performance with the models’ predictions and found that for every subject, their performance was most closely associated with a model that used hierarchical reasoning but sometimes switched to counterfactual reasoning.
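
The logic of that comparison can be sketched in a few lines of Python. This is a generic illustration rather than the authors' code: the per-trial error probabilities assigned to each candidate strategy are placeholders standing in for the predictions derived from each participant's measured timing precision.

```python
# Generic per-subject strategy-model comparison (illustrative only, not the study's code):
# each candidate model predicts a probability of error on every trial, and the model with
# the highest log-likelihood of the observed responses is selected for that subject.
import numpy as np

def log_likelihood(observed_errors, predicted_p):
    """Bernoulli log-likelihood of 0/1 errors under a model's predicted error probabilities."""
    p = np.clip(predicted_p, 1e-6, 1 - 1e-6)
    return np.sum(observed_errors * np.log(p) + (1 - observed_errors) * np.log(1 - p))

# Hypothetical per-trial error probabilities for each candidate strategy (placeholders
# standing in for predictions generated from a participant's timing noise).
candidate_models = {
    "parallel":       np.full(200, 0.05),
    "hierarchical":   np.full(200, 0.20),
    "counterfactual": np.full(200, 0.15),
    "hybrid":         np.full(200, 0.12),
}

rng = np.random.default_rng(0)
observed = rng.binomial(1, 0.13, size=200)   # stand-in for one subject's 0/1 error record

scores = {name: log_likelihood(observed, p) for name, p in candidate_models.items()}
best = max(scores, key=scores.get)
print(f"best-fitting strategy for this subject: {best}")
```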

That suggests that instead of tracking all the possible paths that the ball could take, people broke up the task. First, they picked the direction (left or right) in which they thought the ball turned at the first junction, and continued to track the ball as it headed for the next turn. If the timing of the next sound they heard wasn’t compatible with the path they had chosen, they would go back and revise their first prediction — but only some of the time.

Switching back to the other side, which represents a shift to counterfactual reasoning, requires people to review their memory of the tones that they heard. However, it turns out that these memories are not always reliable, and the researchers found that people decided whether to go back or not based on how good they believed their memory to be.

“People rely on counterfactuals to the degree that it’s helpful,” Jazayeri says. “People who take a big performance loss when they do counterfactuals avoid doing them. But if you are someone who’s really good at retrieving information from the recent past, you may go back to the other side.”

Human limitations

To further validate their results, the researchers created a machine-learning neural network and trained it to complete the task. A machine-learning model trained on this task will track the ball’s path accurately and make the correct prediction every time, unless the researchers impose limitations on its performance.

When the researchers added cognitive limitations similar to those faced by humans, they found that the model altered its strategies. When they eliminated the model’s ability to follow all possible trajectories, it began to employ hierarchical and counterfactual strategies like humans do. If the researchers reduced the model’s memory recall ability, it switched to counterfactual reasoning only when it judged that its recall would be good enough to get the right answer — just as humans do.

“What we found is that networks mimic human behavior when we impose on them those computational constraints that we found in human behavior,” Jazayeri says. “This is really saying that humans are acting rationally under the constraints that they have to function under.”

By slightly varying the amount of memory impairment programmed into the models, the researchers also saw hints that the switching of strategies appears to happen gradually, rather than at a distinct cut-off point. They are now performing further studies to try to determine what is happening in the brain as these shifts in strategy occur.

The research was funded by a Lisa K. Yang ICoN Fellowship, a Friends of the McGovern Institute Student Fellowship, a National Science Foundation Graduate Research Fellowship, the Simons Foundation, the Howard Hughes Medical Institute, and the McGovern Institute.

A visual pathway in the brain may do more than recognize objects

When visual information enters the brain, it travels through two pathways that process different aspects of the input. For decades, scientists have hypothesized that one of these pathways, the ventral visual stream, is responsible for recognizing objects, and that it might have been optimized by evolution to do just that.

Consistent with this, in the past decade, MIT scientists have found that when computational models of the anatomy of the ventral stream are optimized to solve the task of object recognition, they are remarkably good predictors of the neural activities in the ventral stream.

However, in a new study, MIT researchers have shown that when they train these types of models on spatial tasks instead, the resulting models are also quite good predictors of the ventral stream’s neural activities. This suggests that the ventral stream may not be exclusively optimized for object recognition.

“This leaves wide open the question about what the ventral stream is being optimized for. I think the dominant perspective a lot of people in our field believe is that the ventral stream is optimized for object recognition, but this study provides a new perspective that the ventral stream could be optimized for spatial tasks as well,” says MIT graduate student Yudi Xie.

Xie is the lead author of the study, which will be presented at the International Conference on Learning Representations. Other authors of the paper include Weichen Huang, a visiting student through MIT’s Research Science Institute program; Esther Alter, a software engineer at the MIT Quest for Intelligence; Jeremy Schwartz, a sponsored research technical staff member; Joshua Tenenbaum, a professor of brain and cognitive sciences; and James DiCarlo, the Peter de Florez Professor of Brain and Cognitive Sciences, director of the Quest for Intelligence, and a member of the McGovern Institute for Brain Research at MIT.

Beyond object recognition

When we look at an object, our visual system can not only identify the object, but also determine other features such as its location, its distance from us, and its orientation in space. Since the early 1980s, neuroscientists have hypothesized that the primate visual system is divided into two pathways: the ventral stream, which performs object-recognition tasks, and the dorsal stream, which processes features related to spatial location.

Over the past decade, researchers have worked to model the ventral stream using a type of deep-learning model known as a convolutional neural network (CNN). Researchers can train these models to perform object-recognition tasks by feeding them datasets containing thousands of images along with category labels describing the images.
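
As a rough illustration of that training setup (not the specific models used in these studies), a minimal convolutional network and training loop in PyTorch might look like the sketch below; the random tensors stand in for a real labeled image dataset.

```python
# Minimal sketch of CNN training for object categorization (illustrative only; the
# models discussed here are much larger and are trained on real labeled image datasets).
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):
        h = self.features(x)                   # internal activations like these are what
        return self.classifier(h.flatten(1))   # researchers compare to ventral-stream data

model = TinyCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder batch: 64 RGB images (64x64 pixels) with category labels.
images = torch.randn(64, 3, 64, 64)
labels = torch.randint(0, 10, (64,))

for step in range(5):                          # a real run would loop over a full dataset
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```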

The state-of-the-art versions of these CNNs have high success rates at categorizing images. Additionally, researchers have found that the internal activations of the models are very similar to the activities of neurons that process visual information in the ventral stream. Furthermore, the more similar these models are to the ventral stream, the better they perform at object-recognition tasks. This has led many researchers to hypothesize that the dominant function of the ventral stream is recognizing objects.

However, experimental studies, especially a study from the DiCarlo lab in 2016, have found that the ventral stream appears to encode spatial features as well. These features include the object’s size, its orientation (how much it is rotated), and its location within the field of view. Based on these studies, the MIT team aimed to investigate whether the ventral stream might serve additional functions beyond object recognition.

“Our central question in this project was, is it possible that we can think about the ventral stream as being optimized for doing these spatial tasks instead of just categorization tasks?” Xie says.

To test this hypothesis, the researchers set out to train a CNN to identify one or more spatial features of an object, including rotation, location, and distance. To train the models, they created a new dataset of synthetic images. These images show objects such as tea kettles or calculators superimposed on different backgrounds, in locations and orientations that are labeled to help the model learn them.

The researchers found that CNNs that were trained on just one of these spatial tasks showed a high level of “neuro-alignment” with the ventral stream — very similar to the levels seen in CNN models trained on object recognition.

The researchers measure neuro-alignment using a technique that DiCarlo’s lab has developed, which involves asking the models, once trained, to predict the neural activity that a particular image would generate in the brain. The researchers found that the better the models performed on the spatial task they had been trained on, the more neuro-alignment they showed.
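
A common generic recipe for this kind of predictivity score, offered here as a stand-in rather than the lab's exact pipeline, is to fit a regularized linear mapping from a model layer's activations to recorded responses and then ask how well that mapping predicts responses to held-out images:

```python
# Illustrative neural-predictivity score (a generic stand-in, not the lab's exact metric):
# fit a ridge regression from model-layer activations to recorded neural responses, then
# score the correlation between predicted and actual responses on held-out images.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_units, n_neurons = 500, 256, 100
activations = rng.standard_normal((n_images, n_units))      # model features per image
responses = activations @ rng.standard_normal((n_units, n_neurons)) * 0.1 \
            + rng.standard_normal((n_images, n_neurons))     # synthetic "neural" data

X_tr, X_te, Y_tr, Y_te = train_test_split(activations, responses,
                                          test_size=0.2, random_state=0)
mapping = Ridge(alpha=1.0).fit(X_tr, Y_tr)
pred = mapping.predict(X_te)

# Alignment score: mean Pearson correlation across neurons on held-out images.
corrs = [np.corrcoef(pred[:, i], Y_te[:, i])[0, 1] for i in range(n_neurons)]
print(f"mean held-out predictivity: {np.mean(corrs):.3f}")
```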

“I think we cannot assume that the ventral stream is just doing object categorization, because many of these other functions, such as spatial tasks, also can lead to this strong correlation between models’ neuro-alignment and their performance,” Xie says. “Our conclusion is that you can optimize either through categorization or doing these spatial tasks, and they both give you a ventral-stream-like model, based on our current metrics to evaluate neuro-alignment.”

Comparing models

The researchers then investigated why these two approaches — training for object recognition and training for spatial features — led to similar degrees of neuro-alignment. To do that, they performed an analysis known as centered kernel alignment (CKA), which allows them to measure the degree of similarity between representations in different CNNs. This analysis showed that in the early to middle layers of the models, the representations that the models learn are nearly indistinguishable.
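
The linear form of CKA is compact enough to write out directly; in the sketch below, random matrices stand in for two models' activations on the same set of images:

```python
# Minimal linear CKA between two representations X and Y (rows = images, cols = features).
import numpy as np

def linear_cka(X, Y):
    """Centered kernel alignment with linear kernels; 1.0 means identical up to rotation/scale."""
    X = X - X.mean(axis=0)                     # center each feature dimension
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
layer_a = rng.standard_normal((1000, 512))            # activations from one model
Q, _ = np.linalg.qr(rng.standard_normal((512, 512)))  # random orthogonal rotation
layer_b = layer_a @ Q                                  # rotated copy: CKA is rotation-invariant,
print(f"CKA(a, b) = {linear_cka(layer_a, layer_b):.3f}")  # so this should print ~1.000
```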

“In these early layers, essentially you cannot tell these models apart by just looking at their representations,” Xie says. “It seems like they learn some very similar or unified representation in the early to middle layers, and in the later stages they diverge to support different tasks.”

The researchers hypothesize that even when models are trained to analyze just one feature, they also take into account “non-target” features — those that they are not trained on. When objects have greater variability in non-target features, the models tend to learn representations more similar to those learned by models trained on other tasks. This suggests that the models are using all of the information available to them, which may result in different models coming up with similar representations, the researchers say.

“More non-target variability actually helps the model learn a better representation, instead of learning a representation that’s ignorant of them,” Xie says. “It’s possible that the models, although they’re trained on one target, are simultaneously learning other things due to the variability of these non-target features.”

In future work, the researchers hope to develop new ways to compare different models, in hopes of learning more about how each one develops internal representations of objects based on differences in training tasks and training data.

“There could be still slight differences between these models, even though our current way of measuring how similar these models are to the brain tells us they’re on a very similar level. That suggests maybe there’s still some work to be done to improve upon how we can compare the model to the brain, so that we can better understand what exactly the ventral stream is optimized for,” Xie says.

The research was funded by the Semiconductor Research Corporation and the U.S. Defense Advanced Research Projects Agency.

Looking under the hood at the brain’s language system

As a young girl growing up in the former Soviet Union, Evelina Fedorenko PhD ’07 studied several languages, including English, as her mother hoped that it would give her the chance to eventually move abroad for better opportunities.

Her language studies not only helped her establish a new life in the United States as an adult, but also led to a lifelong interest in linguistics and how the brain processes language. Now an associate professor of brain and cognitive sciences at MIT, Fedorenko studies the brain’s language-processing regions: how they arise, whether they are shared with other mental functions, and how each region contributes to language comprehension and production.

Fedorenko’s early work helped to identify the precise locations of the brain’s language-processing regions, and she has been building on that work to generate insight into how different neuronal populations in those regions implement linguistic computations.

“It took a while to develop the approach and figure out how to quickly and reliably find these regions in individual brains, given this standard problem of the brain being a little different across people,” she says. “Then we just kept going, asking questions like: Does language overlap with other functions that are similar to it? How is the system organized internally? Do different parts of this network do different things? There are dozens and dozens of questions you can ask, and many directions that we have pushed on.”

Among some of the more recent directions, she is exploring how the brain’s language-processing regions develop early in life, through studies of very young children, people with unusual brain architecture, and computational models known as large language models.

From Russia to MIT

Fedorenko grew up in the Russian city of Volgograd, which was then part of the Soviet Union. When the Soviet Union broke up in 1991, her mother, a mechanical engineer, lost her job, and the family struggled to make ends meet.

“It was a really intense and painful time,” Fedorenko recalls. “But one thing that was always very stable for me is that I always had a lot of love, from my parents, my grandparents, and my aunt and uncle. That was really important and gave me the confidence that if I worked hard and had a goal, that I could achieve whatever I dreamed about.”

Fedorenko did work hard in school, studying English, French, German, Polish, and Spanish, and she also participated in math competitions. As a 15-year-old, she spent a year attending high school in Alabama, as part of a program that placed students from the former Soviet Union with American families. She had been thinking about applying to universities in Europe but changed her plans when she realized the American higher education system offered more academic flexibility.

After being admitted to Harvard University with a full scholarship, she returned to the United States in 1998 and earned her bachelor’s degree in psychology and linguistics, while also working multiple jobs to send money home to help her family.

While at Harvard, she also took classes at MIT and ended up deciding to apply to the Institute for graduate school. For her PhD research at MIT, she worked with Ted Gibson, a professor of brain and cognitive sciences, and later, Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience. She began by using functional magnetic resonance imaging (fMRI) to study brain regions that appeared to respond preferentially to music, but she soon switched to studying brain responses to language.

She found that working with Kanwisher, who studies the functional organization of the human brain but hadn’t worked much on language before, helped her to build a research program free of potential biases baked into some of the early work on language processing in the brain.

“We really kind of started from scratch,” Fedorenko says, “combining the knowledge of language processing I have gained by working with Gibson and the rigorous neuroscience approaches that Kanwisher had developed when studying the visual system.”

After finishing her PhD in 2007, Fedorenko stayed at MIT for a few years as a postdoc funded by the National Institutes of Health, continuing her research with Kanwisher. During that time, she and Kanwisher developed techniques to identify language-processing regions in different people, and discovered new evidence that certain parts of the brain respond selectively to language. Fedorenko then spent five years as a research faculty member at Massachusetts General Hospital, before receiving an offer to join the faculty at MIT in 2019.

How the brain processes language

Since starting her lab at MIT’s McGovern Institute for Brain Research, Fedorenko and her trainees have made several discoveries that have helped to refine neuroscientists’ understanding of the brain’s language-processing regions, which are spread across the left frontal and temporal lobes of the brain.

In a series of studies, her lab showed that these regions are highly selective for language and are not engaged by activities such as listening to music, reading computer code, or interpreting facial expressions, all of which have been argued to share similarities with language processing.

“We’ve separated the language-processing machinery from various other systems, including the system for general fluid thinking, and the systems for social perception and reasoning, which support the processing of communicative signals, like facial expressions and gestures, and reasoning about others’ beliefs and desires,” Fedorenko says. “So that was a significant finding, that this system really is its own thing.”

More recently, Fedorenko has turned her attention to figuring out, in more detail, the functions of different parts of the language processing network. In one recent study, she identified distinct neuronal populations within these regions that appear to have different temporal windows for processing linguistic content, ranging from just one word up to six words.

She is also studying how language-processing circuits arise in the brain, with ongoing studies in which she and a postdoc in her lab are using fMRI to scan the brains of young children, observing how their language regions behave even before the children have fully learned to speak and understand language.

Large language models (similar to ChatGPT) can help with these types of developmental questions, as the researchers can better control the language inputs to the model and have continuous access to its abilities and representations at different stages of learning.

“You can train models in different ways, on different kinds of language, in different kinds of regimens. For example, training on simpler language first and then more complex language, or on language combined with some visual inputs. Then you can look at the performance of these language models on different tasks, and also examine changes in their internal representations across the training trajectory, to test which model best captures the trajectory of human language learning,” Fedorenko says.

To gain another window into how the brain develops language ability, Fedorenko launched the Interesting Brains Project several years ago. Through this project, she is studying people who experienced some type of brain damage early in life, such as a prenatal stroke, or brain deformation as a result of a congenital cyst. In some of these individuals, their conditions destroyed or significantly deformed the brain’s typical language-processing areas, but all of these individuals are cognitively indistinguishable from individuals with typical brains: They still learned to speak and understand language normally, and in some cases, they didn’t even realize that their brains were in some way atypical until they were adults.

“That study is all about plasticity and redundancy in the brain, trying to figure out what brains can cope with, and how,” Fedorenko says. “Are there many solutions to build a human mind, even when the neural infrastructure is so different-looking?”

To the brain, Esperanto and Klingon appear the same as English or Mandarin

Within the human brain, a network of regions has evolved to process language. These regions are consistently activated whenever people listen to their native language or any language in which they are proficient.

A new study by MIT researchers finds that this network also responds to languages that are completely invented, such as Esperanto, which was created in the late 1800s as a way to promote international communication, and even to languages made up for television shows such as “Star Trek” and “Game of Thrones.”

To study how the brain responds to these artificial languages, MIT neuroscientists convened nearly 50 speakers of these languages over a single weekend. Using functional magnetic resonance imaging (fMRI), the researchers found that when participants listened to a constructed language in which they were proficient, the same brain regions lit up as those activated when they processed their native language.

“We find that constructed languages very much recruit the same system as natural languages, which suggests that the key feature that is necessary to engage the system may have to do with the kinds of meanings that both kinds of languages can express,” says Evelina Fedorenko, an associate professor of neuroscience at MIT, a member of MIT’s McGovern Institute for Brain Research and the senior author of the study.

The findings help to define some of the key properties of language, the researchers say, and suggest that it’s not necessary for languages to have naturally evolved over a long period of time or to have a large number of speakers.

“It helps us narrow down this question of what a language is, and do it empirically, by testing how our brain responds to stimuli that might or might not be language-like,” says Saima Malik-Moraleda, an MIT postdoc and the lead author of the paper, which appears this week in the Proceedings of the National Academy of Sciences.

Convening the conlang community

Unlike natural languages, which evolve within communities and are shaped over time, constructed languages, or “conlangs,” are typically created by one person who decides what sounds will be used, how to label different concepts, and what the grammatical rules are.

Esperanto, the most widely spoken conlang, was created in 1887 by L.L. Zamenhof, who intended it to be used as a universal language for international communication. Currently, it is estimated that around 60,000 people worldwide are proficient in Esperanto.

In previous work, Fedorenko and her students have found that computer programming languages, such as Python — another type of invented language — do not activate the brain network that is used to process natural language. Instead, people who read computer code rely on the so-called multiple demand network, a brain system that is often recruited for difficult cognitive tasks.

Fedorenko and others have also investigated how the brain responds to other stimuli that share features with language, including music and nonverbal communication such as gestures and facial expressions.

“We spent a lot of time looking at all these various kinds of stimuli, finding again and again that none of them engage the language-processing mechanisms,” Fedorenko says. “So then the question becomes, what is it that natural languages have that none of those other systems do?”

That led the researchers to wonder if artificial languages like Esperanto would be processed more like programming languages or more like natural languages. Similar to programming languages, constructed languages are created by an individual for a specific purpose, without natural evolution within a community. However, unlike programming languages, both conlangs and natural languages can be used to convey meanings about the state of the external world or the speaker’s internal state.

To explore how the brain processes conlangs, the researchers invited speakers of Esperanto and several other constructed languages to MIT for a weekend conference in November 2022. The other languages included Klingon (from “Star Trek”), Na’vi (from “Avatar”), and two languages from “Game of Thrones” (High Valyrian and Dothraki). For all of these languages, there are texts available for people who want to learn the language, and for Esperanto, Klingon, and High Valyrian, there are even Duolingo courses available.

“It was a really fun event where all the communities came to participate, and over a weekend, we collected all the data,” says Malik-Moraleda, who co-led the data collection effort with former MIT postbac Maya Taliaferro, now a PhD student at New York University.

During that event, which also featured talks from several of the conlang creators, the researchers used fMRI to scan 44 conlang speakers as they listened to sentences from the constructed language in which they were proficient. The creators of these languages — who are co-authors on the paper — helped construct the sentences that were presented to the participants.

While in the scanner, the participants also either listened to or read sentences in their native language, and performed some nonlinguistic tasks for comparison. The researchers found that when people listened to a conlang, the same language regions in the brain were activated as when they listened to their native language.

Common features

The findings help to identify some of the key features that are necessary to recruit the brain’s language processing areas, the researchers say. One of the main characteristics driving language responses seems to be the ability to convey meanings about the interior and exterior world — a trait that is shared by natural and constructed languages, but not programming languages.

“All of the languages, both natural and constructed, express meanings related to inner and outer worlds. They refer to objects in the world, to properties of objects, to events,” Fedorenko says. “Whereas programming languages are much more similar to math. A programming language is a symbolic generative system that allows you to express complex meanings, but it’s a self-contained system: The meanings are highly abstract and mostly relational, and not connected to the real world that we experience.”

Some other characteristics of natural languages, which are not shared by constructed languages, don’t seem to be necessary to generate a response in the language network.

“It doesn’t matter whether the language is created and shaped over time by a community of speakers, because these constructed languages are not,” Malik-Moraleda says. “It doesn’t matter how old they are, because conlangs that are just a decade old engage the same brain regions as natural languages that have been around for many hundreds of years.”

To further refine the features of language that activate the brain’s language network, Fedorenko’s lab is now planning to study how the brain responds to a conlang called Lojban, which was created by the Logical Language Group in the 1990s and was designed to prevent ambiguity of meanings and promote more efficient communication.

The research was funded by MIT’s McGovern Institute for Brain Research, Brain and Cognitive Sciences Department, the Simons Center for the Social Brain, the Frederick A. and Carole J. Middleton Career Development Professorship, and the U.S. National Institutes of Health.

How nature organizes itself, from brain cells to ecosystems

McGovern Associate Investigator Ila Fiete. Photo: Caitlin Cunningham

Look around, and you’ll see it everywhere: the way trees form branches, the way cities divide into neighborhoods, the way the brain organizes into regions. Nature loves modularity—a limited number of self-contained units that combine in different ways to perform many functions. But how does this organization arise? Does it follow a detailed genetic blueprint, or can these structures emerge on their own?

A new study from McGovern Associate Investigator Ila Fiete suggests a surprising answer.

In findings published today in Nature, Fiete, a professor of brain and cognitive sciences and director of the K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center at MIT, reports that a mathematical model called peak selection can explain how modules emerge without strict genetic instructions. Her team’s findings, which apply to brain systems and ecosystems, help explain how modularity occurs across nature, no matter the scale.

Joining two big ideas

“Scientists have debated how modular structures form. One hypothesis suggests that various genes are turned on at different locations to begin or end a structure. This explains how insect embryos develop body segments, with genes turning on or off at specific concentrations of a smooth chemical gradient in the insect egg,” says Fiete, who is the senior author of the paper. Mikail Khona, a former graduate student and K. Lisa Yang ICoN Center Graduate Fellow, and postdoctoral associate Sarthak Chandra also led the study.

Another idea, inspired by mathematician Alan Turing, suggests that a structure could emerge from competition—small-scale interactions can create repeating patterns, like the spots on a cheetah or the ripples in sand dunes.

Both ideas work well in some cases, but fail in others. The new research suggests that nature need not pick one approach over the other. The authors propose a simple mathematical principle called peak selection, showing that when a smooth gradient is paired with local interactions that are competitive, modular structures emerge naturally. “In this way, biological systems can organize themselves into sharp modules without detailed top-down instruction,” says Chandra.
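
The flavor of that idea can be conveyed with a toy one-dimensional simulation. The sketch below is a deliberately simplified lateral-inhibition model, not the paper's peak-selection model: units along a line receive a weak, smoothly graded drive, and short-range excitation paired with broader inhibition carves the activity into discrete peaks with sharp boundaries.

```python
# Toy 1-D lateral-inhibition sketch (not the paper's peak-selection model): a smooth
# input gradient plus local excitation and surround inhibition yields discrete activity
# peaks, so units fall into sharply bounded groups despite the graded drive.
import numpy as np

n = 200
x = np.arange(n)
gradient = 0.05 * (1.0 + x / n)                     # smooth, slowly increasing drive

# Interaction kernel: narrow excitation minus broader surround inhibition.
dist = np.abs(x[:, None] - x[None, :])
kernel = 3.0 * (np.exp(-(dist / 5.0) ** 2) - 0.7 * np.exp(-(dist / 15.0) ** 2))

rng = np.random.default_rng(0)
rate = rng.uniform(0.0, 0.01, n)                    # small random initial activity
for _ in range(2000):                               # simple bounded rate dynamics
    drive = kernel @ rate + gradient
    rate += 0.02 * (-rate + np.clip(drive, 0.0, 1.0))

active = rate > 0.5                                 # units that ended up inside a peak
n_peaks = int(np.sum(active[1:] & ~active[:-1]) + active[0])
print(f"{n_peaks} discrete activity peaks emerged despite the smooth input gradient")
```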

Modular systems in the brain

The researchers tested their idea on grid cells, which play a critical role in spatial navigation as well as the storage of episodic memories. Grid cells fire in a repeating triangular pattern as animals move through space, but they don’t all work at the same scale—they are organized into distinct modules, each responsible for mapping space at slightly different resolutions.

A visual depiction of two different modules in grid cells, used to map space at slightly different resolutions. Image: Fiete Lab

No one knows how these modules form, but Fiete’s model shows that gradual variations in cellular properties along one dimension in the brain, combined with local neural interactions, could explain the entire structure. The grid cells naturally sort themselves into distinct groups with clear boundaries, without external maps or genetic programs telling them where to go. “Our work explains how grid cell modules could emerge. The explanation tips the balance toward the possibility of self-organization. It predicts that there might be no gene or intrinsic cell property that jumps when the grid cell scale jumps to another module,” notes Khona.

Modular systems in nature

The same principle applies beyond neuroscience. Imagine a landscape where temperatures and rainfall vary gradually over a space. You might expect species to be spread out and to vary smoothly across this region. But in reality, ecosystems often form species clusters with sharp boundaries—distinct ecological “neighborhoods” that don’t overlap.

Fiete’s study suggests why: Local competition, cooperation, and predation between species interact with the global environmental gradients to create natural separations, even when the underlying conditions change gradually. This phenomenon can be explained using peak selection—and suggests that the same principle that shapes brain circuits could also be at play in forests and oceans.

A self-organizing world

One of the researchers’ most striking findings is that modularity in these systems is remarkably robust. Change the size of the system, and the number of modules stays the same; the modules simply scale up or down. That means a mouse brain and a human brain could use the same fundamental rules to form their navigation circuits, just at different sizes.

The model also makes testable predictions. If it’s correct, grid cell modules should follow simple spacing ratios. In ecosystems, species distributions should form distinct clusters even without sharp environmental shifts.

Fiete notes that their work adds another conceptual framework to biology. “Peak selection can inform future experiments, not only in grid cell research but across developmental biology.”

Evelina Fedorenko receives Troland Award from National Academy of Sciences

The National Academy of Sciences (NAS) announced today that McGovern Investigator Evelina Fedorenko will receive a 2025 Troland Research Award for her groundbreaking contributions towards understanding the language network in the human brain.

The Troland Research Award is given annually to recognize unusual achievement by early-career researchers within the broad spectrum of experimental psychology.

McGovern Investigator Ev Fedorenko (center) looks at a young subject’s brain scan in the Martinos Imaging Center at MIT. Photo: Alexandra Sokhina

Fedorenko, who is an associate professor of brain and cognitive sciences at MIT, is interested in how minds and brains create language. Her lab is unpacking the internal architecture of the brain’s language system and exploring the relationship between language and various cognitive, perceptual, and motor systems. Her novel methods combine precise measures of an individual’s brain organization with innovative computational modeling to make fundamental discoveries about the computations that underlie the uniquely human ability for language.

Fedorenko has shown that the language network is selective for language processing over diverse non-linguistic processes that have been argued to share computational demands with language, such as math, music, and social reasoning. Her work has also demonstrated that syntactic processing is not localized to a particular region within the language network, and every brain region that responds to syntactic processing is at least as sensitive to word meanings.

She has also shown that representations from neural network language models, such as ChatGPT, are similar to those in the human language brain areas. Fedorenko also highlighted that although language models can master linguistic rules and patterns, they are less effective at using language in real-world situations. In the human brain, that kind of functional competence is distinct from formal language competence, she says, requiring not just language-processing circuits but also brain areas that store knowledge of the world, reason, and interpret social interactions. Contrary to a prominent view that language is essential for thinking, Fedorenko argues that language is not the medium of thought and is primarily a tool for communication.

A probabilistic atlas of the human language network based on >800 individuals (center) and sample individual language networks, which illustrate inter-individual variability in the precise locations and shapes of the language areas. Image: Ev Fedorenko

Ultimately, Fedorenko’s cutting-edge work is uncovering the computations and representations that fuel language processing in the brain. She will receive the Troland Award this April, during the annual meeting of the NAS in Washington DC.

How one brain circuit encodes memories of both places and events

Nearly 50 years ago, neuroscientists discovered cells within the brain’s hippocampus that store memories of specific locations. These cells also play an important role in storing memories of events, known as episodic memories. While the mechanism of how place cells encode spatial memory has been well-characterized, it has remained a puzzle how they encode episodic memories.

A new model developed by MIT researchers explains how those place cells can be recruited to form episodic memories, even when there’s no spatial component. According to this model, place cells, along with grid cells found in the entorhinal cortex, act as a scaffold that can be used to anchor memories as a linked series.

“This model is a first-draft model of the entorhinal-hippocampal episodic memory circuit. It’s a foundation to build on to understand the nature of episodic memory. That’s the thing I’m really excited about,” says Ila Fiete, a professor of brain and cognitive sciences at MIT, a member of MIT’s McGovern Institute for Brain Research, and the senior author of the new study.

The model accurately replicates several features of biological memory systems, including the large storage capacity, gradual degradation of older memories, and the ability of people who compete in memory competitions to store enormous amounts of information in “memory palaces.”

MIT Research Scientist Sarthak Chandra and Sugandha Sharma PhD ’24 are the lead authors of the study, which appears today in Nature. Rishidev Chaudhuri, an assistant professor at the University of California at Davis, is also an author of the paper.

An index of memories

To encode spatial memory, place cells in the hippocampus work closely with grid cells — a special type of neuron that fires at many different locations, arranged geometrically in a regular pattern of repeating triangles. Together, a population of grid cells forms a lattice of triangles representing a physical space.
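
A standard textbook idealization (not the model developed in this study) captures that geometry by writing a grid cell's firing rate as the rectified sum of three cosine gratings oriented 60 degrees apart:

```python
# Standard idealized grid-cell rate map (a textbook description, not the study's model):
# the firing rate at location (x, y) is a rectified sum of three cosine gratings oriented
# 60 degrees apart, which produces the triangular (hexagonal) lattice of firing fields.
import numpy as np

def grid_rate(x, y, spacing=0.5, phase=(0.0, 0.0)):
    k = 4 * np.pi / (np.sqrt(3) * spacing)            # wave number set by the grid spacing
    rate = 0.0
    for theta in (0.0, np.pi / 3, 2 * np.pi / 3):     # three orientations, 60 degrees apart
        kx, ky = k * np.cos(theta), k * np.sin(theta)
        rate += np.cos(kx * (x - phase[0]) + ky * (y - phase[1]))
    return np.maximum(0.0, rate / 3.0)                # rectify to get discrete firing fields

xs, ys = np.meshgrid(np.linspace(0, 2, 200), np.linspace(0, 2, 200))
rate_map = grid_rate(xs, ys)                          # peaks sit on a triangular lattice
print(rate_map.shape, float(rate_map.max()))
```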

In addition to helping us recall places where we’ve been, these hippocampal-entorhinal circuits also help us navigate new locations. From human patients, it’s known that these circuits are also critical for forming episodic memories, which might have a spatial component but mainly consist of events, such as how you celebrated your last birthday or what you had for lunch yesterday.

“The same hippocampal and entorhinal circuits are used not just for spatial memory, but also for general episodic memory,” says Fiete, who is also the director of the K. Lisa Yang ICoN Center at MIT. “The question you can ask is what is the connection between spatial and episodic memory that makes them live in the same circuit?”

Two hypotheses have been proposed to account for this overlap in function. One is that the circuit is specialized to store spatial memories because those types of memories — remembering where food was located or where predators were seen — are important to survival. Under this hypothesis, this circuit encodes episodic memories as a byproduct of spatial memory.

An alternative hypothesis suggests that the circuit is specialized to store episodic memories, but also encodes spatial memory because location is one aspect of many episodic memories.

In this work, Fiete and her colleagues proposed a third option: that the peculiar tiling structure of grid cells and their interactions with hippocampus are equally important for both types of memory — episodic and spatial. To develop their new model, they built on computational models that her lab has been developing over the past decade, which mimic how grid cells encode spatial information.

“We reached the point where I felt like we understood on some level the mechanisms of the grid cell circuit, so it felt like the time to try to understand the interactions between the grid cells and the larger circuit that includes the hippocampus,” Fiete says.

In the new model, the researchers hypothesized that grid cells interacting with hippocampal cells can act as a scaffold for storing either spatial or episodic memory. Each activation pattern within the grid defines a “well,” and these wells are spaced out at regular intervals. The wells don’t store the content of a specific memory, but each one acts as a pointer to a specific memory, which is stored in the synapses between the hippocampus and the sensory cortex.

When the memory is triggered later from fragmentary pieces, grid and hippocampal cell interactions drive the circuit state into the nearest well, and the state at the bottom of the well connects to the appropriate part of the sensory cortex to fill in the details of the memory. The sensory cortex is much larger than the hippocampus and can store vast amounts of memory.

“Conceptually, we can think about the hippocampus as a pointer network. It’s like an index that can be pattern-completed from a partial input, and that index then points toward sensory cortex, where those inputs were experienced in the first place,” Fiete says. “The scaffold doesn’t contain the content, it only contains this index of abstract scaffold states.”

Furthermore, events that occur in sequence can be linked together: Each well in the grid cell-hippocampal network efficiently stores the information that is needed to activate the next well, allowing memories to be recalled in the right order.

Modeling memory cliffs and palaces

The researchers’ new model replicates several memory-related phenomena much more accurately than existing models that are based on Hopfield networks — a type of neural network that can store and recall patterns.

While Hopfield networks offer insight into how memories can be formed by strengthening connections between neurons, they don’t perfectly model how biological memory works. In Hopfield models, every memory is recalled in perfect detail until capacity is reached. At that point, no new memories can form, and worse, attempting to add more memories erases all prior ones. This “memory cliff” doesn’t accurately mimic what happens in the biological brain, which tends to gradually forget the details of older memories while new ones are continually added.
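
A minimal Hopfield network makes that memory cliff easy to see. The sketch below is illustrative of the classic model being contrasted here, not the new MIT scaffold model: patterns are stored with a Hebbian rule, and recall from noisy cues stays near-perfect until the number of stored patterns approaches roughly 0.14 times the number of neurons, after which retrieval collapses.

```python
# Minimal Hopfield network (the classic model discussed above, not the MIT scaffold model):
# binary patterns are stored with a Hebbian rule and recalled from corrupted cues. Recall
# stays near-perfect until capacity (~0.14 * n_neurons patterns) is approached, and then
# degrades abruptly (the "memory cliff").
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 200

def store(patterns):
    """Hebbian weight matrix for +/-1 patterns, with self-connections removed."""
    W = sum(np.outer(p, p) for p in patterns) / len(patterns)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, cue, steps=20):
    state = cue.copy().astype(float)
    for _ in range(steps):                    # synchronous updates, for simplicity
        state = np.sign(W @ state)
        state[state == 0] = 1.0
    return state

for n_patterns in (10, 25, 40, 60):
    patterns = [rng.choice([-1, 1], n_neurons) for _ in range(n_patterns)]
    W = store(patterns)
    cue = patterns[0].copy()
    flip = rng.choice(n_neurons, 20, replace=False)
    cue[flip] *= -1                           # corrupt 10 percent of the cue
    accuracy = np.mean(recall(W, cue) == patterns[0])
    print(f"{n_patterns:3d} stored patterns -> recall accuracy {accuracy:.2f}")
```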

The new MIT model captures findings from decades of recordings of grid and hippocampal cells in rodents made as the animals explore and forage in various environments. It also helps to explain the underlying mechanisms for a memorization strategy known as a memory palace. One of the tasks in memory competitions is to memorize the shuffled sequence of cards in one or several card decks. Competitors usually do this by assigning each card to a particular spot in a memory palace — a memory of a childhood home or other environment they know well. When they need to recall the cards, they mentally stroll through that familiar place, visualizing each card in its spot as they go along. Counterintuitively, adding the memory burden of associating cards with locations makes recall stronger and more reliable.

The MIT team’s computational model was able to perform such tasks very well, suggesting that memory palaces take advantage of the memory circuit’s own strategy of associating inputs with a scaffold in the hippocampus, but one level down: Long-acquired memories reconstructed in the larger sensory cortex can now be pressed into service as a scaffold for new memories. This allows for the storage and recall of many more items in a sequence than would otherwise be possible.

The researchers now plan to build on their model to explore how episodic memories could become converted to cortical “semantic” memory, or the memory of facts dissociated from the specific context in which they were acquired (for example, Paris is the capital of France), how episodes are defined, and how brain-like memory models could be integrated into modern machine learning.

The research was funded by the U.S. Office of Naval Research, the National Science Foundation under the Robust Intelligence program, the ARO-MURI award, the Simons Foundation, and the K. Lisa Yang ICoN Center.

How the brain prevents us from falling

This post is adapted from an MIT research news story.

***

As we navigate the world, we adapt our movement in response to changes in the environment. From rocky terrain to moving escalators, we seamlessly modify our movements to maximize energy efficiency and reduce our risk of falling. The computational principles underlying this phenomenon, however, are not well understood.

In a recent paper published in the journal Nature Communications, MIT researchers proposed a model that explains how humans continuously adapt yet remain stable during complex tasks like walking.

“Much of our prior theoretical understanding of adaptation has been limited to episodic tasks, such as reaching for an object in a novel environment,” says senior author Nidhi Seethapathi, the Frederick A. (1971) and Carole J. Middleton Career Development Assistant Professor of Brain and Cognitive Sciences at MIT. “This new theoretical model captures adaptation phenomena in continuous long-horizon tasks in multiple locomotor settings.”

Barrett Clark, a robotics software engineer at Bright Minds Inc, and Manoj Srinivasan, an associate professor in the Department of Mechanical and Aerospace Engineering at Ohio State University, are also authors on the paper.

Principles of locomotor adaptation

In episodic tasks, like reaching for an object, errors during one episode do not affect the next episode. In tasks like locomotion, errors can have a cascade of short-term and long-term consequences for stability unless they are controlled. This makes the challenge of adapting locomotion in a new environment more complex.

To build the model, the researchers identified general principles of locomotor adaptation across a variety of task settings, and developed a unified modular and hierarchical model of locomotor adaptation, with each component having its own unique mathematical structure.

The resulting model successfully encapsulates how humans adapt their walking in novel settings such as on a split-belt treadmill with each foot at a different speed, wearing asymmetric leg weights, and wearing an exoskeleton. The authors report that the model successfully reproduced human locomotor adaptation phenomena across novel settings in 10 prior studies and correctly predicted the adaptation behavior observed in two new experiments conducted as part of the study.

The model has potential applications in sensorimotor learning, rehabilitation, and wearable robotics.

“Having a model that can predict how a person will adapt to a new environment has immense utility for engineering better rehabilitation paradigms and wearable robot control,” says Seethapathi, who is also an associate investigator at MIT’s McGovern Institute. “You can think of a wearable robot itself as a new environment for the person to move in, and our model can be used to predict how a person will adapt for different robot settings. Understanding such human-robot adaptation is currently an experimentally intensive process, and our model could help speed up the process by narrowing the search space.”

For healthy hearing, timing matters

When soundwaves reach the inner ear, neurons there pick up the vibrations and alert the brain. Encoded in their signals is a wealth of information that enables us to follow conversations, recognize familiar voices, appreciate music, and quickly locate a ringing phone or crying baby.

McGovern Institute Associate Investigator Josh McDermott. Photo: Justin Knight

Neurons send signals by emitting spikes, also known as action potentials: brief changes in voltage that propagate along nerve fibers. Remarkably, auditory neurons can fire hundreds of spikes per second, and time their spikes with exquisite precision to match the oscillations of incoming soundwaves.

With powerful new models of human hearing, scientists at MIT’s McGovern Institute have determined that this precise timing is vital for some of the most important ways we make sense of auditory information, including recognizing voices and localizing sounds.

The findings, reported December 4, 2024, in the journal Nature Communications, show how machine learning can help neuroscientists understand how the brain uses auditory information in the real world. McGovern Investigator Josh McDermott, who led the research, explains that his team’s models better equip researchers to study the consequences of different types of hearing impairment and devise more effective interventions.

Science of sound

The nervous system’s auditory signals are timed so precisely that researchers have long suspected that timing is important to our perception of sound. Soundwaves oscillate at rates that determine their pitch: low-pitched sounds travel in slow waves, whereas high-pitched sound waves oscillate more frequently. The auditory nerve that relays information from sound-detecting hair cells in the ear to the brain generates electrical spikes that correspond to the frequency of these oscillations. “The action potentials in an auditory nerve get fired at very particular points in time relative to the peaks in the stimulus waveform,” explains McDermott, who is also an associate professor of brain and cognitive sciences at MIT.

This relationship, known as phase-locking, requires neurons to time their spikes with sub-millisecond precision. But scientists haven’t really known how informative these temporal patterns are to the brain. Beyond being scientifically intriguing, McDermott says, the question has important clinical implications: “If you want to design a prosthesis that provides electrical signals to the brain to reproduce the function of the ear, it’s arguably pretty important to know what kinds of information in the normal ear actually matter,” he says.
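
As a toy illustration of phase-locking (not the auditory-nerve model used in the study), one can make a simulated neuron's spiking probability track the rectified waveform of a pure tone, so that the spikes that do occur cluster near the peaks of each cycle:

```python
# Toy phase-locking sketch (not the study's auditory-nerve model): spike probability in
# each time bin follows the half-wave-rectified tone waveform, so spike times cluster
# near the waveform peaks; that is, they are phase-locked to the stimulus.
import numpy as np

fs = 100_000                        # 10-microsecond bins, to capture sub-millisecond timing
freq = 440.0                        # stimulus frequency in Hz (an assumed example tone)
t = np.arange(0.0, 0.5, 1 / fs)     # 0.5 seconds of stimulus
waveform = np.sin(2 * np.pi * freq * t)

rate = 300.0 * np.maximum(waveform, 0.0)          # firing rate (Hz) tracks rectified waveform
rng = np.random.default_rng(0)
spikes = rng.random(t.size) < rate / fs           # Bernoulli spike per bin (Poisson-like)

spike_phases = (freq * t[spikes]) % 1.0           # phase of each spike within its cycle
print(f"{spikes.sum()} spikes; mean phase {spike_phases.mean():.2f} "
      f"(clusters near 0.25, the waveform peak, rather than being uniform)")
```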

This has been difficult to study experimentally: Animal models can’t offer much insight into how the human brain extracts structure in language or music, and the auditory nerve is inaccessible for study in humans. So McDermott and graduate student Mark Saddler turned to artificial neural networks.

Artificial hearing

Neuroscientists have long used computational models to explore how sensory information might be decoded by the brain, but until recent advances in computing power and machine learning methods, these models were limited to simulating simple tasks. “One of the problems with these prior models is that they’re often way too good,” says Saddler, who is now at the Technical University of Denmark. For example, a computational model tasked with identifying the higher pitch in a pair of simple tones is likely to perform better than people who are asked to do the same thing. “This is not the kind of task that we do every day in hearing,” Saddler points out. “The brain is not optimized to solve this very artificial task.” This mismatch limited the insights that could be drawn from this prior generation of models.

To better understand the brain, Saddler and McDermott wanted to challenge a hearing model to do things that people use their hearing for in the real world, like recognizing words and voices. That meant developing an artificial neural network to simulate the parts of the brain that receive input from the ear. The network was given input from some 32,000 simulated sound-detecting sensory neurons and then optimized for various real-world tasks.

The researchers showed that their model replicated human hearing well—better than any previous model of auditory behavior, McDermott says. In one test, the artificial neural network was asked to recognize words and voices within dozens of types of background noise, from the hum of an airplane cabin to enthusiastic applause. Under every condition, the model performed very similarly to humans.

When the team degraded the timing of the spikes in the simulated ear, however, their model could no longer match humans’ ability to recognize voices or identify the locations of sounds. For example, while McDermott’s team had previously shown that people use pitch to help them identify people’s voices, the model revealed that this ability is lost without precisely timed signals. “You need quite precise spike timing in order to both account for human behavior and to perform well on the task,” Saddler says. That suggests that the brain uses precisely timed auditory signals because they aid these practical aspects of hearing.

The team’s findings demonstrate how artificial neural networks can help neuroscientists understand how the information extracted by the ear influences our perception of the world, both when hearing is intact and when it is impaired. “The ability to link patterns of firing in the auditory nerve with behavior opens a lot of doors,” McDermott says.

“Now that we have these models that link neural responses in the ear to auditory behavior, we can ask, ‘If we simulate different types of hearing loss, what effect is that going to have on our auditory abilities?’” McDermott says. “That will help us better diagnose hearing loss, and we think there are also extensions of that to help us design better hearing aids or cochlear implants.” For example, he says, “The cochlear implant is limited in various ways—it can do some things and not others. What’s the best way to set up that cochlear implant to enable you to mediate behaviors? You can, in principle, use the models to tell you that.”