Partnership with MIT Museum explores relationship between neuroscience and society

What does a healthy relationship between neuroscience and society look like? How do we set the conditions for that relationship to flourish? Researchers and staff at the McGovern Institute and the MIT Museum have been exploring these questions with a five-month planning grant from the Dana Foundation.

Between October 2022 and March 2023, the team tested the potential for an MIT Center for Neuroscience and Society through a series of MIT-sponsored events that were attended by students and faculty of nearby Cambridge Public Schools. The goal of the project was to learn more about what happens when the distinct fields of neuroscience, ethics, and public engagement are brought together to work side-by-side.

Gabrieli lab members Sadie Zacharek (left) and Shruti Nishith (right) demonstrate how the MRI mock scanner works with a student volunteer from the Cambridge Public Schools. Photo: Emma Skakel, MIT Museum

Middle schoolers visit McGovern

Over four days in February, more than 90 sixth graders from Rindge Avenue Upper Campus (RAUC) in Cambridge, Massachusetts, visited the McGovern Institute and participated in hands-on experiments and discussions about the ethical, legal, and social implications of neuroscience research. RAUC is one of four middle schools in the city of Cambridge with an economically, racially, and culturally diverse student population. The middle schoolers interacted with an MIT team led by McGovern Scientific Advisor Jill R. Crittenden, including seventeen McGovern neuroscientists, three MIT Museum outreach coordinators, and neuroethicist Stephanie Bird, a member of the Dana Foundation planning grant team.

“It is probably the only time in my life I will see a real human brain.” – RAUC student

The students participated in nine activities each day, including trials of brain-machine interfaces, close-up examinations of preserved human brains, a tour of McGovern’s imaging center, where students watched as their teacher’s brain was scanned, and a visit to the MIT Museum’s interactive Artificial Intelligence Gallery.

Imagine-IT, a brain-machine interface designed by a team of middle school students during a visit to the McGovern Institute.

To close out their visit, students worked in groups alongside experts to invent brain-computer interfaces designed to improve or enhance human abilities. At each step, students were introduced to ethical considerations through consent forms, questions regarding the use of animal and human brains, and the possible impacts of their own designs on individuals and society.

“I admit that prior to these four days, I would’ve been indifferent to the inclusion of children’s voices in a discussion about technically complex ethical questions, simply because they have not yet had any opportunity to really understand how these technologies work,” says one researcher involved in the visit. “But hearing the students’ questions and ideas has changed my perspective. I now believe it is critically important that all age groups be given a voice when discussing socially relevant issues, such as the ethics of brain computer interfaces or artificial intelligence.”


For more information on the proposed MIT Center for Neuroscience and Society, visit the MIT Museum website.

New insights into training dynamics of deep classifiers

A new study from researchers at MIT and Brown University characterizes several properties that emerge during the training of deep classifiers, a type of artificial neural network commonly used for tasks such as image classification, speech recognition, and natural language processing.

The paper, “Dynamics in Deep Classifiers trained with the Square Loss: Normalization, Low Rank, Neural Collapse and Generalization Bounds,” published today in the journal Research, is the first of its kind to theoretically explore the dynamics of training deep classifiers with the square loss and how properties such as rank minimization, neural collapse, and dualities between the activation of neurons and the weights of the layers are intertwined.

In the study, the authors focused on two types of deep classifiers: fully connected deep networks and convolutional neural networks (CNNs).

A previous study examined the structural properties that develop in large neural networks at the final stages of training. That study focused on the last layer of the network and found that deep networks trained to fit a training dataset will eventually reach a state known as “neural collapse.” When neural collapse occurs, the network maps multiple examples of a particular class (such as images of cats) to a single template of that class. Ideally, the templates for each class should be as far apart from each other as possible, allowing the network to accurately classify new examples.
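Neural collapse can be quantified with a simple diagnostic. The sketch below is illustrative only (it is not the metric used in the paper): it compares the within-class variability of a network’s last-layer features to the between-class variability of the class templates; a ratio near zero indicates collapse.

```python
def class_means(features, labels):
    """Average the feature vectors belonging to each class label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def collapse_ratio(features, labels):
    """Within-class variability divided by between-class variability.

    A ratio near zero indicates neural collapse: every example of a
    class sits close to that class's mean (its "template"), while the
    class means themselves stay far apart.
    """
    means = class_means(features, labels)
    dim = len(next(iter(means.values())))
    grand = [sum(m[i] for m in means.values()) / len(means)
             for i in range(dim)]
    within = sum(sum((v - means[y][i]) ** 2 for i, v in enumerate(x))
                 for x, y in zip(features, labels)) / len(features)
    between = sum(sum((m[i] - grand[i]) ** 2 for i in range(dim))
                  for m in means.values()) / len(means)
    return within / between

# Two tightly clustered classes with well-separated means -> ratio near 0
feats = [[1.0, 0.0], [1.1, 0.0], [0.0, 1.0], [0.0, 1.1]]
labels = ["cat", "cat", "dog", "dog"]
print(collapse_ratio(feats, labels))
```

On the toy features above, each class hugs its own template, so the ratio comes out close to zero, mimicking a collapsed last layer.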

An MIT group based at the MIT Center for Brains, Minds and Machines studied the conditions under which networks can achieve neural collapse. They found that deep networks trained to fit their training data will display neural collapse if they combine three ingredients: stochastic gradient descent (SGD), weight decay regularization (WD), and weight normalization (WN). In contrast to the empirical approach of the earlier study, the MIT group took a theoretical one, proving that neural collapse emerges from minimizing the square loss using SGD, WD, and WN.

Co-author and MIT McGovern Institute postdoc Akshay Rangamani states, “Our analysis shows that neural collapse emerges from the minimization of the square loss with highly expressive deep neural networks. It also highlights the key roles played by weight decay regularization and stochastic gradient descent in driving solutions towards neural collapse.”

Weight decay is a regularization technique that prevents the network from over-fitting the training data by shrinking the magnitude of the weights. Weight normalization rescales the weight matrices of a network so that they share a common scale. Low rank refers to a matrix property in which only a small number of singular values are non-zero. Generalization bounds offer guarantees about a network’s ability to accurately predict new examples it has not seen during training.

The authors found that the same theoretical observation that predicts a low-rank bias also predicts the existence of an intrinsic SGD noise in the weight matrices and in the output of the network. This noise is not generated by the randomness of the SGD algorithm but by an interesting dynamic trade-off between rank minimization and fitting of the data, which provides an intrinsic source of noise similar to what happens in dynamic systems in the chaotic regime. Such a random-like search may be beneficial for generalization because it may prevent over-fitting.

“Interestingly, this result validates the classical theory of generalization showing that traditional bounds are meaningful. It also provides a theoretical explanation for the superior performance in many tasks of sparse networks, such as CNNs, with respect to dense networks,” comments co-author and MIT McGovern Institute postdoc Tomer Galanti. In fact, the authors prove new norm-based generalization bounds for CNNs with localized kernels, that is, networks with sparse connectivity in their weight matrices.

In this case, generalization can be orders of magnitude better than in densely connected networks, a finding that runs counter to a number of recent papers expressing doubts about past approaches to generalization. Thus far, the fact that CNNs, and not dense networks, represent the success story of deep learning has been almost completely ignored by machine learning theory. The theory presented here suggests that this sparsity is an important part of why deep networks work as well as they do.

“This study provides one of the first theoretical analyses covering optimization, generalization, and approximation in deep networks and offers new insights into the properties that emerge during training,” says co-author Tomaso Poggio, the Eugene McDermott Professor at the Department of Brain and Cognitive Sciences at MIT and co-director of the Center for Brains, Minds and Machines. “Our results have the potential to advance our understanding of why deep learning works as well as it does.”

Season’s Greetings from the McGovern Institute

This year’s holiday video (shown above) was inspired by Ev Fedorenko’s July 2022 Nature Neuroscience paper, which found similar patterns of brain activation and language selectivity across speakers of 45 different languages.

Universal language network

Ev Fedorenko uses the widely translated book “Alice in Wonderland” to test brain responses to different languages. Photo: Caitlin Cunningham

Over several decades, neuroscientists have created a well-defined map of the brain’s “language network,” or the regions of the brain that are specialized for processing language. Found primarily in the left hemisphere, this network includes regions within Broca’s area, as well as in other parts of the frontal and temporal lobes. Although roughly 7,000 languages are currently spoken and signed across the globe, the vast majority of those mapping studies have been done in English speakers as they listened to or read English texts.

To truly understand the cognitive and neural mechanisms that allow us to learn and process such diverse languages, Fedorenko and her team scanned the brains of speakers of 45 different languages while they listened to Alice in Wonderland in their native language. The results show that the speakers’ language networks are essentially the same as those of native English speakers, suggesting that the location and key properties of the language network are universal.

The many languages of McGovern

English may be the primary language used by McGovern researchers, but more than 35 other languages are spoken by scientists and engineers at the McGovern Institute. Our holiday video features 30 of these researchers saying Happy New Year in their native (or learned) language. Below is the complete list of languages included in our video.

Brains on conlangs

For a few days in November, the McGovern Institute hummed with invented languages. Strangers greeted one another in Esperanto; trivia games were played in High Valyrian; Klingon and Na’vi were heard inside MRI scanners. Creators and users of these constructed languages (conlangs) had gathered at MIT in the name of neuroscience. McGovern Institute investigator Evelina Fedorenko and her team wanted to know what happened in their brains when they heard and understood these “foreign” tongues.

The constructed languages spoken by attendees had all been created for specific purposes. Most, like the Na’vi language spoken in the movie Avatar, had given identity and voice to the inhabitants of fictional worlds, while Esperanto was created to reduce barriers to international communication. But despite their distinct origins, a familiar pattern of activity emerged when researchers scanned speakers’ brains. The brain, they found, processes constructed languages with the same network of areas it uses for languages that evolved naturally over thousands of years.

The meaning of language

“There’s all these things that people call language,” Fedorenko says. “Music is a kind of language and math is a kind of language.” But the brain processes these metaphorical languages differently than it does the languages humans use to communicate broadly about the world. To neuroscientists like Fedorenko, they can’t legitimately be considered languages at all. In contrast, she says, “these constructed languages seem really quite like natural languages.”

The “Brains on Conlangs” event that Fedorenko’s team hosted was part of its ongoing effort to understand the way language is generated and understood by the brain. Her lab and others have identified specific brain regions involved in linguistic processing, but it’s not yet clear how universal the language network is. Most studies of language cognition have focused on languages widely spoken in well-resourced parts of the world—primarily English, German, and Dutch. There are thousands of languages—spoken or signed—that have not been included.

Brain activation in a Klingon speaker while listening to English (left) and Klingon (right). Image: Saima Malik-Moraleda

Fedorenko and her team are deliberately taking a broader approach. “If we’re making claims about language as a whole, it’s kind of weird to make it based on a handful of languages,” she says. “So we’re trying to create tools and collect some data on as many languages as possible.”

So far, they have found that the language networks used by native speakers of dozens of different languages do share key architectural similarities. And by including a more diverse set of languages in their research, Fedorenko and her team can begin to explore how the brain makes sense of linguistic features that are not part of English or other well studied languages. The Brains on Conlangs event was a chance to expand their studies even further.

Connecting conlangs

Nearly 50 speakers of Esperanto, Klingon, High Valyrian, Dothraki, and Na’vi attended Brains on Conlangs, drawn by the opportunity to connect with other speakers, hear from language creators, and contribute to the science. Graduate student Saima Malik-Moraleda and postbac research assistant Maya Taliaferro, along with other members of both the Fedorenko lab and brain and cognitive sciences professor Ted Gibson’s lab, and with help from Steve Shannon, Operations Manager of the Martinos Imaging Center, worked tirelessly to collect data from all participants. Two MRI scanners ran nearly continuously as speakers listened to passages in their chosen languages and researchers captured images of the brain’s response. To enable the research team to find the language-specific network in each person’s brain, participants also performed other tasks inside the scanner, including a memory task and listening to muffled audio in which the constructed languages were spoken, but unintelligible. They performed language tasks in English, as well.

To understand how the brain processes constructed languages (conlangs), McGovern Investigator Ev Fedorenko (center) gathered with conlang creators/speakers Marc Okrand (Klingon), Paul Frommer (Na’vi), Damian Blasi, Jessie Sams (méníshè), David Peterson (High Valyrian and Dothraki) and Arika Okrent at the McGovern Institute for the “Brains on Conlangs” event in November 2022. Photo: Elise Malvicini

Prior to the study, Fedorenko says, she had suspected constructed languages would activate the brain’s natural language-processing network, but she couldn’t be sure. Another possibility was that languages like Klingon and Esperanto would be handled instead by a problem-solving network known to be used when people work with some other so-called “languages,” like mathematics or computer programming. But once the data was in, the answer was clear. The five constructed languages included in the study all activated the brain’s language network.

That makes sense, Fedorenko says, because like natural languages, constructed languages enable people to communicate by associating words or signs with objects and ideas. Any language is essentially a way of mapping forms to meanings, she says. “You can construe it as a set of memories of how a particular sequence of sounds corresponds to some meaning. You’re learning meanings of words and constructions, and how to put them together to get more complex meanings. And it seems like the brain’s language system is very well suited for that set of computations.”

The ways we move

This story originally appeared in the Winter 2023 issue of BrainScan.

Many people barely consider how their bodies move — at least not until movement becomes more difficult due to injury or disease. But the McGovern scientists who are working to understand human movement and restore it after it has been lost know that the way we move is an engineering marvel.
Muscles, bones, brain, and nerves work together to navigate and interact with an ever-changing environment, making constant but often imperceptible adjustments to carry out our goals. It’s an efficient and highly adaptable system, and the way it’s put together is not at all intuitive, says Hugh Herr, a new associate investigator at the Institute.

That’s why Herr, who also co-directs MIT’s new K. Lisa Yang Center for Bionics, looks to biology to guide the development of artificial limbs that aim to give people the same agency, control, and comfort of natural limbs. McGovern Associate Investigator Nidhi Seethapathi, who like Herr joined the Institute in September, is also interested in understanding human movement in all its complexity. She is coming at the problem from a different direction, using computational modeling to predict how and why we move the way we do.

Moving through change

The computational models that Seethapathi builds in her lab aim to predict how humans will move under different conditions. If a person is placed in an unfamiliar environment and asked to navigate a course under time pressure, what path will they take? How will they move their limbs, and what forces will they exert? How will their movements change as they become more comfortable on the terrain?

McGovern Associate Investigator Nidhi Seethapathi with lab members (from left to right) Inseung Kang, Nikasha Patel, Antoine De Comite, Eric Wang, and Crista Falk. Photo: Steph Stevens

Seethapathi uses the principles of robotics to build models that answer these questions, then tests them by placing real people in the same scenarios and monitoring their movements. So far, that has mostly meant inviting study subjects to her lab, but as she expands her models to predict more complex movements, she will begin monitoring people’s activity in the real world, over longer time periods than laboratory experiments typically allow.

Seethapathi’s hope is that her findings will inform the way doctors, therapists, and engineers help patients regain control over their movements after an injury or stroke, or learn to live with movement disorders like Parkinson’s disease. To make a real difference, she stresses, it’s important to bring studies of human movement out of the lab, where subjects are often limited to simple tasks like walking on a treadmill, into more natural settings. “When we’re talking about doing physical therapy, neuromotor rehabilitation, robotic exoskeletons — any way of helping people move better — we want to do it in the real world, for everyday, complex tasks,” she says.

“When we’re talking about helping people move better, we want to do it in the real world, for everyday, complex tasks,” says Seethapathi.

Seethapathi’s work is already revealing how the brain directs movement in the face of competing priorities. For example, she has found that when people are given a time constraint for traveling a particular distance, they walk faster than their usual, comfortable pace — so much so that they often expend more energy than necessary and arrive at their destination a bit early. Her models suggest that people pick up their pace more than they need to because humans’ internal estimations of time are imprecise.
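One standard way to formalize this trade-off is the textbook-style model sketched below (not Seethapathi’s actual code): metabolic power grows quadratically with walking speed, so energy per meter is minimized at an intermediate “comfortable” pace, and a deadline pushes the walker above that optimum. The constants A and B are purely illustrative.

```python
import math

# Hypothetical metabolic-power model: P(v) = A + B * v**2.
# These constants are made up for illustration, not fitted values.
A, B = 2.0, 1.5

def energy_per_meter(v):
    """Metabolic cost of transport at speed v: power divided by speed,
    i.e. A/v + B*v, which is large at both very slow and very fast paces."""
    return (A + B * v ** 2) / v

def optimal_speed():
    """Speed minimizing cost per meter: setting d/dv (A/v + B*v) = 0
    gives v* = sqrt(A / B), the comfortable pace."""
    return math.sqrt(A / B)

def speed_under_deadline(distance, time_budget):
    """Slowest speed that still meets the deadline; a rushed walker
    must move at least this fast, even when it exceeds v*."""
    return max(optimal_speed(), distance / time_budget)

v_star = optimal_speed()                       # comfortable pace
v_rushed = speed_under_deadline(100.0, 60.0)   # 100 m in 60 s
print(v_star, v_rushed)
```

With these constants the deadline speed exceeds the energy-optimal pace, echoing the finding that time pressure makes people walk faster, and spend more energy, than strictly necessary.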

Her team is also learning how movements change as a person becomes familiar with an environment or task. She says people find an efficient way to move through a lot of practice. “If you’re walking in a straight line for a very long time, then you seem to pick the movement that is optimal for that long-distance walk,” she explains. But in the real world, things are always changing — both in the body and in the environment. So Seethapathi models how people behave when they must move in a new way or navigate a new environment. “In these kinds of conditions, people eventually wind up on an energy-optimal solution,” she says. “But initially, they pick something that prevents them from falling down.”

To capture the complexity of human movement, Seethapathi and her team are devising new tools that will let them monitor people’s movements outside the lab. They are also drawing on data from other fields, from architecture to physical therapy, and even from studies of other animals. “If I have general principles, they should be able to tell me how modifications in the body or in how the brain is connected to the body would lead to different movements,” she says. “I’m really excited about generalizing these principles across timescales and species.”

Building new bodies

In Herr’s lab, a deepening understanding of human movement is helping drive the development of increasingly sophisticated artificial limbs and other wearable robots. The team designs devices that interface directly with a user’s nervous system, so they are not only guided by the brain’s motor control systems, but also send information back to the brain.

Herr, a double amputee with two artificial legs of his own, says prosthetic devices are getting better at replicating natural movements, guided by signals from the brain. Mimicking the design and neural signals found in biology can even give those devices much of the extraordinary adaptability of natural human movement. As an example, Herr notes that his legs effortlessly navigate varied terrain. “There’s adaptive, stabilizing features, and the machine doesn’t have to detect every pothole and pebble and banana peel on the ground, because the morphology and the nervous system control is so inherently adaptive,” he says.

McGovern Associate Investigator Hugh Herr at work in the K. Lisa Yang Center for Bionics at MIT. Photo: Jimmy Day/Media Lab

But, he notes, the field of bionics is in its infancy, and there’s lots of room for improvement. “It’s only a matter of time before a robotic knee, for example, can be as good as the biological knee or better,” he says. “But the problem is the human attached to that knee won’t feel it’s their knee until they can feel it, and until their central nervous system has complete agency over that knee. So if you want to actually build new bodies and not just more and more powerful tools for humans, you have to link to the brain bidirectionally.”

Herr’s team has found that surgically restoring natural connections between pairs of muscles that normally work in opposition to move a limb, such as the arm’s biceps and triceps, gives the central nervous system signals about how that limb is moving, even when a natural limb is gone. The idea takes a cue from the work of McGovern Emeritus Investigator Emilio Bizzi, who found that the coordinated activation of groups of muscles by the nervous system, called muscle synergies, is important for motor control.

“It’s only a matter of time before a robotic knee can be as good as the biological knee or better,” says Herr.

“When a person thinks and moves their phantom limb, those muscle pairings move dynamically, so they feel, in a natural way, the limb moving — even though the limb is not there,” Herr explains. He adds that when those proprioceptive signals communicate instead how an artificial limb is moving, a person experiences “great agency and ownership” of that limb. Now, his group is working to develop sensors that detect and relay information usually processed by sensory neurons in the skin, so prosthetic devices can also perceive pressure and touch.

At the same time, they’re working to improve the mechanical interface between wearable robots and the body to optimize comfort and fit — whether that’s by using detailed anatomical imaging to guide the design of an individual’s device or by engineering devices that integrate directly with a person’s skeleton. There’s no “average” human, Herr says, and effective technologies must meet individual needs, not just for fit, but also for function. At the same time, he says it’s important to plan for cost-effective, mass production, because the need for these technologies is so great.

“The amount of human suffering caused by the lack of technology to address disability is really beyond comprehension,” he says. He expects tremendous progress in the growing field of bionics in the coming decades, but he’s impatient. “I think in 50 years, when scientists look back to this era, it’ll be laughable,” he says. “I’m always anxiously wanting to be in the future.”

Machine learning can predict bipolar disorder in children and teens

Bipolar disorder often begins in childhood or adolescence, triggering dramatic mood shifts and intense emotions that cause problems at home and school. But the condition is often overlooked or misdiagnosed until patients are older. New research suggests that machine learning, a type of artificial intelligence, could help by identifying children who are at risk of bipolar disorder so doctors are better prepared to recognize the condition if it develops.

On October 13, 2022, researchers led by McGovern Institute investigator John Gabrieli and collaborators at Massachusetts General Hospital reported in the Journal of Psychiatric Research that when presented with clinical data on nearly 500 children and teenagers, a machine learning model was able to identify about 75 percent of those who were later diagnosed with bipolar disorder. The approach performs better than any other method of predicting bipolar disorder, and could be used to develop a simple risk calculator for health care providers.

Gabrieli says such a tool would be particularly valuable because bipolar disorder is less common in children than conditions like major depression, with which it shares symptoms, and attention-deficit/hyperactivity disorder (ADHD), with which it often co-occurs. “Humans are not well tuned to watch out for rare events,” he says. “If you have a decent measure, it’s so much easier for a machine to identify than humans. And in this particular case, [the machine learning prediction] was surprisingly robust.”

Detecting bipolar disorder

Mai Uchida, Director of Massachusetts General Hospital’s Child Depression Program, says that nearly two percent of youth worldwide are estimated to have bipolar disorder, but diagnosing pediatric bipolar disorder can be challenging. A certain amount of emotional turmoil is to be expected in children and teenagers, and even when moods become seriously disruptive, children with bipolar disorder are often initially diagnosed with major depression or ADHD. That’s a problem, because the medications used to treat those conditions often worsen the symptoms of bipolar disorder. Tailoring treatment to a diagnosis of bipolar disorder, in contrast, can lead to significant improvements for patients and their families. “When we can give them a little bit of ease and give them a little bit of control over themselves, it really goes a long way,” Uchida says.

In fact, a poor response to antidepressants or ADHD medications can help point a psychiatrist toward a diagnosis of bipolar disorder. So too can a child’s family history, in addition to their own behavior and psychiatric history. But, Uchida says, “it’s kind of up to the individual clinician to pick up on these things.”

Uchida and Gabrieli wondered whether machine learning, which can find patterns in large, complex datasets, could focus in on the most relevant features to identify individuals with bipolar disorder. To find out, they turned to data from a study that began in the 1990s. The study, headed by Joseph Biederman, Chief of the Clinical and Research Programs in Pediatric Psychopharmacology and Adult ADHD at Massachusetts General Hospital, had collected extensive psychiatric assessments of hundreds of children with and without ADHD, then followed those individuals for ten years.

To explore whether machine learning could find predictors of bipolar disorder within that data, Gabrieli, Uchida, and colleagues focused on 492 children and teenagers without ADHD, who were recruited to the study as controls. Over the ten years of the study, 45 of those individuals developed bipolar disorder.

Within the data collected at the study’s outset, the machine learning model was able to find patterns that associated with a later diagnosis of bipolar disorder. A few behavioral measures turned out to be particularly relevant to the model’s predictions: children and teens with combined problems with attention, aggression, and anxiety were most likely to later be diagnosed with bipolar disorder. These indicators were all picked up by a standard assessment tool called the Child Behavior Checklist.
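A risk calculator of the kind the researchers envision could, in spirit, look like the sketch below: a logistic model over Child Behavior Checklist subscale scores. The weights, intercept, and subscale choices here are invented for illustration and are not the study’s fitted coefficients.

```python
import math

# Illustrative only: these coefficients are made up, not taken from
# the Gabrieli/Uchida study.
WEIGHTS = {"attention": 0.06, "aggression": 0.05, "anxiety": 0.04}
INTERCEPT = -9.0

def bipolar_risk(cbcl_t_scores):
    """Map Child Behavior Checklist T-scores (roughly, 50 = average,
    above ~65 = clinically elevated) to a 0-1 risk estimate via a
    logistic function of a weighted subscale sum."""
    z = INTERCEPT + sum(WEIGHTS[k] * cbcl_t_scores[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

low = bipolar_risk({"attention": 50, "aggression": 50, "anxiety": 50})
high = bipolar_risk({"attention": 80, "aggression": 75, "anxiety": 72})
print(round(low, 3), round(high, 3))
```

A child with average scores maps to a low risk, while combined elevations in attention, aggression, and anxiety, the pattern the model flagged, push the estimate sharply upward.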

Uchida and Gabrieli say the machine learning model could be integrated into the medical record system to help pediatricians and child psychiatrists catch early warning signs of bipolar disorder. “The information that’s collected could alert a clinician to the possibility of a bipolar disorder developing,” Uchida says. “Then at least they’re aware of the risk, and they may be able to maybe pick up on some of the deterioration when it’s happening and think about either referring them or treating it themselves.”

Ila Fiete wins Swartz Prize for Theoretical and Computational Neuroscience

The Society for Neuroscience (SfN) has awarded the Swartz Prize for Theoretical and Computational Neuroscience to Ila Fiete, professor in the Department of Brain and Cognitive Sciences, associate member of the McGovern Institute for Brain Research, and director of the K. Lisa Yang Integrative Computational Neuroscience Center. The SfN, the world’s largest neuroscience organization, announced that Fiete received the prize for her breakthrough research modeling grid cells, a key component of the navigational system of the mammalian brain.

“Fiete’s body of work has already significantly shaped the field of neuroscience and will continue to do so for the foreseeable future,” states the announcement from SfN.

“Fiete is considered one of the strongest theorists of her generation who has conducted highly influential work demonstrating that grid cell networks have attractor-like dynamics,” says Hollis Cline, a professor at the Scripps Research Institute of California and head of the Swartz Prize selection committee.

Grid cells are found in the entorhinal cortex of mammals. Their unique firing properties, creating a neural representation of our surroundings, allow us to navigate the world. Fiete and collaborators developed computational models showing how interactions between neurons can lead to the formation of periodic lattice-like firing patterns of grid cells and stabilize these patterns to create spatial memory. They showed that as we move around in space, these neural patterns can integrate velocity signals to provide a constantly updated estimate of our position, as well as detect and correct errors in the estimated position.

Fiete also proposed that multiple copies of these patterns at different spatial scales enable an efficient, high-capacity representation of position. Fiete and colleagues then worked with multiple experimental collaborators to design tests and establish rare evidence that these pattern-forming mechanisms underlie memory dynamics in the brain.
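The capacity advantage of multiple spatial scales can be illustrated with a toy modular code (a simplified sketch with made-up integer periods, not a model from Fiete’s papers): each module reports position only modulo its own period, yet jointly the phases pin down position over a range far larger than any single period.

```python
# Each grid "module" sees position only modulo its own period, but
# together the phases identify position uniquely over lcm(5, 7, 9) = 315
# units -- far beyond the reach of any one module.
PERIODS = [5, 7, 9]

def encode(position):
    """Phase of the animal's position within each grid module."""
    return [position % p for p in PERIODS]

def decode(phases, search_range=315):
    """Recover position as the unique location (within the combined
    range) whose phases match every module's report."""
    for x in range(search_range):
        if all(x % p == ph for p, ph in zip(PERIODS, phases)):
            return x
    return None

print(decode(encode(123)))  # 123: position recovered from phases alone
```

Because the periods share no common factor, any position up to 315 units is recoverable from just three small phases, a toy version of the efficiency argument for multi-scale grid codes.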

“I’m truly honored to receive the Swartz Prize,” says Fiete. “This prize recognizes my group’s efforts to decipher the circuit-level mechanisms of cognitive functions involving navigation, integration, and memory. It also recognizes, in its focus, the bearing-of-fruit of dynamical circuit models from my group and others that explain how individually simple elements combine to generate the longer-lasting memory states and complex computations of the brain. I am proud to be able to represent, in some measure, the work of my incredible students, postdocs, collaborators, and intellectual mentors. I am indebted to them and grateful for the chance to work together.”

According to the SfN announcement, Fiete has contributed to the field in many other ways, including modeling “how entorhinal cortex could interact with the hippocampus to efficiently and robustly store large numbers of memories and developed a remarkable method to discern the structure of intrinsic dynamics in neuronal circuits.” This modeling led to the discovery of an internal compass that tracks the direction of one’s head, even in the absence of external sensory input.

“Recently, Fiete’s group has explored the emergence of modular organization, a line of work that elucidates how grid cell modularity and general cortical modules might self-organize from smooth genetic gradients,” states the SfN announcement. Fiete and her research group have shown that even if the biophysical properties underlying grid cells of different scale are mostly similar, continuous variations in these properties can result in discrete groupings of grid cells, each with a different function.

Fiete was recognized with the Swartz Prize, which includes a $30,000 award, during the SfN annual meeting in San Diego.

Other recent MIT winners of the Swartz Prize include Professor Emery Brown (2020) and Professor Tomaso Poggio (2014).

Study urges caution when comparing neural networks to the brain

Neural networks, a type of computing system loosely modeled on the organization of the human brain, form the basis of many artificial intelligence systems for applications such as speech recognition, computer vision, and medical image analysis.

In the field of neuroscience, researchers often use neural networks to try to model the same kinds of tasks that the brain performs, in hopes that the models could suggest new hypotheses regarding how the brain itself performs those tasks. However, a group of researchers at MIT urges caution in interpreting these models.

In an analysis of more than 11,000 neural networks that were trained to simulate the function of grid cells — key components of the brain’s navigation system — the researchers found that neural networks only produced grid-cell-like activity when they were given very specific constraints that are not found in biological systems.

“What this suggests is that in order to obtain a result with grid cells, the researchers training the models needed to bake in those results with specific, biologically implausible implementation choices,” says Rylan Schaeffer, a former senior research associate at MIT.

Without those constraints, the MIT team found that very few neural networks generated grid-cell-like activity, suggesting that these models do not necessarily generate useful predictions of how the brain works.

Schaeffer, who is now a graduate student in computer science at Stanford University, is the lead author of the new study, which will be presented at the 2022 Conference on Neural Information Processing Systems this month. Ila Fiete, a professor of brain and cognitive sciences and a member of MIT’s McGovern Institute for Brain Research, is the senior author of the paper. Mikail Khona, an MIT graduate student in physics, is also an author.

Ila Fiete leads a discussion in her lab at the McGovern Institute. Photo: Steph Stevens

Modeling grid cells

Neural networks, which researchers have been using for decades to perform a variety of computational tasks, consist of thousands or millions of processing units, or nodes, connected to each other. Each node has connections of varying strengths to other nodes in the network. As the network analyzes huge amounts of data, the strengths of those connections change as the network learns to perform the desired task.

In this study, the researchers focused on neural networks that have been developed to mimic the function of the brain’s grid cells, which are found in the entorhinal cortex of the mammalian brain. Together with place cells, found in the hippocampus, grid cells form a brain circuit that helps animals know where they are and how to navigate to a different location.

Place cells have been shown to fire whenever an animal is in a specific location, and each place cell may respond to more than one location. Grid cells, on the other hand, work very differently. As an animal moves through a space such as a room, grid cells fire only when the animal is at one of the vertices of a triangular lattice. Different groups of grid cells create lattices of slightly different dimensions, which overlap each other. This allows grid cells to encode a large number of unique positions using a relatively small number of cells.
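The triangular-lattice firing described above is commonly idealized as a sum of three cosine gratings whose wave vectors are 60 degrees apart. The sketch below uses that textbook idealization to show how a single grid cell tiles space, and how two modules at different scales overlap; it is an illustration, not code from any of the studies discussed:

```python
import numpy as np

def grid_rate_map(size=200, extent=5.0, scale=1.0, orientation=0.0):
    """Idealized grid-cell firing rate: the sum of three cosine gratings
    with wave vectors 60 degrees apart produces a triangular lattice of
    firing fields with the given spatial scale."""
    coords = np.linspace(-extent, extent, size)
    x, y = np.meshgrid(coords, coords)
    k = 2 * np.pi / scale  # spatial frequency set by the grid scale
    rate = np.zeros_like(x)
    for i in range(3):
        theta = orientation + i * np.pi / 3  # 0, 60, 120 degrees
        rate += np.cos(k * (x * np.cos(theta) + y * np.sin(theta)))
    return np.maximum(rate, 0)  # rectify: firing rates are non-negative

# Two overlapping modules at different scales jointly encode many more
# positions than either module could alone.
small = grid_rate_map(scale=1.0)
large = grid_rate_map(scale=1.6)
```

Because the two lattices repeat at different periods, the pair of module responses only realigns after a distance much longer than either period, which is what gives the combined code its large capacity.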

This type of location encoding also makes it possible to predict an animal’s next location based on a given starting point and a velocity. In several recent studies, researchers have trained neural networks to perform this same task, which is known as path integration.
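At its core, path integration is just the accumulation of velocity over time from a known starting point. As a minimal numerical illustration (not the study's model), a position estimate can be maintained by summing velocity steps:

```python
import numpy as np

rng = np.random.default_rng(0)

start = np.array([0.0, 0.0])                # starting point in 2D space
dt = 0.1                                    # time step (arbitrary units)
velocity = rng.normal(0, 1, size=(500, 2))  # velocity signal over time

# Dead-reckoning estimate: integrate velocity from the start point.
positions = start + np.cumsum(velocity * dt, axis=0)

# Any noise in the velocity signal accumulates step by step, which is
# why error detection and correction (as in attractor models) matters.
final = positions[-1]
```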

To train neural networks to perform this task, researchers feed the network a starting point and a velocity that varies over time. The model essentially mimics the activity of an animal roaming through a space, and calculates updated positions as it moves. As the model performs the task, the activity patterns of different units within the network can be measured. Each unit’s activity can be represented as a firing pattern, similar to the firing patterns of neurons in the brain.
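One way to picture "unit activity as a firing pattern" is to run a recurrent network along a trajectory and bin each unit's activity by the simulated animal's position. The sketch below uses a tiny untrained random RNN purely to show the bookkeeping; the published models are trained and far larger:

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, steps = 32, 5000

# Random weights -- a stand-in, not a trained path integrator.
W_in = rng.normal(0, 0.5, size=(n_units, 2))                 # velocity input
W_rec = rng.normal(0, 1.0 / np.sqrt(n_units), size=(n_units, n_units))

vel = rng.normal(0, 0.1, size=(steps, 2))   # random-walk velocities
pos = np.cumsum(vel, axis=0)                # true trajectory
h = np.zeros(n_units)
activity = np.empty((steps, n_units))
for t in range(steps):
    h = np.tanh(W_rec @ h + W_in @ vel[t])  # recurrent update
    activity[t] = h

# Bin one unit's activity by position to get its spatial "rate map" --
# the object that is then inspected for grid-like structure.
bins = 20
x_edges = np.linspace(pos[:, 0].min(), pos[:, 0].max(), bins)
y_edges = np.linspace(pos[:, 1].min(), pos[:, 1].max(), bins)
ix = np.digitize(pos[:, 0], x_edges) - 1
iy = np.digitize(pos[:, 1], y_edges) - 1
sums = np.zeros((bins, bins))
counts = np.zeros((bins, bins))
np.add.at(sums, (ix, iy), activity[:, 0])
np.add.at(counts, (ix, iy), 1)
ratemap = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
```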

In several previous studies, researchers have reported that their models produced units with activity patterns that closely mimic the firing patterns of grid cells. These studies concluded that grid-cell-like representations would naturally emerge in any neural network trained to perform the path integration task.

However, the MIT researchers found very different results. In an analysis of more than 11,000 neural networks that they trained on path integration, they found that while nearly 90 percent of them learned the task successfully, only about 10 percent of those networks generated activity patterns that could be classified as grid-cell-like. That figure includes networks in which even a single unit achieved a high grid score.
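A "grid score" conventionally measures the hexagonal symmetry of a unit's spatial rate map: correlations between the map and copies of itself rotated by 60° and 120° should be high, while correlations at 30°, 90°, and 150° should be low. The sketch below applies that idea to an idealized analytic pattern, rotating it by re-evaluating it at rotated wave vectors, which sidesteps the autocorrelogram and interpolation details of real analyses:

```python
import numpy as np

def pattern(orientation, size=200, extent=5.0, scale=1.0):
    """Idealized hexagonal pattern: three cosine gratings 60 degrees apart."""
    coords = np.linspace(-extent, extent, size)
    x, y = np.meshgrid(coords, coords)
    k = 2 * np.pi / scale
    rate = np.zeros_like(x)
    for i in range(3):
        theta = orientation + i * np.pi / 3
        rate += np.cos(k * (x * np.cos(theta) + y * np.sin(theta)))
    return rate

def correlation(a, b):
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

base = pattern(0.0)
corr = {deg: correlation(base, pattern(np.deg2rad(deg)))
        for deg in (30, 60, 90, 120, 150)}

# Hexagonal symmetry: high correlation at 60/120 degrees, low at
# 30/90/150 degrees. A clearly positive score is "grid-cell-like".
grid_score = min(corr[60], corr[120]) - max(corr[30], corr[90], corr[150])
```

For a perfect hexagonal pattern the 60° and 120° rotations map the lattice onto itself, so the score approaches its maximum; for an irregular or stripe-like map the score hovers near or below zero.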

According to the MIT team, the earlier studies generated grid-cell-like activity only because of the constraints that researchers built into those models.

“Earlier studies have presented this story that if you train networks to path integrate, you’re going to get grid cells. What we found is that instead, you have to make this long sequence of choices of parameters, which we know are inconsistent with the biology, and then in a small sliver of those parameters, you will get the desired result,” Schaeffer says.

More biological models

One of the constraints found in earlier studies is that the researchers required the model to convert velocity into a unique position, reported by one network unit that corresponds to a place cell. For this to happen, the researchers also required that each place cell correspond to only one location, which is not how biological place cells work: Studies have shown that place cells in the hippocampus can respond to up to 20 different locations, not just one.

When the MIT team adjusted the models so that place cells were more like biological place cells, the models were still able to perform the path integration task, but they no longer produced grid-cell-like activity. Grid-cell-like activity also disappeared when the researchers instructed the models to generate different types of location output, such as location on a grid with X and Y axes, or location as a distance and angle relative to a home point.
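The difference between the two readout assumptions can be made concrete: a one-hot target activates exactly one place-cell unit per location, while a biologically styled target lets each unit carry several firing fields. This is an illustrative construction of the two target types, not the datasets used in any particular study:

```python
import numpy as np

rng = np.random.default_rng(2)
n_locations, n_cells = 100, 25

# Constraint from earlier models: a one-to-one code in which each
# location activates exactly one place cell and vice versa.
one_hot = np.eye(n_locations)  # cell i fires only at location i

# More biological: each place cell fires at several locations (studies
# report up to ~20 fields per cell), so the code is distributed.
multi_field = np.zeros((n_locations, n_cells))
for cell in range(n_cells):
    n_fields = rng.integers(2, 21)  # 2 to 20 firing fields
    fields = rng.choice(n_locations, size=n_fields, replace=False)
    multi_field[fields, cell] = 1.0

fields_per_cell = multi_field.sum(axis=0)
```

Training a network against the first target imposes the biologically implausible readout the MIT team flags; training against the second keeps the path integration task solvable but, per their findings, tends not to yield grid-like units.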

“If the only thing that you ask this network to do is path integrate, and you impose a set of very specific, not physiological requirements on the readout unit, then it’s possible to obtain grid cells,” says Fiete, who is also the director of the K. Lisa Yang Integrative Computational Neuroscience Center at MIT. “But if you relax any of these aspects of this readout unit, that strongly degrades the ability of the network to produce grid cells. In fact, usually they don’t, even though they still solve the path integration task.”

Therefore, if researchers hadn’t already known that grid cells exist and guided the models to produce them, grid-cell-like activity would have been very unlikely to emerge from model training alone.

The researchers say that their findings suggest that more caution is warranted when interpreting neural network models of the brain.

“When you use deep learning models, they can be a powerful tool, but one has to be very circumspect in interpreting them and in determining whether they are truly making de novo predictions, or even shedding light on what it is that the brain is optimizing,” Fiete says.

Kenneth Harris, a professor of quantitative neuroscience at University College London, says he hopes the new study will encourage neuroscientists to be more careful when stating what can be shown by analogies between neural networks and the brain.

“Neural networks can be a useful source of predictions. If you want to learn how the brain solves a computation, you can train a network to perform it, then test the hypothesis that the brain works the same way. Whether the hypothesis is confirmed or not, you will learn something,” says Harris, who was not involved in the study. “This paper shows that ‘postdiction’ is less powerful: Neural networks have many parameters, so getting them to replicate an existing result is not as surprising.”

When using these models to make predictions about how the brain works, it’s important to take into account realistic, known biological constraints when building the models, the MIT researchers say. They are now working on models of grid cells that they hope will generate more accurate predictions of how grid cells in the brain work.

“Deep learning models will give us insight about the brain, but only after you inject a lot of biological knowledge into the model,” Khona says. “If you use the correct constraints, then the models can give you a brain-like solution.”

The research was funded by the Office of Naval Research, the National Science Foundation, the Simons Foundation through the Simons Collaboration on the Global Brain, and the Howard Hughes Medical Institute through the Faculty Scholars Program. Mikail Khona was supported by the MathWorks Science Fellowship.

Understanding reality through algorithms

Although Fernanda De La Torre still has several years left in her graduate studies, she’s already dreaming big when it comes to what the future has in store for her.

“I dream of opening up a school one day where I could bring this world of understanding of cognition and perception into places that would never have contact with this,” she says.

It’s that kind of ambitious thinking that’s gotten De La Torre, a doctoral student in MIT’s Department of Brain and Cognitive Sciences, to this point. A recent recipient of the prestigious Paul and Daisy Soros Fellowship for New Americans, De La Torre has found at MIT a supportive, creative research environment that’s allowed her to delve into the cutting-edge science of artificial intelligence. But she’s still driven by an innate curiosity about human imagination and a desire to bring that knowledge to the communities in which she grew up.

An unconventional path to neuroscience

De La Torre’s first exposure to neuroscience wasn’t in the classroom, but in her daily life. As a child, she watched her younger sister struggle with epilepsy. At 12, she crossed into the United States from Mexico illegally to reunite with her mother, exposing her to a whole new language and culture. Once in the States, she had to grapple with her mother’s shifting personality in the midst of an abusive relationship. “All of these different things I was seeing around me drove me to want to better understand how psychology works,” De La Torre says, “to understand how the mind works, and how it is that we can all be in the same environment and feel very different things.”

But finding an outlet for that intellectual curiosity was challenging. As an undocumented immigrant, her access to financial aid was limited. Her high school was also underfunded and lacked elective options. Mentors along the way, though, encouraged the aspiring scientist, and through a program at her school, she was able to take community college courses to fulfill basic educational requirements.

It took an inspiring amount of dedication to her education, but De La Torre made it to Kansas State University for her undergraduate studies, where she majored in computer science and math. At Kansas State, she was able to get her first real taste of research. “I was just fascinated by the questions they were asking and this entire space I hadn’t encountered,” says De La Torre of her experience working in a visual cognition lab and discovering the field of computational neuroscience.

Although Kansas State didn’t have a dedicated neuroscience program, her research experience in cognition led her to a machine learning lab led by William Hsu, a computer science professor. There, De La Torre became enamored by the possibilities of using computation to model the human brain. Hsu’s support also convinced her that a scientific career was a possibility. “He always made me feel like I was capable of tackling big questions,” she says fondly.

With the confidence imparted in her at Kansas State, De La Torre came to MIT in 2019 as a post-baccalaureate student in the lab of Tomaso Poggio, the Eugene McDermott Professor of Brain and Cognitive Sciences and an investigator at the McGovern Institute for Brain Research. With Poggio, also the director of the Center for Brains, Minds and Machines, De La Torre began working on deep-learning theory, an area of machine learning focused on how artificial neural networks modeled on the brain can learn to recognize patterns.

“It’s a very interesting question because we’re starting to use them everywhere,” says De La Torre of neural networks, listing off examples from self-driving cars to medicine. “But, at the same time, we don’t fully understand how these networks can go from knowing nothing and just being a bunch of numbers to outputting things that make sense.”

Her experience as a post-bac was De La Torre’s first real opportunity to apply the technical computer skills she developed as an undergraduate to neuroscience. It was also the first time she could fully focus on research. “That was the first time that I had access to health insurance and a stable salary. That was, in itself, sort of life-changing,” she says. “But on the research side, it was very intimidating at first. I was anxious, and I wasn’t sure that I belonged here.”

Fortunately, De La Torre says she was able to overcome those insecurities, both through a growing unabashed enthusiasm for the field and through the support of Poggio and her other colleagues in MIT’s Department of Brain and Cognitive Sciences. When the opportunity came to apply to the department’s PhD program, she jumped on it. “It was just knowing these kinds of mentors are here and that they cared about their students,” says De La Torre of her decision to stay on at MIT for graduate studies. “That was really meaningful.”

Expanding notions of reality and imagination

In her two years so far in the graduate program, De La Torre’s work has expanded the understanding of neural networks and their applications to the study of the human brain. Working with Guangyu Robert Yang, an associate investigator at the McGovern Institute and an assistant professor in the departments of Brain and Cognitive Sciences and Electrical Engineering and Computer Science, she’s engaged in what she describes as more philosophical questions about how one develops a sense of self as an independent being. She’s interested in how that self-consciousness develops and why it might be useful.

De La Torre’s primary advisor, though, is Professor Josh McDermott, who leads the Laboratory for Computational Audition. With McDermott, De La Torre is attempting to understand how the brain integrates vision and sound. While combining sensory inputs may seem like a basic process, there are many unanswered questions about how our brains combine multiple signals into a coherent impression, or percept, of the world. Many of the questions are raised by audiovisual illusions in which what we hear changes what we see. For example, if one sees a video of two discs passing each other, but the clip contains the sound of a collision, the brain will perceive the discs as bouncing off each other rather than passing through. Given an ambiguous image, that simple auditory cue is all it takes to create a different perception of reality.

“There’s something interesting happening where our brains are receiving two signals telling us different things and, yet, we have to combine them somehow to make sense of the world.”

De La Torre is using behavioral experiments to probe how the human brain makes sense of multisensory cues to construct a particular perception. To do so, she’s created various scenes of objects interacting in 3D space over different sounds, asking research participants to describe characteristics of the scene. For example, in one experiment, she combines visuals of a block moving across a surface at different speeds with various scraping sounds, asking participants to estimate how rough the surface is. Eventually she hopes to take the experiment into virtual reality, where participants will physically push blocks in response to how rough they perceive the surface to be, rather than just reporting on what they experience.

Once she’s collected data, she’ll move into the modeling phase of the research, evaluating whether multisensory neural networks perceive illusions the way humans do. “What we want to do is model exactly what’s happening,” says De La Torre. “How is it that we’re receiving these two signals, integrating them and, at the same time, using all of our prior knowledge and inferences of physics to really make sense of the world?”

Although her two strands of research with Yang and McDermott may seem distinct, she sees clear connections between the two. Both projects are about grasping what artificial neural networks are capable of and what they tell us about the brain. At a more fundamental level, she says that how the brain perceives the world from different sensory cues might be part of what gives people a sense of self. Sensory perception is about constructing a cohesive, unitary sense of the world from multiple sources of sensory data. Similarly, she argues, “the sense of self is really a combination of actions, plans, goals, emotions, all of these different things that are components of their own, but somehow create a unitary being.”

It’s a fitting sentiment for De La Torre, who has been working to make sense of and integrate different aspects of her own life. Working in the Computational Audition lab, for example, she’s started experimenting with combining electronic music with folk music from her native Mexico, connecting her “two worlds,” as she says. Having the space to undertake those kinds of intellectual explorations, and colleagues who encourage it, is one of De La Torre’s favorite parts of MIT.

“Beyond professors, there’s also a lot of students whose way of thinking just amazes me,” she says. “I see a lot of goodness and excitement for science and a little bit of — it’s not nerdiness, but a love for very niche things — and I just kind of love that.”

Nidhi Seethapathi

Science in Motion

The computational models that Seethapathi builds in her lab aim to predict how humans will move under different conditions. If a person is placed in an unfamiliar environment and asked to navigate a course under time pressure, what path will they take? How will they move their limbs, and what forces will they exert? How will their movements change as they become more comfortable on the terrain?

Seethapathi uses the principles of robotics to build models that answer these questions, then tests them by placing real people in the same scenarios and monitoring their movements. Currently, most of these tests take place in her lab, where subjects are often limited to simple tasks like walking on a treadmill. As she expands her models to predict more complex movements, she will begin monitoring people’s activity in the real world, over longer time periods than laboratory experiments typically allow. Ultimately, Seethapathi hopes her findings will inform the way doctors, therapists, and engineers help patients regain control over their movements after an injury or due to a movement disorder.