New insights into training dynamics of deep classifiers

A new study from researchers at MIT and Brown University characterizes several properties that emerge during the training of deep classifiers, a type of artificial neural network commonly used for tasks such as image classification, speech recognition, and natural language processing.

The paper, “Dynamics in Deep Classifiers trained with the Square Loss: Normalization, Low Rank, Neural Collapse and Generalization Bounds,” published today in the journal Research, is the first of its kind to theoretically explore the dynamics of training deep classifiers with the square loss and how properties such as rank minimization, neural collapse, and dualities between the activation of neurons and the weights of the layers are intertwined.

In the study, the authors focused on two types of deep classifiers: fully connected deep networks and convolutional neural networks (CNNs).

A previous study examined the structural properties that develop in large neural networks at the final stages of training. That study focused on the last layer of the network and found that deep networks trained to fit a training dataset will eventually reach a state known as “neural collapse.” When neural collapse occurs, the network maps multiple examples of a particular class (such as images of cats) to a single template of that class. Ideally, the templates for each class should be as far apart from each other as possible, allowing the network to accurately classify new examples.

An MIT group based at the MIT Center for Brains, Minds and Machines studied the conditions under which networks can achieve neural collapse. Deep networks that have the three ingredients of stochastic gradient descent (SGD), weight decay regularization (WD), and weight normalization (WN) will display neural collapse if they are trained to fit their training data. The MIT group has taken a theoretical approach — as compared to the empirical approach of the earlier study — proving that neural collapse emerges from the minimization of the square loss using SGD, WD, and WN.
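To make these ingredients concrete, here is a minimal PyTorch sketch of a toy classifier trained with the square loss using SGD, weight decay, and weight normalization. It illustrates the training recipe only, not the paper’s experimental setup; the architecture, sizes, and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy network with weight normalization (WN) applied to each linear layer.
model = nn.Sequential(
    nn.utils.weight_norm(nn.Linear(784, 512)),  # WN reparametrizes the weight matrix
    nn.ReLU(),
    nn.utils.weight_norm(nn.Linear(512, 10)),
)

# SGD with weight decay (WD), minimizing the square loss on one-hot labels.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)
loss_fn = nn.MSELoss()  # the square loss studied in the paper

x = torch.randn(64, 784)  # stand-in batch of inputs
y = nn.functional.one_hot(torch.randint(0, 10, (64,)), num_classes=10).float()

for step in range(200):  # train to fit the (toy) training data
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```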

Co-author and MIT McGovern Institute postdoc Akshay Rangamani states, “Our analysis shows that neural collapse emerges from the minimization of the square loss with highly expressive deep neural networks. It also highlights the key roles played by weight decay regularization and stochastic gradient descent in driving solutions towards neural collapse.”

Weight decay is a regularization technique that prevents the network from over-fitting the training data by reducing the magnitude of the weights. Weight normalization rescales the weight matrices of a network so that they operate on a comparable scale. Low rank refers to a property of a matrix in which it has only a small number of non-zero singular values. Generalization bounds offer guarantees about a network’s ability to accurately predict new examples that it has not seen during training.
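To make “low rank” concrete, a short NumPy check (illustrative only) counts how many singular values of a matrix are meaningfully non-zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# A 512x512 matrix built from two thin factors has rank at most 10.
W = rng.standard_normal((512, 10)) @ rng.standard_normal((10, 512))

s = np.linalg.svd(W, compute_uv=False)         # singular values, largest first
effective_rank = int(np.sum(s > 1e-8 * s[0]))  # count the non-negligible ones
print(effective_rank)                          # prints 10
```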

The authors found that the same theoretical observation that predicts a low-rank bias also predicts the existence of an intrinsic SGD noise in the weight matrices and in the output of the network. This noise is not generated by the randomness of the SGD algorithm but by an interesting dynamic trade-off between rank minimization and fitting of the data, which provides an intrinsic source of noise similar to what happens in dynamic systems in the chaotic regime. Such a random-like search may be beneficial for generalization because it may prevent over-fitting.

“Interestingly, this result validates the classical theory of generalization showing that traditional bounds are meaningful. It also provides a theoretical explanation for the superior performance in many tasks of sparse networks, such as CNNs, with respect to dense networks,” comments co-author and MIT McGovern Institute postdoc Tomer Galanti. In fact, the authors prove new norm-based generalization bounds for CNNs with localized kernels, that is, networks with sparse connectivity in their weight matrices.
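The article does not reproduce the paper’s bounds. For orientation only, norm-based generalization bounds in this literature typically take the schematic form

$$\mathbb{E}[\text{test loss}] - \widehat{\mathbb{E}}[\text{train loss}] \;\lesssim\; \frac{\prod_{k=1}^{L}\lVert W_k\rVert}{\sqrt{n}},$$

where $n$ is the number of training examples, $L$ the number of layers, and $\lVert W_k\rVert$ a norm of the weights of layer $k$. Sparse, localized kernels keep the relevant norms small, which tightens bounds of this kind relative to dense layers; the paper’s actual bounds may differ in form.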

In this case, generalization can be orders of magnitude better than for densely connected networks. The result also goes against a number of recent papers expressing doubts about past approaches to generalization. Thus far, the fact that CNNs, and not dense networks, represent the success story of deep networks has been almost completely ignored by machine learning theory. Instead, the theory presented here suggests that this is an important clue to why deep networks work as well as they do.

“This study provides one of the first theoretical analyses covering optimization, generalization, and approximation in deep networks and offers new insights into the properties that emerge during training,” says co-author Tomaso Poggio, the Eugene McDermott Professor at the Department of Brain and Cognitive Sciences at MIT and co-director of the Center for Brains, Minds and Machines. “Our results have the potential to advance our understanding of why deep learning works as well as it does.”

School of Science presents 2023 Infinite Expansion Awards

The MIT School of Science has announced seven postdocs and research scientists as recipients of the 2023 Infinite Expansion Award. Nominated by their peers and mentors, the awardees are recognized not only for their exceptional science, but for mentoring and advising junior colleagues, supporting educational programs, working with the MIT Postdoctoral Association, or contributing in some other way to the Institute.

The 2023 Infinite Expansion award winners in the School of Science are:

  • Kyle Jenks, a postdoc in the Picower Institute for Learning and Memory, nominated by professor and Picower Institute investigator Mriganka Sur;
  • Matheus Victor, a postdoc in the Picower Institute, nominated by professor and Picower Institute director Li-Huei Tsai.

Each recipient is granted a monetary award, and a celebratory reception will be held for the winners this spring with their families, friends, and nominators.

Studies of unusual brains reveal critical insights into brain organization, function

EG (a pseudonym) is an accomplished woman in her early 60s: she is a college graduate and has an advanced professional degree. She has a stellar vocabulary—in the 98th percentile, according to tests—and has mastered a foreign language (Russian) to the point that she sometimes dreams in it.

She also has, likely since birth, been missing her left temporal lobe, a part of the brain known to be critical for language.

In 2016, EG contacted McGovern Institute Investigator Evelina Fedorenko, who studies the computations and brain regions that underlie language processing, to see if her team might be interested in including her in their research.

“EG didn’t know about her missing temporal lobe until age 25, when she had a brain scan for an unrelated reason,” says Fedorenko, the Frederick A. (1971) and Carole J. Middleton Career Development Associate Professor of Neuroscience at MIT. “As with many cases of early brain damage, she had no linguistic or cognitive deficits, but brains like hers are invaluable for understanding how cognitive functions reorganize in the tissue that remains.”

“I told her we definitely wanted to study her brain.” – Ev Fedorenko

Previous studies have shown that language processing relies on an interconnected network of frontal and temporal regions in the left hemisphere of the brain. EG’s unique brain presented an opportunity for Fedorenko’s team to explore how language develops in the absence of the temporal part of these core language regions.

Greta Tuckute, a graduate student in the Fedorenko lab, is the first author of the Neuropsychologia study. Photo: Caitlin Cunningham

Their results appeared recently in the journal Neuropsychologia. They found, for the first time, that temporal language regions appear to be critical for the emergence of frontal language regions in the same hemisphere — meaning, without a left temporal lobe, EG’s intact frontal lobe did not develop a capacity for language.

They also reveal much more: EG’s language system resides happily in her right hemisphere. “Our findings provide both visual and statistical proof of the brain’s remarkable plasticity, its ability to reorganize, in the face of extensive early damage,” says Greta Tuckute, a graduate student in the Fedorenko lab and first author of the paper.

In an introduction to the study, EG herself puts the social implications of the findings starkly. “Please do not call my brain abnormal, that creeps me out,” she says. “My brain is atypical. If not for accidentally finding these differences, no one would pick me out of a crowd as likely to have these, or any other differences that make me unique.”

How we process language

The frontal and temporal lobes are part of the cerebrum, the largest part of the brain. The cerebrum controls many functions, including the five senses, language, working memory, personality, movement, learning, and reasoning. It is divided into two hemispheres, the left and the right, by a deep longitudinal fissure. The two hemispheres communicate via a thick bundle of nerve fibers called the corpus callosum. Each hemisphere comprises four main lobes—frontal, parietal, temporal, and occipital. Core parts of the language network reside in the frontal and temporal lobes.

Core parts of the language network (shown in teal) reside in the left frontal and temporal lobes. Image: Ev Fedorenko

In most individuals, the language system develops in both the right and left hemispheres, with the left side dominant from an early age. The frontal lobe develops more slowly than the temporal lobe. Together, the interconnected frontal and temporal language areas enable us to understand and produce words, phrases, and sentences.

How, then, did EG, with no left temporal lobe, come to speak, comprehend, and remember verbal information (even a foreign language!) with such proficiency?

Simply put, the right hemisphere took over: “EG has a completely well-functioning neurotypical-like language system in her right hemisphere,” says Tuckute. “It is incredible that a person can use a single hemisphere—and the right hemisphere at that, which in most people is not the dominant hemisphere where language is processed—and be perfectly fine.”

Journey into EG’s brain

In the study, the researchers conducted two scans of EG’s brain using functional magnetic resonance imaging (fMRI), one in 2016 and one in 2019, and had her complete a range of behavioral tests. fMRI measures the level of blood oxygenation across the brain and can be used to make inferences about where neural activity is taking place. The researchers also scanned the brains of 151 “neurotypical” people. The large number of participants, combined with robust task paradigms and rigorous statistical analyses, made it possible to draw conclusions from a single case such as EG.

Magnetic resonance image of EG’s brain showing missing left temporal lobe. Image: Fedorenko Lab

Fedorenko is a staunch advocate of the single case study approach—common in medicine but not currently in neuroscience. “Unusual brains—and unusual individuals more broadly—can provide critical insights into brain organization and function that we simply cannot gain by looking at more typical brains.” Studying individual brains with fMRI, however, requires paradigms that work robustly at the single-brain level. This is not true of most paradigms used in the field, which require averaging many brains together to obtain an effect. Developing individual-level fMRI paradigms for language research has been the focus of Fedorenko’s early work, although the main reason for doing so had nothing to do with studying atypical brains: individual-level analyses are simply better—they are more sensitive and their results are more interpretable and meaningful.

“Looking at high-quality data in an individual participant versus looking at a group-level map is akin to using a high-precision microscope versus looking with a naked myopic eye, when all you see is a blur,” she wrote in an article published in Current Opinion in Behavioral Sciences in 2021. Having developed and validated such paradigms, though, is now allowing Fedorenko and her group to probe interesting brains.

While in the scanner, each participant performed a task that Fedorenko began developing more than a decade ago. They were presented with a series of words that form real, meaningful sentences, and with a series of “nonwords”—strings of letters that are pronounceable but without meaning. In typical brains, language areas respond more strongly when participants read sentences compared to when they read nonword sequences.

Similarly, in response to the real sentences, the language regions in EG’s right frontal and temporal lobes lit up—they were bursting with activity—while the left frontal lobe regions remained silent. In the neurotypical participants, the language regions in both the left and right frontal and temporal lobes lit up, with the left areas outshining the right.

fMRI showing EG’s language activation on the brain surface. The right frontal lobe shows robust activations, while the left frontal lobe does not have any language responsive areas. Image: Fedorenko lab

“EG showed a very strong response in the right temporal and frontal regions that process language,” says Tuckute. “And if you look at the controls, whose language dominant hemisphere is in the left, EG’s response in her right hemisphere was similar—or even higher—compared to theirs, just on the opposite side.”

Leaving no stone unturned, the researchers next asked whether the lack of language responses in EG’s left frontal lobe might be due to a general lack of response to cognitive tasks rather than just to language. So they conducted a non-language, working-memory task: they had EG and the neurotypical participants perform arithmetic addition problems while in the scanner. In typical brains, this task elicits responses in frontal and parietal areas in both hemispheres.

Not only did regions of EG’s right frontal lobe light up in response to the task, those in her left frontal lobe did, too. “Both EG’s language-dominant (right) hemisphere, and her non-language-dominant (left) hemisphere showed robust responses to this working-memory task,” says Tuckute. “So, yes, there’s definitely cognitive processing going on there. This selective lack of language responses in EG’s left frontal lobe led us to conclude that, for language, you need the temporal language region to ‘wire up’ the frontal language region.”

Next steps

In science, the answer to one question opens the door to untold more. “In EG, language took over a large chunk of the right frontal and temporal lobes,” says Fedorenko. “So what happens to the functions that in neurotypical individuals generally live in the right hemisphere?”

Many of those, she says, are social functions. The team has already tested EG on social tasks and is currently exploring how those social functions cohabit with the language ones in her right hemisphere. How can they all fit? Do some of the social functions have to migrate to other parts of the brain? They are also working with EG’s family: they have now scanned EG’s three siblings (one of whom is missing most of her right temporal lobe; the other two are neurotypical) and her father (also neurotypical).

The “Interesting Brains Project” website details current projects, findings, and ways to participate.

The project has now grown to include many other individuals with interesting brains, who contacted Fedorenko after some of this work was covered by news outlets. It promises to provide unique insights into how our plastic brains reorganize and adapt to various circumstances.


New collaboration aims to strengthen orthotic and prosthetic care in Sierra Leone

MIT’s K. Lisa Yang Center for Bionics has entered into a collaboration with the Government of Sierra Leone to strengthen the capabilities and services of that country’s orthotic and prosthetic (O&P) sector. Tens of thousands of people in Sierra Leone are in need of orthotic braces and artificial limbs, but access to such specialized medical care in this African nation is limited.

The agreement, reached between MIT’s K. Lisa Yang Center for Bionics and Sierra Leone’s Ministry of Health and Sanitation (MoHS), lays out a detailed memorandum of understanding that will begin as a four-year program. The collaborators aim to strengthen Sierra Leone’s O&P sector through six key objectives: data collection and clinic operations, education, supply chain, infrastructure, new technologies, and mobile delivery of services.

Project Objectives

  1. Data Collection and Clinic Operations: collect comprehensive data on epidemiology, need, utilization, and access for O&P services across the country
  2. Education: create an inclusive education and training program for the people of Sierra Leone, to enable sustainable and independent operation of O&P services
  3. Supply Chain: establish supply chains for prosthetic and orthotic components, parts, and materials for fabrication of devices
  4. Infrastructure: prepare infrastructure (e.g., physical space, sufficient water, power and internet) to support increased production and services
  5. New Technologies: develop and translate innovative technologies with potential to improve O&P clinic operations and management, patient mobility, and the design or fabrication of devices
  6. Mobile Delivery: support outreach services and mobile delivery of care for patients in rural and difficult-to-reach areas

Working together, MIT’s bionics center and Sierra Leone’s MoHS aim to sustainably double the production and distribution of O&P services at Sierra Leone’s National Rehabilitation Centre and Bo Clinics over the next four years.

The team of MIT scientists who will be implementing this novel collaboration is led by Hugh Herr, MIT Professor of Media Arts and Sciences. Herr, himself a double amputee, serves as co-director of the K. Lisa Yang Center for Bionics, and heads the renowned Biomechatronics research group at the MIT Media Lab.

“From educational services, to supply chain, to new technology, this important MOU with the government of Sierra Leone will enable the Center to develop a broad, integrative approach to the orthotic and prosthetic sector within Sierra Leone, strengthening services and restoring much needed care to its citizens,” notes Professor Herr.

Sierra Leone’s Honorable Minister of Health Dr. Austin Demby also states: “As the Ministry of Health and Sanitation continues to galvanize efforts towards the attainment of Universal Health Coverage through the life stages approach, this collaboration will foster access, innovation and capacity building in the Orthotic and Prosthetic division. The ministry is pleased to work with and learn from MIT over the next four years in building resilient health systems, especially for vulnerable groups.”

“Our team at MIT brings together expertise across disciplines from global health systems to engineering and design,” added Francesca Riccio-Ackerman, the graduate student lead for the MIT Sierra Leone project. “This allows us to craft an innovative strategy with Sierra Leone’s Ministry of Health and Sanitation. Together we aim to improve available orthotic and prosthetic care for people with disabilities.”

The K. Lisa Yang Center for Bionics at the Massachusetts Institute of Technology pioneers transformational bionic interventions across a broad range of conditions affecting the body and mind. Based on fundamental scientific principles, the Center seeks to develop neural and mechanical interfaces for human-machine communications; integrate these interfaces into novel bionic platforms; perform clinical trials to accelerate the deployment of bionic products by the private sector; and leverage novel and durable, but affordable, materials and manufacturing processes to ensure equitable access to the latest bionic technology by all impacted individuals, especially those in developing countries. 

Sierra Leone’s Ministry of Health and Sanitation is responsible for health service delivery across the country, as well as regulation of the health sector to meet the health needs of its citizenry. 

For more information about this project, please visit: https://mitmedialab.info/prosforallproj2


How Huntington’s disease affects different neurons

In patients with Huntington’s disease, neurons in a part of the brain called the striatum are among the hardest-hit. Degeneration of these neurons contributes to patients’ loss of motor control, which is one of the major hallmarks of the disease.

Neuroscientists at MIT have now shown that two distinct cell populations in the striatum are affected differently by Huntington’s disease. They believe that neurodegeneration of one of these populations leads to motor impairments, while damage to the other population, located in structures called striosomes, may account for the mood disorders that are often seen in the early stages of the disease.

“As many as 10 years ahead of the motor diagnosis, Huntington’s patients can experience mood disorders, and one possibility is that the striosomes might be involved in these,” says Ann Graybiel, an MIT Institute Professor, a member of MIT’s McGovern Institute for Brain Research, and one of the senior authors of the study.

Using single-cell RNA sequencing to analyze the genes expressed in mouse models of Huntington’s disease and postmortem brain samples from Huntington’s patients, the researchers found that cells of the striosomes and another structure, the matrix, begin to lose their distinguishing features as the disease progresses. The researchers hope that their mapping of the striatum and how it is affected by Huntington’s could help lead to new treatments that target specific cells within the brain.

This kind of analysis could also shed light on other brain disorders that affect the striatum, such as Parkinson’s disease and autism spectrum disorder, the researchers say.

Myriam Heiman, an associate professor in MIT’s Department of Brain and Cognitive Sciences and a member of the Picower Institute for Learning and Memory, and Manolis Kellis, a professor of computer science in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and a member of the Broad Institute of MIT and Harvard, are also senior authors of the study. Ayano Matsushima, a McGovern Institute research scientist, and Sergio Sebastian Pineda, an MIT graduate student, are the lead authors of the paper, which appears in Nature Communications.

Neuron vulnerability

Huntington’s disease leads to degeneration of brain structures called the basal ganglia, which are responsible for control of movement and also play roles in other behaviors, as well as emotions. For many years, Graybiel has been studying the striatum, a part of the basal ganglia that is involved in making decisions that require evaluating the outcomes of a particular action.

Many years ago, Graybiel discovered that the striatum is divided into striosomes, which are clusters of neurons, and the matrix, which surrounds the striosomes. She has also shown that striosomes are necessary for making decisions that require an anxiety-provoking cost-benefit analysis.

In a 2007 study, Richard Faull of the University of Auckland discovered that in postmortem brain tissue from Huntington’s patients, the striosomes showed a great deal of degeneration. Faull also found that while those patients were alive, many of them had shown signs of mood disorders such as depression before their motor symptoms developed.

To further explore the connections between the striatum and the mood and motor effects of Huntington’s, Graybiel teamed up with Kellis and Heiman to study the gene expression patterns of striosomal and matrix cells. To do that, the researchers used single-cell RNA sequencing to analyze human brain samples and brain tissue from two mouse models of Huntington’s disease.

Within the striatum, neurons can be classified as either D1 or D2 neurons. D1 neurons are involved in the “go” pathway, which initiates an action, and D2 neurons are part of the “no-go” pathway, which suppresses an action. D1 and D2 neurons can be found within both the striosomes and the matrix.

The analysis of RNA expression in each of these types of cells revealed that striosomal neurons are harder hit by Huntington’s than matrix neurons. Furthermore, within the striosomes, D2 neurons are more vulnerable than D1.

The researchers also found that these four major cell types begin to lose their distinguishing molecular features and become more difficult to tell apart in Huntington’s disease. “Overall, the distinction between striosomes and matrix becomes really blurry,” Graybiel says.

Striosomal disorders

The findings suggest that damage to the striosomes, which are known to be involved in regulating mood, may be responsible for the mood disorders that strike Huntington’s patients in the early stages of the disease. Later on, degeneration of the matrix neurons likely contributes to the decline of motor function, the researchers say.

In future work, the researchers hope to explore how degeneration or abnormal gene expression in the striosomes may contribute to other brain disorders.

Previous research has shown that overactivity of striosomes can lead to the development of repetitive behaviors such as those seen in autism, obsessive compulsive disorder, and Tourette’s syndrome. In this study, at least one of the genes that the researchers found to be overexpressed in the striosomes of Huntington’s brains is also linked to autism.

Additionally, many striosome neurons project to the part of the brain that is most affected by Parkinson’s disease (the substantia nigra, which produces most of the brain’s dopamine).

“There are many, many disorders that probably involve the striatum, and now, partly through transcriptomics, we’re working to understand how all of this could fit together,” Graybiel says.

The research was funded by the Saks Kavanaugh Foundation, the CHDI Foundation, the National Institutes of Health, the Nancy Lurie Marks Family Foundation, the Simons Foundation, the JPB Foundation, the Kristin R. Pressman and Jessica J. Pourian ’13 Fund, and Robert Buxton.

Self-assembling proteins can store cellular “memories”

As cells perform their everyday functions, they turn on a variety of genes and cellular pathways. MIT engineers have now coaxed cells to inscribe the history of these events in a long protein chain that can be imaged using a light microscope.

Cells programmed to produce these chains continuously add building blocks that encode particular cellular events. Later, the ordered protein chains can be labeled with fluorescent molecules and read under a microscope, allowing researchers to reconstruct the timing of the events.

This technique could help shed light on the steps that underlie processes such as memory formation, response to drug treatment, and gene expression.

“There are a lot of changes that happen at organ or body scale, over hours to weeks, which cannot be tracked over time,” says Edward Boyden, the Y. Eva Tan Professor in Neurotechnology, a professor of biological engineering and brain and cognitive sciences at MIT, a Howard Hughes Medical Institute investigator, and a member of MIT’s McGovern Institute for Brain Research and Koch Institute for Integrative Cancer Research.

If the technique could be extended to work over longer time periods, it could also be used to study processes such as aging and disease progression, the researchers say.

Boyden is the senior author of the study, which appears today in Nature Biotechnology. Changyang Linghu, a former J. Douglas Tan Postdoctoral Fellow at the McGovern Institute, who is now an assistant professor at the University of Michigan, is the lead author of the paper.

Cellular history

Biological systems such as organs contain many different kinds of cells, all of which have distinctive functions. One way to study these functions is to image proteins, RNA, or other molecules inside the cells, which provide hints to what the cells are doing. However, most methods for doing this offer only a glimpse of a single moment in time, or don’t work well with very large populations of cells.

“Biological systems are often composed of a large number of different types of cells. For example, the human brain has 86 billion neurons,” Linghu says. “To understand those kinds of biological systems, we need to observe physiological events over time in these large cell populations.”

To achieve that, the research team came up with the idea of recording cellular events as a series of protein subunits that are continuously added to a chain. To create their chains, the researchers used engineered protein subunits, not normally found in living cells, that can self-assemble into long filaments.

The researchers designed a genetically encoded system in which one of these subunits is continuously produced inside cells, while the other is generated only when a specific event occurs. Each subunit also contains a very short peptide called an epitope tag — in this case, the researchers chose tags called HA and V5. Each of these tags can bind to a different fluorescent antibody, making it easy to visualize the tags later on and determine the sequence of the protein subunits.

For this study, the researchers made production of the V5-containing subunit contingent on the activation of a gene called c-fos, which is involved in encoding new memories. HA-tagged subunits make up most of the chain, but whenever the V5 tag shows up in the chain, that means that c-fos was activated during that time.
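A toy simulation can make the encoding scheme easier to picture. This sketch grows a chain in software, with made-up rates and timing rather than the paper’s actual kinetics, and then reads the event history back out:

```python
def grow_chain(hours, cfos_active):
    """One constitutive 'HA' subunit per hour, plus a 'V5' subunit
    during hours when the c-fos reporter is active."""
    chain = []
    for t in range(hours):
        chain.append("HA")      # always-on subunit
        if cfos_active(t):
            chain.append("V5")  # event-driven subunit
    return chain

# Hypothetical experiment: c-fos is active from hour 24 to hour 30.
chain = grow_chain(72, cfos_active=lambda t: 24 <= t < 30)

# "Reading" the chain in order recovers roughly when the event occurred.
v5_positions = [i for i, tag in enumerate(chain) if tag == "V5"]
print(v5_positions)  # V5 tags cluster about a third of the way along the chain
```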

“We’re hoping to use this kind of protein self-assembly to record activity in every single cell,” Linghu says. “It’s not only a snapshot in time, but also records past history, just like how tree rings can permanently store information over time as the wood grows.”

Recording events

In this study, the researchers first used their system to record activation of c-fos in neurons growing in a lab dish. The c-fos gene was turned on by chemically stimulating the neurons, which caused the V5 subunit to be added to the protein chain.

To explore whether this approach could work in the brains of animals, the researchers programmed brain cells of mice to generate protein chains that would reveal when the animals were exposed to a particular drug. Later, the researchers were able to detect that exposure by preserving the tissue and analyzing it with a light microscope.

The researchers designed their system to be modular, so that different epitope tags can be swapped in, or different types of cellular events can be detected, including, in principle, cell division or activation of enzymes called protein kinases, which help control many cellular pathways.

The researchers also hope to extend the recording period that they can achieve. In this study, they recorded events for several days before imaging the tissue. There is a tradeoff between the amount of time that can be recorded and the time resolution, or frequency of event recording, because the length of the protein chain is limited by the size of the cell.

“The total amount of information it could store is fixed, but we could in principle slow down or increase the speed of the growth of the chain,” Linghu says. “If we want to record for a longer time, we could slow down the synthesis so that it will reach the size of the cell within, let’s say two weeks. In that way we could record longer, but with less time resolution.”
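The tradeoff Linghu describes is simple arithmetic. A back-of-the-envelope version, with an assumed chain capacity rather than a measured one:

```python
# Illustrative numbers only: with a fixed chain capacity per cell,
# recording longer forces a slower addition rate (coarser resolution).
max_subunits = 1000  # assumed number of subunits one cell can hold

for days in (2, 14):
    hours = days * 24
    rate = max_subunits / hours  # subunits added per hour
    print(f"{days:2d} days -> {rate:5.1f} subunits/hour")

#  2 days ->  20.8 subunits/hour  (finer time resolution)
# 14 days ->   3.0 subunits/hour  (longer record, coarser resolution)
```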

The researchers are also working on engineering the system so that it can record multiple types of events in the same chain, by increasing the number of different subunits that can be incorporated.

The research was funded by the Hock E. Tan and K. Lisa Yang Center for Autism Research, John Doerr, the National Institutes of Health, the National Science Foundation, the U.S. Army Research Office, and the Howard Hughes Medical Institute.

New sensor uses MRI to detect light deep in the brain

Using a specialized MRI sensor, MIT researchers have shown that they can detect light deep within tissues such as the brain.

Imaging light in deep tissues is extremely difficult because as light travels into tissue, much of it is either absorbed or scattered. The MIT team overcame that obstacle by designing a sensor that converts light into a magnetic signal that can be detected by MRI (magnetic resonance imaging).

This type of sensor could be used to map light emitted by optical fibers implanted in the brain, such as the fibers used to stimulate neurons during optogenetic experiments. With further development, it could also prove useful for monitoring patients who receive light-based therapies for cancer, the researchers say.

“We can image the distribution of light in tissue, and that’s important because people who use light to stimulate tissue or to measure from tissue often don’t quite know where the light is going, where they’re stimulating, or where the light is coming from. Our tool can be used to address those unknowns,” says Alan Jasanoff, an MIT professor of biological engineering, brain and cognitive sciences, and nuclear science and engineering.

Jasanoff, who is also an associate investigator at MIT’s McGovern Institute for Brain Research, is the senior author of the study, which appears today in Nature Biomedical Engineering. Jacob Simon PhD ’21 and MIT postdoc Miriam Schwalm are the paper’s lead authors, and Johannes Morstein and Dirk Trauner of New York University are also authors of the paper.

A light-sensitive probe

Scientists have been using light to study living cells for hundreds of years, dating back to the late 1500s, when the light microscope was invented. This kind of microscopy allows researchers to peer inside cells and thin slices of tissue, but not deep inside an organism.

“One of the persistent problems in using light, especially in the life sciences, is that it doesn’t do a very good job penetrating many materials,” Jasanoff says. “Biological materials absorb light and scatter light, and the combination of those things prevents us from using most types of optical imaging for anything that involves focusing in deep tissue.”

To overcome that limitation, Jasanoff and his students decided to design a sensor that could transform light into a magnetic signal.

“We wanted to create a magnetic sensor that responds to light locally, and therefore is not subject to absorbance or scattering. Then this light detector can be imaged using MRI,” he says.

Jasanoff’s lab has previously developed MRI probes that can interact with a variety of molecules in the brain, including dopamine and calcium. When these probes bind to their targets, it affects the sensors’ magnetic interactions with the surrounding tissue, dimming or brightening the MRI signal.

To make a light-sensitive MRI probe, the researchers decided to encase magnetic particles in a nanoparticle called a liposome. The liposomes used in this study are made from specialized light-sensitive lipids that Trauner had previously developed. When these lipids are exposed to a certain wavelength of light, the liposomes become more permeable to water, or “leaky.” This allows the magnetic particles inside to interact with water and generate a signal detectable by MRI.

The particles, which the researchers called liposomal nanoparticle reporters (LisNR), can switch from permeable to impermeable depending on the type of light they’re exposed to. In this study, the researchers created particles that become leaky when exposed to ultraviolet light, and then become impermeable again when exposed to blue light. The researchers also showed that the particles could respond to other wavelengths of light.

“This paper shows a novel sensor to enable photon detection with MRI through the brain. This illuminating work introduces a new avenue to bridge photon and proton-driven neuroimaging studies,” says Xin Yu, an assistant professor of radiology at Harvard Medical School, who was not involved in the study.

Mapping light

The researchers tested the sensors in the brains of rats — specifically, in a part of the brain called the striatum, which is involved in planning movement and responding to reward. After injecting the particles throughout the striatum, the researchers were able to map the distribution of light from an optical fiber implanted nearby.

The fiber they used is similar to those used for optogenetic stimulation, so this kind of sensing could be useful to researchers who perform optogenetic experiments in the brain, Jasanoff says.

“We don’t expect that everybody doing optogenetics will use this for every experiment — it’s more something that you would do once in a while, to see whether a paradigm that you’re using is really producing the profile of light that you think it should be,” Jasanoff says.

In the future, this type of sensor could also be useful for monitoring patients receiving treatments that involve light, such as photodynamic therapy, which uses light from a laser or LED to kill cancer cells.

The researchers are now working on similar probes that could be used to detect light emitted by luciferases, a family of glowing proteins that are often used in biological experiments. These proteins can be used to reveal whether a particular gene is activated or not, but currently they can only be imaged in superficial tissue or cells grown in a lab dish.

Jasanoff also hopes to use the strategy used for the LisNR sensor to design MRI probes that can detect stimuli other than light, such as neurochemicals or other molecules found in the brain.

“We think that the principle that we use to construct these sensors is quite broad and can be used for other purposes too,” he says.

The research was funded by the National Institutes of Health, the G. Harold and Leila Y. Mathers Foundation, a Friends of the McGovern Fellowship from the McGovern Institute for Brain Research, the MIT Neurobiological Engineering Training Program, and a Marie Curie Individual Fellowship from the European Commission.

This is your brain. This is your brain on code

Functional magnetic resonance imaging (fMRI), which measures changes in blood flow throughout the brain, has been used over the past couple of decades for a variety of applications, including “functional anatomy” — a way of determining which brain areas are switched on when a person carries out a particular task. fMRI has been used to look at people’s brains while they’re doing all sorts of things — working out math problems, learning foreign languages, playing chess, improvising on the piano, doing crossword puzzles, and even watching TV shows like “Curb Your Enthusiasm.”

One pursuit that’s received little attention is computer programming — both the chore of writing code and the equally confounding task of trying to understand a piece of already-written code. “Given the importance that computer programs have assumed in our everyday lives,” says Shashank Srikant, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), “that’s surely worth looking into. So many people are dealing with code these days — reading, writing, designing, debugging — but no one really knows what’s going on in their heads when that happens.” Fortunately, he has made some “headway” in that direction in a paper — written with MIT colleagues Benjamin Lipkin (the paper’s other lead author, along with Srikant), Anna Ivanova, Evelina Fedorenko, and Una-May O’Reilly — that was presented earlier this month at the Neural Information Processing Systems Conference held in New Orleans.

The new paper built on a 2020 study, written by many of the same authors, which used fMRI to monitor the brains of programmers as they “comprehended” small pieces, or snippets, of code. (Comprehension, in this case, means looking at a snippet and correctly determining the result of the computation performed by the snippet.) The 2020 work showed that code comprehension did not consistently activate the language system, brain regions that handle language processing, explains Fedorenko, a brain and cognitive sciences (BCS) professor and a coauthor of the earlier study. “Instead, the multiple demand network — a brain system that is linked to general reasoning and supports domains like mathematical and logical thinking — was strongly active.” The current work, which also utilizes MRI scans of programmers, takes “a deeper dive,” she says, seeking to obtain more fine-grained information.

Whereas the previous study looked at 20 to 30 people to determine which brain systems, on average, are relied upon to comprehend code, the new research looks at the brain activity of individual programmers as they process specific elements of a computer program. Suppose, for instance, that there’s a one-line piece of code that involves word manipulation and a separate piece of code that entails a mathematical operation. “Can I go from the activity we see in the brains, the actual brain signals, to try to reverse-engineer and figure out what, specifically, the programmer was looking at?” Srikant asks. “This would reveal what information pertaining to programs is uniquely encoded in our brains.” To neuroscientists, he notes, a physical property is considered “encoded” if they can infer that property by looking at someone’s brain signals.

Take, for instance, a loop — an instruction within a program to repeat a specific operation until the desired result is achieved — or a branch, a different type of programming instruction that can cause the computer to switch from one operation to another. Based on the patterns of brain activity that were observed, the group could tell whether someone was evaluating a piece of code involving a loop or a branch. The researchers could also tell whether the code related to words or mathematical symbols, and whether someone was reading actual code or merely a written description of that code.
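In practice, “could tell” means a decoder trained on brain activity patterns predicts the code property at above-chance accuracy on held-out trials. A schematic version with synthetic data, not the study’s actual pipeline, might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 500
X = rng.standard_normal((n_trials, n_voxels))  # stand-in fMRI activity patterns
y = rng.integers(0, 2, n_trials)               # 0 = loop trial, 1 = branch trial
X[y == 1, :20] += 0.5                          # inject a weak decodable signal

decoder = LogisticRegression(max_iter=1000)
scores = cross_val_score(decoder, X, y, cv=5)  # 5-fold cross-validation
print(scores.mean())  # above-chance accuracy suggests the property is "encoded"
```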

That addressed the first question an investigator might ask: whether something is, in fact, encoded. If the answer is yes, the next question might be: where is it encoded? In the above-cited cases — loops or branches, words or math, code or a description thereof — brain activation levels were found to be comparable in both the language system and the multiple demand network.

A noticeable difference was observed, however, when it came to code properties related to what’s called dynamic analysis.

Programs can have “static” properties — such as the number of numerals in a sequence — that do not change over time. “But programs can also have a dynamic aspect, such as the number of times a loop runs,” Srikant says. “I can’t always read a piece of code and know, in advance, what the run time of that program will be.” The MIT researchers found that for dynamic analysis, information is encoded much better in the multiple demand network than it is in the language processing center. That finding was one clue in their quest to see how code comprehension is distributed throughout the brain — which parts are involved and which ones assume a bigger role in certain aspects of that task.

The team carried out a second set of experiments, which incorporated machine learning models called neural networks that were specifically trained on computer programs. These models have been successful, in recent years, in helping programmers complete pieces of code. What the group wanted to find out was whether the brain signals seen in their study when participants were examining pieces of code resembled the patterns of activation observed when neural networks analyzed the same piece of code. And the answer they arrived at was a qualified yes.

“If you put a piece of code into the neural network, it produces a list of numbers that tells you, in some way, what the program is all about,” Srikant says. Brain scans of people studying computer programs similarly produce a list of numbers. When a program is dominated by branching, for example, “you see a distinct pattern of brain activity,” he adds, “and you see a similar pattern when the machine learning model tries to understand that same snippet.”
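One standard way to compare the two “lists of numbers” is representational similarity analysis: checking whether snippets that look alike to the model also evoke similar brain patterns. The sketch below uses synthetic vectors and a simple correlation; the study’s actual models and alignment method are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_snippets = 50
model_emb = rng.standard_normal((n_snippets, 768))   # model's vector per snippet
brain = model_emb @ rng.standard_normal((768, 300))  # toy linear brain response
brain += 0.1 * rng.standard_normal(brain.shape)      # plus measurement noise

def similarity_matrix(X):
    """Cosine similarity between every pair of snippets."""
    Xc = X - X.mean(axis=0)
    Xn = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    return Xn @ Xn.T

# Correlate the two similarity structures: high r means the model and the
# brain organize the same snippets in similar ways.
r = np.corrcoef(similarity_matrix(model_emb).ravel(),
                similarity_matrix(brain).ravel())[0, 1]
print(r)
```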

Mariya Toneva of the Max Planck Institute for Software Systems considers findings like this “particularly exciting. They raise the possibility of using computational models of code to better understand what happens in our brains as we read programs,” she says.

The MIT scientists are definitely intrigued by the connections they’ve uncovered, which shed light on how discrete pieces of computer programs are encoded in the brain. But they don’t yet know what these recently gleaned insights can tell us about how people carry out more elaborate plans in the real world. Completing tasks of this sort — such as going to the movies, which requires checking showtimes, arranging for transportation, purchasing tickets, and so forth — could not be handled by a single unit of code and just a single algorithm. Successful execution of such a plan would instead require “composition” — stringing together various snippets and algorithms into a sensible sequence that leads to something new, just like assembling individual bars of music in order to make a song or even a symphony. Creating models of code composition, says O’Reilly, a principal research scientist at CSAIL, “is beyond our grasp at the moment.”

Lipkin, a BCS PhD student, considers this the next logical step — figuring out how to “combine simple operations to build complex programs and use those strategies to effectively address general reasoning tasks.” He further believes that some of the progress toward that goal achieved by the team so far owes to its interdisciplinary makeup. “We were able to draw from individual experiences with program analysis and neural signal processing, as well as combined work on machine learning and natural language processing,” Lipkin says. “These types of collaborations are becoming increasingly common as neuro- and computer scientists join forces on the quest towards understanding and building general intelligence.”

This project was funded by grants from the MIT-IBM Watson AI lab, MIT Quest Initiative, National Science Foundation, National Institutes of Health, McGovern Institute for Brain Research, MIT Department of Brain and Cognitive Sciences, and the Simons Center for the Social Brain.

Season’s Greetings from the McGovern Institute

This year’s holiday video was inspired by Ev Fedorenko’s July 2022 Nature Neuroscience paper, which found similar patterns of brain activation and language selectivity across speakers of 45 different languages.

Universal language network

Ev Fedorenko uses the widely translated book “Alice in Wonderland” to test brain responses to different languages. Photo: Caitlin Cunningham

Over several decades, neuroscientists have created a well-defined map of the brain’s “language network,” or the regions of the brain that are specialized for processing language. Found primarily in the left hemisphere, this network includes regions within Broca’s area, as well as in other parts of the frontal and temporal lobes. Although roughly 7,000 languages are currently spoken and signed across the globe, the vast majority of those mapping studies have been done in English speakers as they listened to or read English texts.

To truly understand the cognitive and neural mechanisms that allow us to learn and process such diverse languages, Fedorenko and her team scanned the brains of speakers of 45 different languages while they listened to Alice in Wonderland in their native language. The results show that the speakers’ language networks appear to be essentially the same as those of native English speakers — which suggests that the location and key properties of the language network appear to be universal.

The many languages of McGovern

English may be the primary language used by McGovern researchers, but more than 35 other languages are spoken by scientists and engineers at the McGovern Institute. Our holiday video features 30 of these researchers saying Happy New Year in their native (or learned) languages.

Brains on conlangs

For a few days in November, the McGovern Institute hummed with invented languages. Strangers greeted one another in Esperanto; trivia games were played in High Valyrian; Klingon and Na’vi were heard inside MRI scanners. Creators and users of these constructed languages (conlangs) had gathered at MIT in the name of neuroscience. McGovern Institute investigator Evelina Fedorenko and her team wanted to know what happened in their brains when they heard and understood these “foreign” tongues.

The constructed languages spoken by attendees had all been created for specific purposes. Most, like the Na’vi language spoken in the movie Avatar, had given identity and voice to the inhabitants of fictional worlds, while Esperanto was created to reduce barriers to international communication. But despite their distinct origins, a familiar pattern of activity emerged when researchers scanned speakers’ brains. The brain, they found, processes constructed languages with the same network of areas it uses for languages that evolved naturally over thousands of years.

The meaning of language

“There’s all these things that people call language,” Fedorenko says. “Music is a kind of language and math is a kind of language.” But the brain processes these metaphorical languages differently than it does the languages humans use to communicate broadly about the world. To neuroscientists like Fedorenko, they can’t legitimately be considered languages at all. In contrast, she says, “these constructed languages seem really quite like natural languages.”

The “Brains on Conlangs” event that Fedorenko’s team hosted was part of its ongoing effort to understand the way language is generated and understood by the brain. Her lab and others have identified specific brain regions involved in linguistic processing, but it’s not yet clear how universal the language network is. Most studies of language cognition have focused on languages widely spoken in well-resourced parts of the world—primarily English, German, and Dutch. There are thousands of languages—spoken or signed—that have not been included.

Brain activation in a Klingon speaker while listening to English (left) and Klingon (right). Image: Saima Malik-Moraleda

Fedorenko and her team are deliberately taking a broader approach. “If we’re making claims about language as a whole, it’s kind of weird to make it based on a handful of languages,” she says. “So we’re trying to create tools and collect some data on as many languages as possible.”

So far, they have found that the language networks used by native speakers of dozens of different languages do share key architectural similarities. And by including a more diverse set of languages in their research, Fedorenko and her team can begin to explore how the brain makes sense of linguistic features that are not part of English or other well-studied languages. The Brains on Conlangs event was a chance to expand their studies even further.

Connecting conlangs

Nearly 50 speakers of Esperanto, Klingon, High Valyrian, Dothraki, and Na’vi attended Brains on Conlangs, drawn by the opportunity to connect with other speakers, hear from language creators, and contribute to the science. Graduate student Saima Malik-Moraleda and postbac research assistant Maya Taliaferro, along with other members of both the Fedorenko lab and brain and cognitive sciences professor Ted Gibson’s lab, and with help from Steve Shannon, Operations Manager of the Martinos Imaging Center, worked tirelessly to collect data from all participants. Two MRI scanners ran nearly continuously as speakers listened to passages in their chosen languages and researchers captured images of the brain’s response. To enable the research team to find the language-specific network in each person’s brain, participants also performed other tasks inside the scanner, including a memory task and listening to muffled audio in which the constructed languages were spoken, but unintelligible. They performed language tasks in English, as well.

To understand how the brain processes constructed languages (conlangs), McGovern Investigator Ev Fedorenko (center) gathered with conlang creators/speakers Marc Okrand (Klingon), Paul Frommer (Na’vi), Damian Blasi, Jessie Sams (méníshè), David Peterson (High Valyrian and Dothraki) and Arika Okrent at the McGovern Institute for the “Brains on Conlangs” event in November 2022. Photo: Elise Malvicini

Prior to the study, Fedorenko says, she had suspected constructed languages would activate the brain’s natural language-processing network, but she couldn’t be sure. Another possibility was that languages like Klingon and Esperanto would be handled instead by a problem-solving network known to be used when people work with some other so-called “languages,” like mathematics or computer programming. But once the data was in, the answer was clear. The five constructed languages included in the study all activated the brain’s language network.

That makes sense, Fedorenko says, because like natural languages, constructed languages enable people to communicate by associating words or signs with objects and ideas. Any language is essentially a way of mapping forms to meanings, she says. “You can construe it as a set of memories of how a particular sequence of sounds corresponds to some meaning. You’re learning meanings of words and constructions, and how to put them together to get more complex meanings. And it seems like the brain’s language system is very well suited for that set of computations.”