This is your brain. This is your brain on code

Functional magnetic resonance imaging (fMRI), which measures changes in blood flow throughout the brain, has been used over the past couple of decades for a variety of applications, including “functional anatomy” — a way of determining which brain areas are switched on when a person carries out a particular task. fMRI has been used to look at people’s brains while they’re doing all sorts of things — working out math problems, learning foreign languages, playing chess, improvising on the piano, doing crossword puzzles, and even watching TV shows like “Curb Your Enthusiasm.”

One pursuit that’s received little attention is computer programming — both the chore of writing code and the equally confounding task of trying to understand a piece of already-written code. “Given the importance that computer programs have assumed in our everyday lives,” says Shashank Srikant, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), “that’s surely worth looking into. So many people are dealing with code these days — reading, writing, designing, debugging — but no one really knows what’s going on in their heads when that happens.” Fortunately, he has made some “headway” in that direction in a paper — written with MIT colleagues Benjamin Lipkin (the paper’s other lead author, along with Srikant), Anna Ivanova, Evelina Fedorenko, and Una-May O’Reilly — that was presented earlier this month at the Neural Information Processing Systems Conference held in New Orleans.

The new paper built on a 2020 study, written by many of the same authors, which used fMRI to monitor the brains of programmers as they “comprehended” small pieces, or snippets, of code. (Comprehension, in this case, means looking at a snippet and correctly determining the result of the computation performed by the snippet.) The 2020 work showed that code comprehension did not consistently activate the language system, brain regions that handle language processing, explains Fedorenko, a brain and cognitive sciences (BCS) professor and a coauthor of the earlier study. “Instead, the multiple demand network — a brain system that is linked to general reasoning and supports domains like mathematical and logical thinking — was strongly active.” The current work, which also utilizes MRI scans of programmers, takes “a deeper dive,” she says, seeking to obtain more fine-grained information.

Whereas the previous study looked at 20 to 30 people to determine which brain systems, on average, are relied upon to comprehend code, the new research looks at the brain activity of individual programmers as they process specific elements of a computer program. Suppose, for instance, that there’s a one-line piece of code that involves word manipulation and a separate piece of code that entails a mathematical operation. “Can I go from the activity we see in the brains, the actual brain signals, to try to reverse-engineer and figure out what, specifically, the programmer was looking at?” Srikant asks. “This would reveal what information pertaining to programs is uniquely encoded in our brains.” To neuroscientists, he notes, a physical property is considered “encoded” if they can infer that property by looking at someone’s brain signals.

Take, for instance, a loop — an instruction within a program to repeat a specific operation until the desired result is achieved — or a branch, a different type of programming instruction that can cause the computer to switch from one operation to another. Based on the patterns of brain activity that were observed, the group could tell whether someone was evaluating a piece of code involving a loop or a branch. The researchers could also tell whether the code related to words or mathematical symbols, and whether someone was reading actual code or merely a written description of that code.
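To make the distinction concrete, here is a minimal Python sketch of the two constructs. The functions `sum_to` and `describe` are hypothetical examples written for this illustration; they are not the snippets shown to participants in the study.

```python
def sum_to(n):
    """Loop: repeat the same addition until every number up to n is included."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def describe(n):
    """Branch: switch between two operations depending on a condition."""
    if n % 2 == 0:
        return "even"
    return "odd"


print(sum_to(4))    # 10
print(describe(7))  # odd
```

Comprehending either snippet, in the sense used above, means working out what it returns for a given input.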

That addressed a first question that an investigator might ask as to whether something is, in fact, encoded. If the answer is yes, the next question might be: where is it encoded? In the above-cited cases — loops or branches, words or math, code or a description thereof — brain activation levels were found to be comparable in both the language system and the multiple demand network.

A noticeable difference was observed, however, when it came to code properties related to what’s called dynamic analysis.

Programs can have “static” properties — such as the number of numerals in a sequence — that do not change over time. “But programs can also have a dynamic aspect, such as the number of times a loop runs,” Srikant says. “I can’t always read a piece of code and know, in advance, what the run time of that program will be.” The MIT researchers found that for dynamic analysis, information is encoded much better in the multiple demand network than it is in the language processing center. That finding was one clue in their quest to see how code comprehension is distributed throughout the brain — which parts are involved and which ones assume a bigger role in certain aspects of that task.
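The contrast between static and dynamic properties can be sketched in a few lines of Python. Both functions below are hypothetical illustrations, not material from the study: the first property can be read straight off the program text, while the second (how many times a loop runs) is only known once the code is traced or executed.

```python
def count_numerals(sequence):
    """Static property: visible by reading the program text."""
    return len(sequence)


def collatz_steps(n):
    """Dynamic property: the number of loop iterations depends on n
    and is hard to guess without stepping through the loop."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(count_numerals([4, 8, 15]))  # 3
print(collatz_steps(6))            # 8
print(collatz_steps(7))            # 16
```

Neighboring inputs such as 6 and 7 produce very different iteration counts, which is why this kind of property cannot always be predicted in advance from the code alone.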

The team carried out a second set of experiments, which incorporated machine learning models called neural networks that were specifically trained on computer programs. These models have been successful, in recent years, in helping programmers complete pieces of code. What the group wanted to find out was whether the brain signals seen in their study when participants were examining pieces of code resembled the patterns of activation observed when neural networks analyzed the same piece of code. And the answer they arrived at was a qualified yes.

“If you put a piece of code into the neural network, it produces a list of numbers that tells you, in some way, what the program is all about,” Srikant says. Brain scans of people studying computer programs similarly produce a list of numbers. When a program is dominated by branching, for example, “you see a distinct pattern of brain activity,” he adds, “and you see a similar pattern when the machine learning model tries to understand that same snippet.”

Mariya Toneva of the Max Planck Institute for Software Systems considers findings like this “particularly exciting. They raise the possibility of using computational models of code to better understand what happens in our brains as we read programs,” she says.

The MIT scientists are definitely intrigued by the connections they’ve uncovered, which shed light on how discrete pieces of computer programs are encoded in the brain. But they don’t yet know what these recently-gleaned insights can tell us about how people carry out more elaborate plans in the real world. Completing tasks of this sort — such as going to the movies, which requires checking showtimes, arranging for transportation, purchasing tickets, and so forth — could not be handled by a single unit of code and just a single algorithm. Successful execution of such a plan would instead require “composition” — stringing together various snippets and algorithms into a sensible sequence that leads to something new, just like assembling individual bars of music in order to make a song or even a symphony. Creating models of code composition, says O’Reilly, a principal research scientist at CSAIL, “is beyond our grasp at the moment.”

Lipkin, a BCS PhD student, considers this the next logical step — figuring out how to “combine simple operations to build complex programs and use those strategies to effectively address general reasoning tasks.” He further believes that some of the progress toward that goal achieved by the team so far owes to its interdisciplinary makeup. “We were able to draw from individual experiences with program analysis and neural signal processing, as well as combined work on machine learning and natural language processing,” Lipkin says. “These types of collaborations are becoming increasingly common as neuro- and computer scientists join forces on the quest towards understanding and building general intelligence.”

This project was funded by grants from the MIT-IBM Watson AI lab, MIT Quest Initiative, National Science Foundation, National Institutes of Health, McGovern Institute of Brain Research, MIT Department of Brain and Cognitive Sciences, and the Simons Center for the Social Brain.

Brains on conlangs

For a few days in November, the McGovern Institute hummed with invented languages. Strangers greeted one another in Esperanto; trivia games were played in High Valyrian; Klingon and Na’vi were heard inside MRI scanners. Creators and users of these constructed languages (conlangs) had gathered at MIT in the name of neuroscience. McGovern Institute investigator Evelina Fedorenko and her team wanted to know what happened in their brains when they heard and understood these “foreign” tongues.

The constructed languages spoken by attendees had all been created for specific purposes. Most, like the Na’vi language spoken in the movie Avatar, had given identity and voice to the inhabitants of fictional worlds, while Esperanto was created to reduce barriers to international communication. But despite their distinct origins, a familiar pattern of activity emerged when researchers scanned speakers’ brains. The brain, they found, processes constructed languages with the same network of areas it uses for languages that evolved naturally over millions of years.

The meaning of language

“There’s all these things that people call language,” Fedorenko says. “Music is a kind of language and math is a kind of language.” But the brain processes these metaphorical languages differently than it does the languages humans use to communicate broadly about the world. To neuroscientists like Fedorenko, they can’t legitimately be considered languages at all. In contrast, she says, “these constructed languages seem really quite like natural languages.”

The “Brains on Conlangs” event that Fedorenko’s team hosted was part of its ongoing effort to understand the way language is generated and understood by the brain. Her lab and others have identified specific brain regions involved in linguistic processing, but it’s not yet clear how universal the language network is. Most studies of language cognition have focused on languages widely spoken in well-resourced parts of the world—primarily English, German, and Dutch. There are thousands of languages—spoken or signed—that have not been included.

Brain activation in a Klingon speaker while listening to English (left) and Klingon (right). Image: Saima Malik-Moraleda

Fedorenko and her team are deliberately taking a broader approach. “If we’re making claims about language as a whole, it’s kind of weird to make it based on a handful of languages,” she says. “So we’re trying to create tools and collect some data on as many languages as possible.”

So far, they have found that the language networks used by native speakers of dozens of different languages do share key architectural similarities. And by including a more diverse set of languages in their research, Fedorenko and her team can begin to explore how the brain makes sense of linguistic features that are not part of English or other well-studied languages. The Brains on Conlangs event was a chance to expand their studies even further.

Connecting conlangs

Nearly 50 speakers of Esperanto, Klingon, High Valyrian, Dothraki, and Na’vi attended Brains on Conlangs, drawn by the opportunity to connect with other speakers, hear from language creators, and contribute to the science. Graduate student Saima Malik-Moraleda and postbac research assistant Maya Taliaferro, along with other members of both the Fedorenko lab and brain and cognitive sciences professor Ted Gibson’s lab, and with help from Steve Shannon, Operations Manager of the Martinos Imaging Center, worked tirelessly to collect data from all participants. Two MRI scanners ran nearly continuously as speakers listened to passages in their chosen languages and researchers captured images of the brain’s response. To enable the research team to find the language-specific network in each person’s brain, participants also performed other tasks inside the scanner, including a memory task and listening to muffled audio in which the constructed languages were spoken, but unintelligible. They performed language tasks in English, as well.

To understand how the brain processes constructed languages (conlangs), McGovern Investigator Ev Fedorenko (center) gathered with conlang creators/speakers Marc Okrand (Klingon), Paul Frommer (Na’vi), Damian Blasi, Jessie Sams (méníshè), David Peterson (High Valyrian and Dothraki) and Arika Okrent at the McGovern Institute for the “Brains on Conlangs” event in November 2022. Photo: Elise Malvicini

Prior to the study, Fedorenko says, she had suspected constructed languages would activate the brain’s natural language-processing network, but she couldn’t be sure. Another possibility was that languages like Klingon and Esperanto would be handled instead by a problem-solving network known to be used when people work with some other so-called “languages,” like mathematics or computer programming. But once the data was in, the answer was clear. The five constructed languages included in the study all activated the brain’s language network.

That makes sense, Fedorenko says, because like natural languages, constructed languages enable people to communicate by associating words or signs with objects and ideas. Any language is essentially a way of mapping forms to meanings, she says. “You can construe it as a set of memories of how a particular sequence of sounds corresponds to some meaning. You’re learning meanings of words and constructions, and how to put them together to get more complex meanings. And it seems like the brain’s language system is very well suited for that set of computations.”

Whether speaking Turkish or Norwegian, the brain’s language network looks the same

Over several decades, neuroscientists have created a well-defined map of the brain’s “language network,” or the regions of the brain that are specialized for processing language. Found primarily in the left hemisphere, this network includes regions within Broca’s area, as well as in other parts of the frontal and temporal lobes.

However, the vast majority of those mapping studies have been done in English speakers as they listened to or read English texts. MIT neuroscientists have now performed brain imaging studies of speakers of 45 different languages. The results show that the speakers’ language networks appear to be essentially the same as those of native English speakers.

The findings, while not surprising, establish that the location and key properties of the language network appear to be universal. The work also lays the groundwork for future studies of linguistic elements that would be difficult or impossible to study in English speakers because English doesn’t have those features.

“This study is very foundational, extending some findings from English to a broad range of languages,” says Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience at MIT and a member of MIT’s McGovern Institute for Brain Research. “The hope is that now that we see that the basic properties seem to be general across languages, we can ask about potential differences between languages and language families in how they are implemented in the brain, and we can study phenomena that don’t really exist in English.”

Fedorenko is the senior author of the study, which appears today in Nature Neuroscience. Saima Malik-Moraleda, a PhD student in the Speech and Hearing Bioscience and Technology program at Harvard University, and Dima Ayyash, a former research assistant, are the lead authors of the paper.

Mapping language networks

The precise locations and shapes of language areas differ across individuals, so to find the language network, researchers ask each person to perform a language task while scanning their brains with functional magnetic resonance imaging (fMRI). Listening to or reading sentences in one’s native language should activate the language network. To distinguish this network from other brain regions, researchers also ask participants to perform tasks that should not activate it, such as listening to an unfamiliar language or solving math problems.

Several years ago, Fedorenko began designing these “localizer” tasks for speakers of languages other than English. While most studies of the language network have used English speakers as subjects, English does not include many features commonly seen in other languages. For example, in English, word order tends to be fixed, while in other languages there is more flexibility in how words are ordered. Many of those languages instead use the addition of morphemes, or segments of words, to convey additional meaning and relationships between words.

“There has been growing awareness for many years of the need to look at more languages, if you want to make claims about how language works, as opposed to how English works,” Fedorenko says. “We thought it would be useful to develop tools to allow people to rigorously study language processing in the brain in other parts of the world. There’s now access to brain imaging technologies in many countries, but the basic paradigms that you would need to find the language-responsive areas in a person are just not there.”

For the new study, the researchers performed brain imaging of two speakers of each of 45 different languages, representing 12 different language families. Their goal was to see if key properties of the language network, such as location, left lateralization, and selectivity, were the same in those participants as in people whose native language is English.

The researchers decided to use “Alice in Wonderland” as the text that everyone would listen to, because it is one of the most widely translated works of fiction in the world. They selected 24 short passages and three long passages, each of which was recorded by a native speaker of the language. Each participant also heard nonsensical passages, which should not activate the language network, and was asked to do a variety of other cognitive tasks that should not activate it.

The team found that the language networks of participants in this study were located in approximately the same brain regions, and had the same selectivity, as those of native speakers of English.

“Language areas are selective,” Malik-Moraleda says. “They shouldn’t be responding during other tasks such as a spatial working memory task, and that was what we found across the speakers of 45 languages that we tested.”

Additionally, language regions that are typically activated together in English speakers, such as the frontal language areas and temporal language areas, were similarly synchronized in speakers of other languages.

The researchers also showed that among all of the subjects, the small amount of variation they saw between individuals who speak different languages was the same as the amount of variation that would typically be seen between native English speakers.

Similarities and differences

While the findings suggest that the overall architecture of the language network is similar across speakers of different languages, that doesn’t mean that there are no differences at all, Fedorenko says. As one example, researchers could now look for differences in speakers of languages that predominantly use morphemes, rather than word order, to help determine the meaning of a sentence.

“There are all sorts of interesting questions you can ask about morphological processing that don’t really make sense to ask in English, because it has much less morphology,” Fedorenko says.

Another possibility is studying whether speakers of languages that use differences in tone to convey different word meanings would have a language network with stronger links to auditory brain regions that encode pitch.

Right now, Fedorenko’s lab is working on a study in which they are comparing the ‘temporal receptive fields’ of speakers of six typologically different languages, including Turkish, Mandarin, and Finnish. The temporal receptive field is a measure of how many words the language processing system can handle at a time, and for English, it has been shown to be six to eight words long.

“The language system seems to be working on chunks of just a few words long, and we’re trying to see if this constraint is universal across these other languages that we’re testing,” Fedorenko says.

The researchers are also working on creating language localizer tasks and finding study participants representing additional languages beyond the 45 from this study.

The research was funded by the National Institutes of Health and research funds from MIT’s Department of Brain and Cognitive Sciences, the McGovern Institute, and the Simons Center for the Social Brain. Malik-Moraleda was funded by a la Caixa Fellowship and a Friends of McGovern fellowship.

What words can convey

From search engines to voice assistants, computers are getting better at understanding what we mean. That’s thanks to language processing programs that make sense of a staggering number of words, without ever being told explicitly what those words mean. Such programs infer meaning instead through statistics—and a new study reveals that this computational approach can assign many kinds of information to a single word, just like the human brain.

The study, published April 14, 2022, in the journal Nature Human Behaviour, was co-led by Gabriel Grand, a graduate student at MIT’s Computer Science and Artificial Intelligence Laboratory, and Idan Blank, an assistant professor at the University of California, Los Angeles, and supervised by McGovern Investigator Ev Fedorenko, a cognitive neuroscientist who studies how the human brain uses and understands language, and Francisco Pereira at the National Institute of Mental Health. Fedorenko says the rich knowledge her team was able to find within computational language models demonstrates just how much can be learned about the world through language alone.

Early language models

The research team began its analysis of statistics-based language processing models in 2015, when the approach was new. Such models derive meaning by analyzing how often pairs of words co-occur in texts and using those relationships to assess the similarities of words’ meanings. For example, such a program might conclude that “bread” and “apple” are more similar to one another than they are to “notebook,” because “bread” and “apple” are often found in proximity to words like “eat” or “snack,” whereas “notebook” is not.
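The co-occurrence idea can be sketched with a toy example. The four-sentence corpus, the sentence-level counting window, and the helper names below are all assumptions made for illustration; the models discussed in the study are trained on vastly larger text collections and use more refined statistics.

```python
from collections import Counter
from itertools import permutations
from math import sqrt

# Hypothetical toy corpus, invented for this sketch.
corpus = [
    "eat bread for a snack",
    "eat an apple for a snack",
    "write in the notebook",
    "open the notebook and write",
]


def cooccurrence_vectors(sentences):
    """Count, for every word, which other words share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for word, neighbor in permutations(words, 2):
            vectors.setdefault(word, Counter())[neighbor] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u)
    norm_u = sqrt(sum(count * count for count in u.values()))
    norm_v = sqrt(sum(count * count for count in v.values()))
    return dot / (norm_u * norm_v)


vectors = cooccurrence_vectors(corpus)
print(round(cosine(vectors["bread"], vectors["apple"]), 2))     # 0.89
print(round(cosine(vectors["bread"], vectors["notebook"]), 2))  # 0.0
```

In this toy corpus, "bread" and "apple" share neighbors such as "eat" and "snack" and so end up with similar vectors, while "notebook" shares none of them.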

The models were clearly good at measuring words’ overall similarity to one another. But most words carry many kinds of information, and their similarities depend on which qualities are being evaluated. “Humans can come up with all these different mental scales to help organize their understanding of words,” explains Grand, a former undergraduate researcher in the Fedorenko lab. For example, he says, “dolphins and alligators might be similar in size, but one is much more dangerous than the other.”

Grand and Idan Blank, who was then a graduate student at the McGovern Institute, wanted to know whether the models captured that same nuance. And if they did, how was the information organized?

To learn how the information in such a model stacked up to humans’ understanding of words, the team first asked human volunteers to score words along many different scales: Were the concepts those words conveyed big or small, safe or dangerous, wet or dry? Then, having mapped where people position different words along these scales, they looked to see whether language processing models did the same.

Grand explains that distributional semantic models use co-occurrence statistics to organize words into a huge, multidimensional matrix. The more similar words are to one another, the closer they are within that space. The dimensions of the space are vast, and there is no inherent meaning built into its structure. “In these word embeddings, there are hundreds of dimensions, and we have no idea what any dimension means,” he says. “We’re really trying to peer into this black box and say, ‘is there structure in here?’”

Word-vectors in the category ‘animals’ (blue circles) are orthogonally projected (light-blue lines) onto the feature subspace for ‘size’ (red line), defined as the vector difference between the word-vectors for ‘large’ and ‘small’ (red circles). The three dimensions in this figure are arbitrary and were chosen via principal component analysis to enhance visualization (the original GloVe word embedding has 300 dimensions, and projection happens in that space). Image: Fedorenko lab

Specifically, they asked whether the semantic scales they had asked their volunteers to use were represented in the model. So they looked to see where words in the space lined up along vectors defined by the extremes of those scales. Where did dolphins and tigers fall on a line from “big” to “small,” for example? And were they closer together along that line than they were on a line representing danger (“safe” to “dangerous”)?
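A rough sketch of that projection step follows. The three-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions and no hand-assigned meaning per dimension), and the helper `project` is a hypothetical name; only the general idea of projecting word vectors onto the line between two scale extremes comes from the description above.

```python
from math import sqrt

# Hypothetical 3-dimensional embeddings, invented for illustration only.
embeddings = {
    "small": (0.1, 0.5, 0.5),
    "large": (0.9, 0.5, 0.5),
    "safe": (0.5, 0.5, 0.1),
    "dangerous": (0.5, 0.5, 0.9),
    "dolphin": (0.7, 0.3, 0.2),
    "tiger": (0.7, 0.6, 0.9),
    "mouse": (0.1, 0.4, 0.1),
}


def project(word, low, high):
    """Scalar projection of a word vector onto the line from low to high."""
    axis = [h - l for h, l in zip(embeddings[high], embeddings[low])]
    length = sqrt(sum(a * a for a in axis))
    return sum(w * a for w, a in zip(embeddings[word], axis)) / length


for animal in ("dolphin", "tiger", "mouse"):
    size = project(animal, "small", "large")
    danger = project(animal, "safe", "dangerous")
    print(f"{animal}: size={size:.2f}, danger={danger:.2f}")
```

With these made-up vectors, the dolphin and the tiger land close together on the size line but far apart on the danger line, which is the kind of pattern the researchers were looking for in the real embedding space.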

Across more than 50 sets of word categories and semantic scales, they found that the model had organized words very much like the human volunteers. Dolphins and tigers were judged to be similar in terms of size, but far apart on scales measuring danger or wetness. The model had organized the words in a way that represented many kinds of meaning—and it had done so based entirely on the words’ co-occurrences.

That, Fedorenko says, tells us something about the power of language. “The fact that we can recover so much of this rich semantic information from just these simple word co-occurrence statistics suggests that this is one very powerful source of learning about things that you may not even have direct perceptual experience with.”

Artificial intelligence sheds light on how the brain processes language

In the past few years, artificial intelligence models of language have become very good at certain tasks. Most notably, they excel at predicting the next word in a string of text; this technology helps search engines and texting apps predict the next word you are going to type.
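As a deliberately simplified illustration of next-word prediction, the sketch below counts which word most often follows each word in a short text. This bigram counter is a stand-in chosen for brevity, not the kind of model examined in the study, and the sample sentence and function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Record how often each word is followed by each other word."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of word, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


text = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigrams(text)
print(predict_next(model, "the"))  # cat
print(predict_next(model, "on"))   # the
```

A counter like this only looks one word back, whereas the predictive models described in this story draw on much longer stretches of preceding text.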

The most recent generation of predictive language models also appears to learn something about the underlying meaning of language. These models can not only predict the word that comes next, but also perform tasks that seem to require some degree of genuine understanding, such as question answering, document summarization, and story completion.

Such models were designed to optimize performance for the specific function of predicting text, without attempting to mimic anything about how the human brain performs this task or understands language. But a new study from MIT neuroscientists suggests the underlying function of these models resembles the function of language-processing centers in the human brain.

Computer models that perform well on other types of language tasks do not show this similarity to the human brain, offering evidence that the human brain may use next-word prediction to drive language processing.

“The better the model is at predicting the next word, the more closely it fits the human brain,” says Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience, a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines (CBMM), and an author of the new study. “It’s amazing that the models fit so well, and it very indirectly suggests that maybe what the human language system is doing is predicting what’s going to happen next.”

Joshua Tenenbaum, a professor of computational cognitive science at MIT and a member of CBMM and MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL); and Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience and a member of the McGovern Institute, are the senior authors of the study, which appears this week in the Proceedings of the National Academy of Sciences.

Martin Schrimpf, an MIT graduate student who works in CBMM, is the first author of the paper.

Making predictions

The new, high-performing next-word prediction models belong to a class of models called deep neural networks. These networks contain computational “nodes” that form connections of varying strength, and layers that pass information between each other in prescribed ways.

Over the past decade, scientists have used deep neural networks to create models of vision that can recognize objects as well as the primate brain does. Research at MIT has also shown that the underlying function of visual object recognition models matches the organization of the primate visual cortex, even though those computer models were not specifically designed to mimic the brain.

In the new study, the MIT team used a similar approach to compare language-processing centers in the human brain with language-processing models. The researchers analyzed 43 different language models, including several that are optimized for next-word prediction. These include a model called GPT-3 (Generative Pre-trained Transformer 3), which, given a prompt, can generate text similar to what a human would produce. Other models were designed to perform different language tasks, such as filling in a blank in a sentence.

As each model was presented with a string of words, the researchers measured the activity of the nodes that make up the network. They then compared these patterns to activity in the human brain, measured in subjects performing three language tasks: listening to stories, reading sentences one at a time, and reading sentences in which one word is revealed at a time. These human datasets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements taken in people undergoing brain surgery for epilepsy.

They found that the best-performing next-word prediction models had activity patterns that very closely resembled those seen in the human brain. Activity in those same models was also highly correlated with human behavioral measures, such as how fast people were able to read the text.

“We found that the models that predict the neural responses well also tend to best predict human behavior responses, in the form of reading times. And then both of these are explained by the model performance on next-word prediction. This triangle really connects everything together,” Schrimpf says.

“A key takeaway from this work is that language processing is a highly constrained problem: The best solutions to it that AI engineers have created end up being similar, as this paper shows, to the solutions found by the evolutionary process that created the human brain. Since the AI network didn’t seek to mimic the brain directly — but does end up looking brain-like — this suggests that, in a sense, a kind of convergent evolution has occurred between AI and nature,” says Daniel Yamins, an assistant professor of psychology and computer science at Stanford University, who was not involved in the study.

Game changer

One of the key computational features of predictive models such as GPT-3 is an element known as a forward one-way predictive transformer. This kind of transformer is able to make predictions of what is going to come next, based on previous sequences. A significant feature of this transformer is that it can make predictions based on a very long prior context (hundreds of words), not just the last few words.

Scientists have not found any brain circuits or learning mechanisms that correspond to this type of processing, Tenenbaum says. However, the new findings are consistent with hypotheses that have been previously proposed that prediction is one of the key functions in language processing, he says.

“One of the challenges of language processing is the real-time aspect of it,” he says. “Language comes in, and you have to keep up with it and be able to make sense of it in real time.”

The researchers now plan to build variants of these language processing models to see how small changes in their architecture affect their performance and their ability to fit human neural data.

“For me, this result has been a game changer,” Fedorenko says. “It’s totally transforming my research program, because I would not have predicted that in my lifetime we would get to these computationally explicit models that capture enough about the brain so that we can actually leverage them in understanding how the brain works.”

The researchers also plan to try to combine these high-performing language models with some computer models Tenenbaum’s lab has previously developed that can perform other kinds of tasks such as constructing perceptual representations of the physical world.

“If we’re able to understand what these language models do and how they can connect to models which do things that are more like perceiving and thinking, then that can give us more integrative models of how things work in the brain,” Tenenbaum says. “This could take us toward better artificial intelligence models, as well as giving us better models of how more of the brain works and how general intelligence emerges, than we’ve had in the past.”

The research was funded by a Takeda Fellowship; the MIT Shoemaker Fellowship; the Semiconductor Research Corporation; the MIT Media Lab Consortia; the MIT Singleton Fellowship; the MIT Presidential Graduate Fellowship; the Friends of the McGovern Institute Fellowship; the MIT Center for Brains, Minds, and Machines, through the National Science Foundation; the National Institutes of Health; MIT’s Department of Brain and Cognitive Sciences; and the McGovern Institute.

Other authors of the paper are Idan Blank PhD ’16 and graduate students Greta Tuckute, Carina Kauf, and Eghbal Hosseini.

Individual neurons responsible for complex social reasoning in humans identified

This story is adapted from a January 27, 2021 press release from Massachusetts General Hospital.

The ability to understand others’ hidden thoughts and beliefs is an essential component of human social behavior. Now, neuroscientists have for the first time identified specific neurons critical for social reasoning, a cognitive process that requires individuals to acknowledge and predict others’ hidden beliefs and thoughts.

The findings, published in Nature, open new avenues of study into disorders that affect social behavior, according to the authors.

In the study, a team of Harvard Medical School investigators based at Massachusetts General Hospital and colleagues from MIT took a rare look at how individual neurons represent the beliefs of others. They did so by recording neuron activity in patients undergoing neurosurgery to alleviate symptoms of motor disorders such as Parkinson’s disease.

Theory of mind

The research team, which included McGovern scientists Ev Fedorenko and Rebecca Saxe, focused on a complex social cognitive process called “theory of mind.” To illustrate this, let’s say a friend appears to be sad on her birthday. One may infer she is sad because she didn’t get a present or she is upset at growing older.

“When we interact, we must be able to form predictions about another person’s unstated intentions and thoughts,” said senior author Ziv Williams, HMS associate professor of neurosurgery at Mass General. “This ability requires us to paint a mental picture of someone’s beliefs, which involves acknowledging that those beliefs may be different from our own and assessing whether they are true or false.”

This social reasoning process develops during early childhood and is fundamental to successful social behavior. Individuals with autism, schizophrenia, bipolar affective disorder, and traumatic brain injuries are believed to have a deficit of theory-of-mind ability.

For the study, 15 patients agreed to perform brief behavioral tasks before undergoing neurosurgery for placement of deep-brain stimulation for motor disorders. Microelectrodes inserted into the dorsomedial prefrontal cortex recorded the behavior of individual neurons as patients listened to short narratives and answered questions about them.

For example, participants were presented with the following scenario to evaluate how they considered another’s belief of reality: “You and Tom see a jar on the table. After Tom leaves, you move the jar to a cabinet. Where does Tom believe the jar to be?”

Social computation

The participants had to make inferences about another’s beliefs after hearing each story. The experiment did not change the planned surgical approach or alter clinical care.

“Our study provides evidence to support theory of mind by individual neurons,” said study first author Mohsen Jamali, HMS instructor in neurosurgery at Mass General. “Until now, it wasn’t clear whether or how neurons were able to perform these social cognitive computations.”

The investigators found that some neurons are specialized and respond only when assessing another’s belief as false, for example. Other neurons encode information to distinguish one person’s beliefs from another’s. Still other neurons create a representation of a specific item, such as a cup or food item, mentioned in the story. Some neurons may multitask and aren’t dedicated solely to social reasoning.

“Each neuron is encoding different bits of information,” Jamali said. “By combining the computations of all the neurons, you get a very detailed representation of the contents of another’s beliefs and an accurate prediction of whether they are true or false.”

Now that scientists understand the basic cellular mechanism that underlies human theory of mind, they have an operational framework to begin investigating disorders in which social behavior is affected, according to Williams.

“Understanding social reasoning is also important to many different fields, such as child development, economics, and sociology, and could help in the development of more effective treatments for conditions such as autism spectrum disorder,” Williams said.

Previous research on the cognitive processes that underlie theory of mind has involved functional MRI studies, where scientists watch which parts of the brain are active as volunteers perform cognitive tasks.

But the imaging studies capture the activity of many thousands of neurons all at once. In contrast, Williams and colleagues recorded the computations of individual neurons. This provided a detailed picture of how neurons encode social information.

“Individual neurons, even within a small area of the brain, are doing very different things, not all of which are involved in social reasoning,” Williams said. “Without delving into the computations of single cells, it’s very hard to build an understanding of the complex cognitive processes underlying human social behavior and how they go awry in mental disorders.”


To the brain, reading computer code is not the same as reading language

In some ways, learning to program a computer is similar to learning a new language. It requires learning new symbols and terms, which must be organized correctly to instruct the computer what to do. The computer code must also be clear enough that other programmers can read and understand it.

In spite of those similarities, MIT neuroscientists have found that reading computer code does not activate the regions of the brain that are involved in language processing. Instead, it activates a distributed network called the multiple demand network, which is also recruited for complex cognitive tasks such as solving math problems or crossword puzzles.

However, although reading computer code activates the multiple demand network, it appears to rely more on different parts of the network than math or logic problems do, suggesting that coding does not precisely replicate the cognitive demands of mathematics either.

“Understanding computer code seems to be its own thing. It’s not the same as language, and it’s not the same as math and logic,” says Anna Ivanova, an MIT graduate student and the lead author of the study.

Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience and a member of the McGovern Institute for Brain Research, is the senior author of the paper, which appears today in eLife. Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and Tufts University were also involved in the study.

Language and cognition

McGovern Investigator Ev Fedorenko in the Martinos Imaging Center at MIT. Photo: Caitlin Cunningham

A major focus of Fedorenko’s research is the relationship between language and other cognitive functions. In particular, she has been studying the question of whether other functions rely on the brain’s language network, which includes Broca’s area and other regions in the left hemisphere of the brain. In previous work, her lab has shown that music and math do not appear to activate this language network.

“Here, we were interested in exploring the relationship between language and computer programming, partially because computer programming is such a new invention that we know that there couldn’t be any hardwired mechanisms that make us good programmers,” Ivanova says.

There are two schools of thought regarding how the brain learns to code, she says. One holds that in order to be good at programming, you must be good at math. The other suggests that because of the parallels between coding and language, language skills might be more relevant. To shed light on this issue, the researchers set out to study whether brain activity patterns while reading computer code would overlap with language-related brain activity.

The two programming languages that the researchers focused on in this study are known for their readability: Python and ScratchJr, a visual programming language designed for children age 5 and older. The subjects in the study were all young adults proficient in the language they were being tested on. While the programmers lay in a functional magnetic resonance imaging (fMRI) scanner, the researchers showed them snippets of code and asked them to predict what action the code would produce.

The researchers saw little to no response to code in the language regions of the brain. Instead, they found that the coding task mainly activated the so-called multiple demand network. This network, whose activity is spread throughout the frontal and parietal lobes of the brain, is typically recruited for tasks that require holding many pieces of information in mind at once, and is responsible for our ability to perform a wide variety of mental tasks.

“It does pretty much anything that’s cognitively challenging, that makes you think hard,” says Ivanova, who was also named one of the McGovern Institute’s rising stars in neuroscience.

Previous studies have shown that math and logic problems seem to rely mainly on the multiple demand regions in the left hemisphere, while tasks that involve spatial navigation activate the right hemisphere more than the left. The MIT team found that reading computer code appears to activate both the left and right sides of the multiple demand network, and ScratchJr activated the right side slightly more than the left. This finding goes against the hypothesis that math and coding rely on the same brain mechanisms.

Effects of experience

The researchers say that while they didn’t identify any regions that appear to be exclusively devoted to programming, such specialized brain activity might develop in people who have much more coding experience.

“It’s possible that if you take people who are professional programmers, who have spent 30 or 40 years coding in a particular language, you may start seeing some specialization, or some crystallization of parts of the multiple demand system,” Fedorenko says. “In people who are familiar with coding and can efficiently do these tasks, but have had relatively limited experience, it just doesn’t seem like you see any specialization yet.”

In a companion paper appearing in the same issue of eLife, a team of researchers from Johns Hopkins University also reported that solving code problems activates the multiple demand network rather than the language regions.

The findings suggest there isn’t a definitive answer to whether coding should be taught as a math-based skill or a language-based skill. In part, that’s because learning to program may draw on both language and multiple demand systems, even if — once learned — programming doesn’t rely on the language regions, the researchers say.

“There have been claims from both camps — it has to be together with math, it has to be together with language,” Ivanova says. “But it looks like computer science educators will have to develop their own approaches for teaching code most effectively.”

The research was funded by the National Science Foundation, the Department of Brain and Cognitive Sciences at MIT, and the McGovern Institute for Brain Research.

School of Science appoints 12 faculty members to named professorships

The School of Science has awarded chaired appointments to 12 faculty members. These faculty, who are members of the departments of Biology; Brain and Cognitive Sciences; Chemistry; Earth, Atmospheric and Planetary Sciences; and Physics, receive additional support to pursue their research and develop their careers.

Kristin Bergmann, an assistant professor in the Department of Earth, Atmospheric and Planetary Sciences, has been named a D. Reid Weedon, Jr. ’41 Career Development Professor. This is a three-year professorship. Bergmann’s research integrates across sedimentology and stratigraphy, geochemistry, and geobiology to reveal aspects of Earth’s ancient environments. She aims to better constrain Earth’s climate record and carbon cycle during the evolution of early eukaryotes, including animals. Most of her efforts involve reconstructing the details of carbonate rocks, which store much of Earth’s carbon, and thus, are an important component of Earth’s climate system over long timescales.

Joseph Checkelsky is an associate professor in the Department of Physics and has been named a Mitsui Career Development Professor in Contemporary Technology, an appointment he will hold until 2023. His research in quantum materials relies on experimental methods at the intersection of physics, chemistry, and nanoscience. This work is aimed toward synthesizing new crystalline systems that manifest their quantum nature on a macroscopic scale. He aims to realize and study these crystalline systems, which can then serve as platforms for next-generation quantum sensors, quantum communication, and quantum computers.

Mircea Dincă, appointed a W. M. Keck Professor of Energy, is a professor in the Department of Chemistry. This appointment has a five-year term. The topic of Dincă’s research falls largely under the umbrella of energy storage and conversion. His interest in applied energy usage involves creating new organic and inorganic materials that can improve the efficiency of energy collection, storage, and generation while decreasing environmental impacts. Recently, he has developed materials for efficient air-conditioning units and been collaborating with Automobili Lamborghini on electric vehicle design.

Matthew Evans has been appointed to a five-year MathWorks Physics Professorship. Evans, a professor in the Department of Physics, focuses on the instruments used to detect gravitational waves. A member of MIT’s Laser Interferometer Gravitational-Wave Observatory (LIGO) research group, he engineers ways to fine-tune the detection capabilities of the massive ground-based facilities that are being used to identify collisions between black holes and stars in deep space. By removing thermal and quantum limitations, he can increase the sensitivity of the device’s measurements and, thus, its scope of exploration. Evans is also a member of the MIT Kavli Institute for Astrophysics and Space Research.

Evelina Fedorenko is an associate professor in the Department of Brain and Cognitive Sciences and has been named a Frederick A. (1971) and Carole J. Middleton Career Development Professor of Neuroscience. Studying how the brain processes language, Fedorenko uses behavioral studies, brain imaging, neurosurgical recording and stimulation, and computational modelling to better grasp language comprehension and production. In her efforts to elucidate how and what parts of the brain support language processing, she evaluates both typical and atypical brains. Fedorenko is also a member of the McGovern Institute for Brain Research.

Ankur Jain is an assistant professor in the Department of Biology and now a Thomas D. and Virginia W. Cabot Career Development Professor. He will hold this career development appointment for a term of three years. Jain studies how cells organize their contents. Within a cell, there are numerous compartments that form due to weak interactions between biomolecules and exist without an enclosing membrane. By analyzing the biochemistry and biophysics of these compartments, Jain deduces the principles of cellular organization and its dysfunction in human disease. Jain is also a member of the Whitehead Institute for Biomedical Research.

Pulin Li, an assistant professor in the Department of Biology and the Eugene Bell Career Development Professor of Tissue Engineering for the next three years, explores the role of genetic circuitry in building and maintaining a tissue. In particular, she investigates how communication circuitry between individual cells can extrapolate into multicellular behavior using both natural and synthetically generated tissues, for which she combines the fields of synthetic and systems biology, biophysics, and bioengineering. A stronger understanding of genetic circuitry could allow for progress in medicine involving embryonic development and tissue engineering. Li is a member of the Whitehead Institute for Biomedical Research.

Elizabeth Nolan, appointed an Ivan R. Cottrell Professor of Immunology, investigates innate immunity and infectious disease. The Department of Chemistry professor, who will hold this chaired professorship for five years, combines experimental chemistry and microbiology to learn about human immune responses to, and interactions with, microbial pathogens. This research includes elucidating the fight between host and pathogen for essential metal nutrients and the functions of host-defense peptides and proteins during infection. With this knowledge, Nolan contributes to fundamental understanding of the host’s ability to combat microbial infection, which may provide new strategies to treat infectious disease.

Leigh “Wiki” Royden is now a Cecil and Ida Green Professor of Geology and Geophysics. The five-year appointment supports her research on the large-scale dynamics and tectonics of the Earth as a professor in the Department of Earth, Atmospheric and Planetary Sciences. Fundamental to geoscience, the tectonics of regional and global systems are closely linked, particularly through the subduction of the plates into the mantle. Royden’s research adds to our understanding of the structure and dynamics of the crust and the upper portion of the mantle through observation, theory and modeling. This progress has profound implications for global natural events, like mountain building and continental break-up.

Phiala Shanahan has been appointed a Class of 1957 Career Development Professor for three years. Shanahan is an assistant professor in the Department of Physics, where she specializes in theoretical and nuclear physics. Shanahan’s research uses supercomputers to provide insight into the structure of protons and nuclei in terms of their quark and gluon constituents. Her work also informs searches for new physics beyond the current Standard Model, such as dark matter. She is a member of the MIT Center for Theoretical Physics.

Xiao Wang, an assistant professor, has also been named a new Thomas D. and Virginia W. Cabot Professor. In the Department of Chemistry, Wang designs and produces novel methods and tools for analyzing the brain. Integrating chemistry, biophysics, and genomics, her work provides higher-resolution imaging and sampling to explain how the brain functions across molecular to system-wide scales. Wang is also a core member of the Broad Institute of MIT and Harvard.

Bin Zhang has been appointed a Pfizer Inc-Gerald Laubach Career Development Professor for a three-year term. Zhang, an assistant professor in the Department of Chemistry, hopes to connect the framework of the human genome sequence with its various functions on various time and spatial scales. By developing theoretical and computational approaches to categorize information about dynamics, organization, and complexity of the genome, he aims to build a quantitative, predictive modelling tool. This tool could even produce 3D representations of details happening at a microscopic level within the body.

Uncovering the functional architecture of a historic brain area

In 1840 a patient named Leborgne was admitted to a hospital near Paris: he was only able to repeat the word “Tan.” This loss of speech drew the attention of Paul Broca who, after Leborgne’s death, identified lesions in his frontal lobe in the left hemisphere. These results echoed earlier findings from French neurologist Marc Dax. The affected region is now known as “Broca’s area,” and its proposed roles have been extended to mental functions far beyond speech articulation, so much so that the underlying functional organization of Broca’s area has become a source of discussion and some confusion.

McGovern Investigator Ev Fedorenko is now calling, in a paper in Trends in Cognitive Sciences, for recognition that Broca’s area consists of functionally distinct, specialized regions, with one sub-region very much dedicated to language processing.

“Broca’s area is one of the first regions you learn about in introductory psychology and neuroscience classes, and arguably laid the foundation for human cognitive neuroscience,” explains Ev Fedorenko, who is also an assistant professor in MIT’s Department of Brain and Cognitive Sciences. “This patch of cortex and its connections with other brain areas and networks provides a microcosm for probing some core questions about the human brain.”

Broca’s area, shown in red. Image: Wikimedia

Language is a uniquely human capability, and thus the discovery of Broca’s area immediately captured the attention of researchers.

“Because language is universal across cultures, but unique to the human species, studying Broca’s area and constraining theories of language accordingly promises to provide a window into one of the central abilities that make humans so special,” explains co-author Idan Blank, a former postdoc at the McGovern Institute who is now an assistant professor of psychology at UCLA.

Function over form

Broca’s area is found in the posterior portion of the left inferior frontal gyrus (LIFG). Arguments and theories abound as to its function. Some consider the region as dedicated to language or syntactic processing, others argue that it processes multiple types of inputs, and still others argue it is working at a high level, implementing working memory and cognitive control. Is Broca’s area a highly specialized circuit, dedicated to the human-specific capacity for language and largely independent from the rest of high-level cognition, or is it a CPU-like region, overseeing diverse aspects of the mind and orchestrating their operations?

“Patient investigations and neuroimaging studies have now associated Broca’s region with many processes,” explains Blank. “On the one hand, its language-related functions have expanded far beyond articulation, on the other, non-linguistic functions within Broca’s area—fluid intelligence and problem solving, working memory, goal-directed behavior, inhibition, etc.—are fundamental to ‘all of cognition.’”

While brain anatomy is a common path to defining subregions in Broca’s area, Fedorenko and Blank argue that this approach can instead muddy the waters. In fact, the anatomy of the brain, in terms of the cortical folds and visible landmarks that originally stood out to anatomists, varies from individual to individual in how it aligns with the underlying functions of brain regions. While these variations might seem small, they potentially have a huge impact on conclusions about functional regions based on traditional analysis methods. This means that the same bit of anatomy (like, say, the posterior portion of a gyrus) could be doing different things in different brains.

“In both investigations of patients with brain damage and much of brain imaging work, a lot of confusion has stemmed from the use of macroanatomical areas (like the inferior frontal gyrus (IFG)) as ‘units of analysis’,” explains Fedorenko. “When some researchers found IFG activation for a syntactic manipulation, and others for a working memory manipulation, the field jumped to the conclusion that syntactic processing relies on working memory. But these effects might actually be arising in totally distinct parts of the IFG.”

The only way to circumvent this problem is to turn to functional data and aggregate information from functionally defined areas across individuals. Using this approach, across four lines of evidence from the last decade, Fedorenko and Blank came to a clear conclusion: Broca’s area is not a monolithic region with a single function, but contains distinct areas, one dedicated to language processing, and another that supports domain-general functions like working memory.

“We just have to stop referring to macroanatomical brain regions (like gyri and sulci, or their parts) when talking about the functional architecture of the brain,” explains Fedorenko. “I am delighted to see that more and more labs across the world are recognizing the inter-individual variability that characterizes the human brain. This shift is putting us on the right path to making fundamental discoveries about how our brain works.”

Indeed, accounting for distinct functional regions, within Broca’s area and elsewhere, seems essential going forward if we are to truly understand the complexity of the human brain.

Word Play

Ev Fedorenko uses the widely translated book “Alice in Wonderland” to test brain responses to different languages.

Language is a uniquely human ability that allows us to build vibrant pictures of non-existent places (think Wonderland or Westeros). How does the brain build mental worlds from words? Can machines do the same? Can we recover this ability after brain injury? These questions require an understanding of how the brain processes language, a fascination for Ev Fedorenko.

“I’ve always been interested in language. Early on, I wanted to found a company that teaches kids languages that share structure — Spanish, French, Italian — in one go,” says Fedorenko, an associate investigator at the McGovern Institute and an assistant professor in brain and cognitive sciences at MIT.

Her road to understanding how thoughts, ideas, emotions, and meaning can be delivered through sound and words became clear when she realized that language was accessible through cognitive neuroscience.

Early on, Fedorenko made a seminal finding that undermined dominant theories of the time. Scientists believed a single network was extracting meaning from all we experience: language, music, math, etc. Evolving separate networks for these functions seemed unlikely, as these capabilities arose recently in human evolution.

Language Regions
Ev Fedorenko has found that language regions of the brain (shown in teal) are sensitive to both word meaning and sentence structure. Image: Ev Fedorenko

But when Fedorenko examined brain activity in subjects while they read or heard sentences in the MRI, she found a network of brain regions that is indeed specialized for language.

“A lot of brain areas, like motor and social systems, were already in place when language emerged during human evolution,” explains Fedorenko. “In some sense, the brain seemed fully occupied. But rather than co-opt these existing systems, the evolution of language in humans involved language carving out specific brain regions.”

Different aspects of language recruit brain regions across the left hemisphere, including Broca’s area and portions of the temporal lobe. Many believe that certain regions are involved in processing word meaning while others unpack the rules of language. Fedorenko and colleagues have, however, shown that the entire language network is selectively engaged in linguistic tasks, processing both the rules (syntax) and meaning (semantics) of language in the same brain areas.

Semantic Argument

Fedorenko’s lab even challenges the prevailing view that syntax is core to language processing. By gradually degrading sentence structure through local word swaps (see figure), they found that language regions still respond strongly to these degraded sentences, deciphering meaning from them, even as syntax, or combinatorial rules, disappears.

The Fedorenko lab has shown that the brain finds meaning in a sentence, even when “local” words are swapped (2, 3). But when clusters of neighboring words are scrambled (4), the brain struggles to find its meaning.
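The idea of degrading a sentence through local word swaps can be sketched in a few lines. This is an illustration under stated assumptions and not the lab's actual stimulus procedure, which this article does not specify. The function name `local_swaps`, the random choice of swap positions, and the example sentence are all hypothetical.

```python
import random


def local_swaps(sentence, n_swaps, seed=0):
    """Swap n_swaps randomly chosen pairs of adjacent words."""
    words = sentence.split()
    if len(words) < 2:
        return " ".join(words)  # nothing to swap
    rng = random.Random(seed)  # fixed seed so the output is repeatable
    for _ in range(n_swaps):
        i = rng.randrange(len(words) - 1)  # left member of the pair
        words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


sentence = "the quick brown fox jumps over the lazy dog"
for n in (0, 1, 3, 7):
    print(n, "->", local_swaps(sentence, n))
```

Each swap moves a word by only one position, so with a small number of swaps most words stay close to their original neighbors even though the word order is no longer grammatical.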

“A lot of focus in language research has been on structure-building, or building a type of hierarchical graph of the words in a sentence. But actually the language system seems optimized and driven to find rich, representational meaning in a string of words processed together,” explains Fedorenko.

Computing Language

When asked about emerging areas of research, Fedorenko points to the data structures and algorithms underlying linguistic processing. Modern computational models can perform sophisticated tasks, including translation, ever more effectively. Consider Google Translate. A decade ago, the system translated one word at a time with laughable results. Now, by instead treating words as providing context for each other, the latest machine translation systems perform far more accurately. Understanding how they resolve meaning could be very revealing.

“Maybe we can link these models to human neural data to both get insights about linguistic computations in the human brain, and maybe help improve artificial systems by making them more human-like,” says Fedorenko.

She is also trying to understand how the system breaks down, how it over-performs, and even more philosophical questions. Can a person who loses language abilities (with aphasia, for example) recover? This is a very relevant question, given that the language-processing network occupies such specific brain regions. How are some unique people able to understand 10, 15 or even more languages? Do we need words to have thoughts?

Using a battery of approaches, Fedorenko seems poised to answer some of these questions.