Tidying up deep neural networks

Visual art has found many ways of representing objects, from the ornate Baroque period to modernist simplicity. Artificial visual systems are somewhat analogous: from relatively simple beginnings inspired by key regions in the visual cortex, recent advances in performance have seen increasing complexity.

“Our overall goal has been to build an accurate, engineering-level model of the visual system, to ‘reverse engineer’ visual intelligence,” explains James DiCarlo, the head of MIT’s Department of Brain and Cognitive Sciences, an investigator in the McGovern Institute for Brain Research and the Center for Brains, Minds, and Machines (CBMM). “But very high-performing ANNs have started to drift away from brain architecture, with complex branching architectures that have no clear parallel in the brain.”

A new model from the DiCarlo lab has re-imposed a brain-like architecture on an object recognition network. The result is a shallow-network architecture with surprisingly high performance, indicating that we can simplify deeper, more baroque networks yet retain high performance in artificial learning systems.

“We’ve made two major advances,” explains graduate student Martin Schrimpf, who led the work with Jonas Kubilius at CBMM. “We’ve found a way of checking how well models match the brain, called Brain-Score, and developed a model, CORnet, that moves artificial object recognition, as well as machine learning architectures, forward.”

DiCarlo lab graduate student Martin Schrimpf in the lab. Photo: Kris Brewer

Back to the brain

Deep convolutional artificial neural networks were initially inspired by brain anatomy, and are the leading models in artificial object recognition. Training these feedforward systems to recognize objects in ImageNet, a large database of images, has allowed the performance of ANNs to vastly improve, but at the same time networks have literally branched out, becoming increasingly complex with hundreds of layers. In contrast, the visual ventral stream, a series of cortical brain regions that unpacks object identity, contains just four key regions. In addition, ANNs are entirely feedforward, while the primate cortical visual system has densely interconnected wiring; in other words, recurrent connectivity. While primate-like object recognition capabilities can be captured through feedforward-only networks, recurrent wiring in the brain has long been suspected to be important, and was recently shown to be so in two DiCarlo lab papers led by Kar and Tang, respectively.

DiCarlo and colleagues have now developed CORnet-S, inspired by very complex, state-of-the-art neural networks. CORnet-S has four computational areas, analogous to cortical visual areas (V1, V2, V4, and IT). In addition, CORnet-S contains repeated, or recurrent, connections.

“We really pre-defined layers in the ANN, defining V1, V2, and so on, and introduced feedback and repeated connections,” explains Schrimpf. “As a result, we ended up with fewer layers, and less ‘dead space’ that cannot be mapped to the brain. In short, a simpler network.”
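For readers who want a concrete picture, the core idea can be sketched in a few lines of PyTorch: a handful of convolutional “areas” mapped to V1, V2, V4, and IT, each re-applied with shared weights to stand in for recurrence. This is a minimal illustration, not the lab’s published CORnet-S code; the layer widths, repeat counts, and names are illustrative.

```python
import torch
import torch.nn as nn

class RecurrentArea(nn.Module):
    """One cortical 'area': a conv block re-applied with shared weights."""
    def __init__(self, in_ch, out_ch, times):
        super().__init__()
        self.input_conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.recurrent_conv = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.norm = nn.BatchNorm2d(out_ch)
        self.times = times  # number of recurrent passes through the block

    def forward(self, x):
        x = torch.relu(self.norm(self.input_conv(x)))
        for _ in range(self.times):  # weight sharing stands in for recurrence
            x = torch.relu(self.norm(self.recurrent_conv(x)))
        return nn.functional.max_pool2d(x, 2)

class TinyCORnet(nn.Module):
    """Four areas mapped to V1, V2, V4, and IT, plus a linear decoder."""
    def __init__(self, n_classes=1000):
        super().__init__()
        self.V1 = RecurrentArea(3, 64, times=1)
        self.V2 = RecurrentArea(64, 128, times=2)
        self.V4 = RecurrentArea(128, 256, times=4)
        self.IT = RecurrentArea(256, 512, times=2)
        self.decoder = nn.Linear(512, n_classes)

    def forward(self, x):
        x = self.IT(self.V4(self.V2(self.V1(x))))
        return self.decoder(x.mean(dim=(2, 3)))  # global average pool

logits = TinyCORnet()(torch.rand(1, 3, 64, 64))  # e.g. a 64x64 RGB input
```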

Keeping score

To optimize the system, the researchers incorporated quantitative assessment through a new system, Brain-Score.

“Until now, we’ve needed to qualitatively eyeball model performance relative to the brain,” says Schrimpf. “Brain-Score allows us to actually quantitatively evaluate and benchmark models.”

They found that CORnet-S ranks highly on Brain-Score, and is the best performer of all shallow ANNs. Indeed, the system, shallow as it is, rivals the complex, ultra-deep ANNs that currently perform at the highest level.

CORnet was also benchmarked against human performance. To test, for example, whether the system can predict human behavior, 1,472 humans were shown images for 100 milliseconds and then asked to identify objects in them. CORnet-S was able to predict the general accuracy of humans in making calls about what they had briefly glimpsed (bear vs. dog, etc.). Indeed, CORnet-S is able to predict the behavior, as well as the neural dynamics, of the visual ventral stream, indicating that it is modeling primate-like behavior.

“We thought we’d lose performance by going to a wide, shallow network, but with recurrence, we hardly lost any,” says Schrimpf. “The message for machine learning more broadly is that you can get away without really deep networks.”

Such models of brain processing have benefits for both neuroscience and artificial systems, helping us to understand the elements of image processing by the brain. Neuroscience in turn informs us that features such as recurrence can be used to improve performance in shallow networks, an important message for artificial intelligence systems more broadly.

“There are clear advantages to the high performing, complex deep networks,” explains DiCarlo, “but it’s possible to rein the network in, using the elegance of the primate brain as a model, and we think this will ultimately lead to other kinds of advantages.”

Differences between deep neural networks and human perception

When your mother calls your name, you know it’s her voice — no matter the volume, even over a poor cell phone connection. And when you see her face, you know it’s hers — if she is far away, if the lighting is poor, or if you are on a bad FaceTime call. This robustness to variation is a hallmark of human perception. On the other hand, we are susceptible to illusions: We might fail to distinguish between sounds or images that are, in fact, different. Scientists have explained many of these illusions, but we lack a full understanding of the invariances in our auditory and visual systems.

Deep neural networks have also performed speech recognition and image classification tasks with impressive robustness to variations in the auditory or visual stimuli. But are the invariances learned by these models similar to the invariances learned by human perceptual systems? A group of MIT researchers has discovered that they are different. They presented their findings yesterday at the 2019 Conference on Neural Information Processing Systems.

The researchers made a novel generalization of a classical concept: “metamers” — physically distinct stimuli that generate the same perceptual effect. The most famous examples of metamer stimuli arise because most people have three different types of cones in their retinae, which are responsible for color vision. The perceived color of any single wavelength of light can be matched exactly by a particular combination of three lights of different colors — for example, red, green, and blue lights. Nineteenth-century scientists inferred from this observation that humans have three different types of bright-light detectors in our eyes. This is the basis for electronic color displays on all of the screens we stare at every day. Another example in the visual system is that when we fix our gaze on an object, we may perceive surrounding visual scenes that differ at the periphery as identical. In the auditory domain, something analogous can be observed. For example, the “textural” sound of two swarms of insects might be indistinguishable, despite differing in the acoustic details that compose them, because they have similar aggregate statistical properties. In each case, the metamers provide insight into the mechanisms of perception, and constrain models of the human visual or auditory systems.
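The classic cone story can be made concrete with a little linear algebra: because there are only three cone types, the cone responses to any light can be matched by an appropriate mix of three primaries. The sensitivity numbers below are invented for illustration.

```python
# Toy illustration of color metamers: with three cone types, a light's cone
# responses can be matched by mixing three primaries. These sensitivity
# numbers are made up for illustration, not measured values.
import numpy as np

# Rows: L, M, S cone responses to unit-intensity red, green, blue primaries.
primaries = np.array([[0.90, 0.40, 0.05],
                      [0.30, 0.80, 0.10],
                      [0.02, 0.10, 0.90]])

target_cone_response = np.array([0.5, 0.45, 0.2])  # some single-wavelength light
mix = np.linalg.solve(primaries, target_cone_response)
print("primary intensities:", mix)
# The mixed light and the original are physically different spectra, but they
# produce identical cone responses: a metamer pair.
```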

In the current work, the researchers randomly chose natural images and sound clips of spoken words from standard databases, and then synthesized sounds and images so that deep neural networks would sort them into the same classes as their natural counterparts. That is, they generated physically distinct stimuli that are classified identically by the models, rather than by human observers. This is a new way to think about metamers, generalizing the concept by swapping computer models in for human perceivers. They therefore called these synthesized stimuli “model metamers” of the paired natural stimuli. The researchers then tested whether humans could identify the words and images.
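Before turning to those human tests, here is a minimal sketch of what the synthesis step can look like for a vision model: start from noise and adjust the pixels until a chosen layer responds the way it does to a natural image. The model, layer, and optimization settings are illustrative assumptions, not the paper’s actual pipeline.

```python
# Minimal sketch of model-metamer synthesis: optimize pixels so that a chosen
# layer's activations match those evoked by a natural image. Model choice,
# layer, and step counts are illustrative.
import torch
import torchvision.models as models

net = models.resnet18(pretrained=True).eval()
for p in net.parameters():
    p.requires_grad_(False)

acts = {}
net.layer3.register_forward_hook(lambda m, i, o: acts.update(feat=o))

natural = torch.rand(1, 3, 224, 224)   # stand-in for a natural image
net(natural)
target = acts["feat"].detach()

metamer = torch.rand(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([metamer], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    net(metamer)
    loss = ((acts["feat"] - target) ** 2).mean()  # match layer activations
    loss.backward()
    opt.step()
    metamer.data.clamp_(0, 1)   # keep pixels in a valid range
# "metamer" now drives layer3 (approximately) the way "natural" does, even
# though the two images can look nothing alike to a person.
```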

“Participants heard a short segment of speech and had to identify from a list of words which word was in the middle of the clip. For the natural audio this task is easy, but for many of the model metamers humans had a hard time recognizing the sound,” explains first-author Jenelle Feather, a graduate student in the MIT Department of Brain and Cognitive Sciences (BCS) and a member of the Center for Brains, Minds, and Machines (CBMM). That is, humans would not put the synthetic stimuli in the same class as the spoken word “bird” or the image of a bird. In fact, model metamers generated to match the responses of the deepest layers of the model were generally unrecognizable as words or images by human subjects.

Josh McDermott, associate professor in BCS and investigator in CBMM, makes the following case: “The basic logic is that if we have a good model of human perception, say of speech recognition, then if we pick two sounds that the model says are the same and present these two sounds to a human listener, that human should also say that the two sounds are the same. If the human listener instead perceives the stimuli to be different, this is a clear indication that the representations in our model do not match those of human perception.”

Joining Feather and McDermott on the paper are Alex Durango, a post-baccalaureate student, and Ray Gonzalez, a research assistant, both in BCS.

There is another type of failure of deep networks that has received a lot of attention in the media: adversarial examples (see, for example, “Why did my classifier just mistake a turtle for a rifle?”). These are stimuli that appear similar to humans but are misclassified by a model network (by design — they are constructed to be misclassified). They are complementary to the stimuli generated by Feather’s group, which sound or appear different to humans but are designed to be co-classified by the model network. The vulnerabilities of model networks exposed by adversarial attacks are well-known — face-recognition software might mistake identities; automated vehicles might not recognize pedestrians.

The importance of this work lies in improving models of perception beyond deep networks. Although the standard adversarial examples indicate differences between deep networks and human perceptual systems, the new stimuli generated by the McDermott group arguably represent a more fundamental model failure — they show that generic examples of stimuli classified as the same by a deep network produce wildly different percepts for humans.

The team also figured out ways to modify the model networks to yield metamers that were more plausible sounds and images to humans. As McDermott says, “This gives us hope that we may be able to eventually develop models that pass the metamer test and better capture human invariances.”

“Model metamers demonstrate a significant failure of present-day neural networks to match the invariances in the human visual and auditory systems,” says Feather. “We hope that this work will provide a useful behavioral measuring stick to improve model representations and create better models of human sensory systems.”

Brain science in the Bolivian rainforest

Graduate student Malinda McPherson. Photo: Caitlin Cunningham

Malinda McPherson is a graduate student in Josh McDermott’s lab, studying how people hear pitch (how high or low a sound is) in both speech and music.

To test the extent to which human audition varies across cultures, McPherson travels with the McDermott lab to Bolivia to study the Tsimane’ — a native Amazonian society with minimal exposure to Western culture.

Their most recent study, published in the journal Current Biology, found a striking variation in perception of musical pitch across cultures.

In this Q&A, we ask McPherson what motivates her research and what challenges she has experienced working in the Bolivian rainforest.

What are you working on now?

Right now, I’m particularly excited about a project that involves working with children; we are trying to better understand how the ability to hear pitch develops with age and experience. Difficulty hearing pitch is one of the first issues that most people with poor or corrected hearing find discouraging, so in addition to simply being an interesting basic component of audition, understanding how pitch perception develops may be useful in engineering assistive hearing devices.

How has your personal background inspired your research?

I’ve been an avid violist for over twenty years and still perform with the Chamber Music Society at MIT. When I was an undergraduate and deciding between a career as a professional musician and a career in science, I found a way to merge the two by working as a research assistant in a lab studying musical creativity. I worked in that lab for three years and was completely hooked. My musical training has definitely helped me design a few experiments!

What was your most challenging experience in Bolivia? Most rewarding?

The most challenging aspect of our fieldwork in Bolivia is sustaining our intensity over a period of four to five weeks. Every moment is precious, and the pace of work is both exhilarating and exhausting. Despite the long hours of work and travel (by canoe or by truck over very bumpy roads), it is an incredible privilege to meet with and to learn from the Tsimane’. I’ve been picking up some Tsimane’ phrases from the translators with whom we work, and can now have basic conversations with participants and make kids laugh, so that’s a lot of fun. A few children I met my first year greeted me by name when we went back this past year. That was a very special moment!

Translator Manuel Roca Moye (left) with Malinda McPherson and Josh McDermott in a fully loaded canoe. Photo: McDermott lab

What single scientific question do you hope to answer?

I’d be curious to figure out the overlaps and distinctions between how we perceive music versus speech, but I think one of the best aspects of science is that many of the important future questions haven’t been thought of yet!

Perception of musical pitch varies across cultures

People who are accustomed to listening to Western music, which is based on a system of notes organized in octaves, can usually perceive the similarity between notes that are the same but played in different registers — say, high C and middle C. However, a longstanding question is whether this is a universal phenomenon or one that has been ingrained by musical exposure.

This question has been hard to answer, in part because of the difficulty in finding people who have not been exposed to Western music. Now, a new study led by researchers from MIT and the Max Planck Institute for Empirical Aesthetics has found that unlike residents of the United States, people living in a remote area of the Bolivian rainforest usually do not perceive the similarities between two versions of the same note played at different registers (high or low).

“We’re finding that … there seems to be really striking variation in things that a lot of people would have presumed would be common across cultures and listeners,” says Josh McDermott, an associate professor in MIT’s Department of Brain and Cognitive Sciences.

The findings suggest that although there is a natural mathematical relationship between the frequencies of every “C,” no matter what octave it’s played in, the brain only becomes attuned to those similarities after hearing music based on octaves, McDermott says.

“It may well be that there is a biological predisposition to favor octave relationships, but it doesn’t seem to be realized unless you are exposed to music in an octave-based system,” says McDermott, who is also a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds and Machines.

The study also found that members of the Bolivian tribe, known as the Tsimane’, and Westerners do have a very similar upper limit on the frequency of notes that they can accurately distinguish, suggesting that that aspect of pitch perception may be independent of musical experience and biologically determined.

McDermott is the senior author of the study, which appears in the journal Current Biology on Sept. 19. Nori Jacoby, a former MIT postdoc who is now a group leader at the Max Planck Institute for Empirical Aesthetics, is the paper’s lead author. Other authors are Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile; Malinda McPherson, a graduate student in the Harvard/MIT Program in Speech and Hearing Bioscience and Technology; Joaquin Valdes, a graduate student at the Pontifical Catholic University of Chile; and Tomas Ossandon, an assistant professor at the Pontifical Catholic University of Chile.

Octaves apart

Cross-cultural studies of how music is perceived can shed light on the interplay between biological constraints and cultural influences that shape human perception. McDermott’s lab has performed several such studies with the participation of Tsimane’ tribe members, who live in relative isolation from Western culture and have had little exposure to Western music.

In a study published in 2016, McDermott and his colleagues found that Westerners and Tsimane’ had different aesthetic reactions to chords, or combinations of notes. To Western ears, the combination of C and F# is very grating, but Tsimane’ listeners rated this chord just as likeable as other chords that Westerners would interpret as more pleasant, such as C and G.

Later, Jacoby and McDermott found that both Westerners and Tsimane’ are drawn to musical rhythms composed of simple integer ratios, but the ratios they favor are different, based on which rhythms are more common in the music they listen to.

In their new study, the researchers studied pitch perception using an experimental design in which they played a very simple tune, only two or three notes, and then asked the listener to sing it back. The notes that were played could come from any octave within the range of human hearing, but listeners sang their responses within their vocal range, usually restricted to a single octave.

Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile, runs a musical pitch perception experiment with a member of the Tsimane’ tribe of the Bolivian rainforest. Photo: Josh McDermott

Western listeners, especially those who were trained musicians, tended to reproduce the tune an exact number of octaves above or below what they heard, though they were not specifically instructed to do so. In Western music, the frequency of a note doubles with each ascending octave, so tones with frequencies of 27.5 hertz, 55 hertz, 110 hertz, 220 hertz, and so on, are all heard as the note A.
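That octave relationship is just a statement about base-2 logarithms of frequency ratios, which a few lines of code make explicit; the frequencies are the A-note examples from the text.

```python
# Tiny illustration of the octave relationship: frequencies related by a
# power of two share the same note name.
import math

def octaves_apart(f1_hz, f2_hz):
    """How many octaves separate two frequencies (log base 2 of the ratio)."""
    return math.log2(f2_hz / f1_hz)

for f in [27.5, 55.0, 110.0, 220.0, 440.0]:
    print(f, "Hz is", octaves_apart(27.5, f), "octaves above A0")
# 27.5 Hz is 0.0 octaves above A0 ... 440.0 Hz is 4.0 octaves above A0
```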

Western listeners in the study, all of whom lived in New York or Boston, accurately reproduced sequences such as A-C-A, but in a different register, as though they heard the similarity of notes separated by octaves. However, the Tsimane’ did not.

“The relative pitch was preserved (between notes in the series), but the absolute pitch produced by the Tsimane’ didn’t have any relationship to the absolute pitch of the stimulus,” Jacoby says. “That’s consistent with the idea that perceptual similarity is something that we acquire from exposure to Western music, where the octave is structurally very important.”

The ability to reproduce the same note in different octaves may be honed by singing along with others whose natural registers are different, or singing along with an instrument being played in a different pitch range, Jacoby says.

Limits of perception

The study findings also shed light on the upper limits of pitch perception for humans. It has been known for a long time that Western listeners cannot accurately distinguish pitches above about 4,000 hertz, although they can still hear frequencies up to nearly 20,000 hertz. On a traditional 88-key piano, the highest note is about 4,100 hertz.

People have speculated that the piano was designed to go only that high because of a fundamental limit on pitch perception, but McDermott thought it could be possible that the opposite was true: That is, the limit was culturally influenced by the fact that few musical instruments produce frequencies higher than 4,000 hertz.

The researchers found that although Tsimane’ musical instruments usually have upper limits much lower than 4,000 hertz, Tsimane’ listeners could distinguish pitches very well up to about 4,000 hertz, as evidenced by accurate sung reproductions of those pitch intervals. Above that threshold, their perceptions broke down, very similarly to Western listeners.

“It looks almost exactly the same across groups, so we have some evidence for biological constraints on the limits of pitch,” Jacoby says.

One possible explanation for this limit is that once frequencies reach about 4,000 hertz, the firing rates of the neurons of our inner ear can’t keep up and we lose a critical cue with which to distinguish different frequencies.

“The new study contributes to the age-old debate about the interplay between culture and biological constraints in music,” says Daniel Pressnitzer, a senior research scientist at Paris Descartes University, who was not involved in the research. “This unique, precious, and extensive dataset demonstrates both striking similarities and unexpected differences in how Tsimane’ and Western listeners perceive or conceive musical pitch.”

Jacoby and McDermott now hope to expand their cross-cultural studies to other groups who have had little exposure to Western music, and to perform more detailed studies of pitch perception among the Tsimane’.

Such studies have already shown the value of including research participants other than the Western-educated, relatively wealthy college undergraduates who are the subjects of most academic studies on perception, McDermott says. These broader studies allow researchers to tease out different elements of perception that cannot be seen when examining only a single, homogenous group.

“We’re finding that there are some cross-cultural similarities, but there also seems to be really striking variation in things that a lot of people would have presumed would be common across cultures and listeners,” McDermott says. “These differences in experience can lead to dissociations of different aspects of perception, giving you clues to what the parts of the perceptual system are.”

The research was funded by the James S. McDonnell Foundation, the National Institutes of Health, and the Presidential Scholar in Society and Neuroscience Program at Columbia University.

Finding the brain’s compass

The world is constantly bombarding our senses with information, but the ways in which our brain extracts meaning from this information remain elusive. How do neurons transform raw visual input into a mental representation of an object – like a chair or a dog?

In work published today in Nature Neuroscience, MIT neuroscientists have identified a brain circuit in mice that distills “high-dimensional” complex information about the environment into a simple abstract object in the brain.

“There are no degree markings in the external world; our current head direction has to be extracted, computed, and estimated by the brain,” explains Ila Fiete, an associate member of the McGovern Institute and senior author of the paper. “The approaches we used allowed us to demonstrate the emergence of a low-dimensional concept, essentially an abstract compass in the brain.”

This abstract compass, according to the researchers, is a one-dimensional ring that represents the current direction of the head relative to the external world.

Schooling fish

Trying to show that a data cloud has a simple shape, like a ring, is a bit like watching a school of fish. By tracking one or two sardines, you might not see a pattern. But if you could map all of the sardines, and transform the noisy dataset into points representing the positions of the whole school of sardines over time, and where each fish is relative to its neighbors, a pattern would emerge. This model would reveal a ring shape, a simple shape formed by the collective positions of hundreds of individual fish.

Fiete, who is also an associate professor in MIT’s Department of Brain and Cognitive Sciences, used a similar approach, called topological modeling, to transform the activity of large populations of noisy neurons into a data cloud the shape of a ring.
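The study’s topological analysis does not fit in a few lines, but the intuition can be sketched: simulate a population of neurons tuned to head direction, and the noisy population activity, projected into a low-dimensional space, traces out a ring. The tuning model and noise levels below are illustrative assumptions, not the recorded data.

```python
# Minimal sketch of the population-level intuition: simulate head-direction-
# tuned neurons and project the noisy activity to two dimensions, where a
# ring emerges. (The actual study applied topological modeling to ADN data.)
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_neurons, n_samples = 50, 2000
preferred = rng.uniform(0, 2 * np.pi, n_neurons)   # preferred directions
heading = rng.uniform(0, 2 * np.pi, n_samples)     # head direction over time

# Cosine-bump tuning plus noise: each row is the population response at one
# moment in time.
activity = np.exp(2.0 * np.cos(heading[:, None] - preferred[None, :]))
activity += rng.normal(0, 0.5, activity.shape)

embedded = PCA(n_components=2).fit_transform(activity)
# Plotting "embedded" shows a closed ring: one point per time step,
# parameterized by the simulated head direction.
radii = np.hypot(embedded[:, 0], embedded[:, 1])
print("radius spread (std/mean):", radii.std() / radii.mean())
```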

Simple and persistent ring

Previous work in fly brains revealed a physical ellipsoid ring of neurons representing changes in the direction of the fly’s head, and researchers suspected that such a system might also exist in mammals.

In this new mouse study, Fiete and her colleagues measured hours of neural activity from scores of neurons in the anterodorsal thalamic nucleus (ADN) – a region believed to play a role in spatial navigation – as the animals moved freely around their environment. They mapped how the neurons in the ADN circuit fired as the animal’s head changed direction.

Together these data points formed a cloud in the shape of a simple and persistent ring.

“In the absence of this ring,” Fiete explains, “we would be lost in the world.”

“This tells us a lot about how neural networks are organized in the brain,” explains Edvard Moser, director of the Kavli Institute of Systems Neuroscience in Norway, who was not involved in the study. “Past data have indirectly pointed towards such a ring-like organization, but only now has it been possible, with the right cell numbers and methods, to demonstrate it convincingly.”

Their method for characterizing the shape of the data cloud allowed Fiete and colleagues to determine which variable the circuit was devoted to representing, and to decode this variable over time, using only the neural responses.

“The animal’s doing really complicated stuff,” explains Fiete, “but this circuit is devoted to integrating the animal’s speed along a one-dimensional compass that encodes head direction. Without a manifold approach, which captures the whole state space, you wouldn’t know that this circuit of thousands of neurons is encoding only this one aspect of the complex behavior, and not encoding any other variables at the same time.”

Even during sleep, when the circuit is not being bombarded with external information, this circuit robustly traces out the same one-dimensional ring, as if dreaming of past head direction trajectories.

Further analysis revealed that the ring acts as an attractor. If neurons stray off trajectory, they are drawn back to it, quickly correcting the system. This attractor property of the ring means that the representation of head direction in abstract space is reliably stable over time, a key requirement for maintaining a stable sense of where our head is relative to the world around us.
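A toy simulation conveys the attractor idea: a ring of neurons with local excitation and broad inhibition holds a bump of activity, and when the state is knocked off the ring, the dynamics pull it back. The connectivity and gain values below are illustrative, not the circuit’s measured biophysics.

```python
# Minimal ring-attractor sketch: local excitation plus broad inhibition
# holds a bump of activity; perturbations relax back onto the ring.
# All parameters are illustrative.
import numpy as np

n = 100
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
# Local excitation minus broad inhibition between neurons on a ring.
W = 8.0 * np.exp(3.0 * (np.cos(theta[:, None] - theta[None, :]) - 1)) - 1.5

rng = np.random.default_rng(1)
rate = np.exp(np.cos(theta - np.pi))   # a bump of activity at 180 degrees
rate += rng.normal(0, 0.3, n)          # knock the state off the ring

for _ in range(300):                   # relax under the network dynamics
    drive = W @ rate / n
    rate += 0.1 * (-rate + np.maximum(drive, 0.0))

print("bump re-forms near", round(np.degrees(theta[np.argmax(rate)])), "degrees")
```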

Shaping the future

Fiete’s work provides a first glimpse into how complex sensory information is distilled into a simple concept in the mind, and how that representation autonomously corrects errors, making it exquisitely stable.

But the implications of this study go beyond coding of head direction.

“Similar organization is probably present for other cognitive functions so the paper is likely to inspire numerous new studies,” says Moser.

Fiete sees these analyses and related studies carried out by colleagues at the Norwegian University of Science and Technology, Princeton University, the Weizmann Institute, and elsewhere as fundamental to the future of neural decoding studies.

With this approach, she explains, it is possible to extract abstract representations of the mind from the brain, potentially even thoughts and dreams.

“We’ve found that the brain deconstructs and represents complex things in the world with simple shapes,” explains Fiete. “Manifold-level analysis can help us to find those shapes, and they almost certainly exist beyond head direction circuits.”

How expectation influences perception

For decades, research has shown that our perception of the world is influenced by our expectations. These expectations, also called “prior beliefs,” help us make sense of what we are perceiving in the present, based on similar past experiences. Consider, for instance, how a shadow on a patient’s X-ray image, easily missed by a less experienced intern, jumps out at a seasoned physician. The physician’s prior experience helps her arrive at the most probable interpretation of a weak signal.

The process of combining prior knowledge with uncertain evidence is known as Bayesian integration and is believed to widely impact our perceptions, thoughts, and actions. Now, MIT neuroscientists have discovered distinctive brain signals that encode these prior beliefs. They have also found how the brain uses these signals to make judicious decisions in the face of uncertainty.

“How these beliefs come to influence brain activity and bias our perceptions was the question we wanted to answer,” says Mehrdad Jazayeri, the Robert A. Swanson Career Development Professor of Life Sciences, a member of MIT’s McGovern Institute for Brain Research, and the senior author of the study.

The researchers trained animals to perform a timing task in which they had to reproduce different time intervals. Performing this task is challenging because our sense of time is imperfect and can go too fast or too slow. However, when intervals are consistently within a fixed range, the best strategy is to bias responses toward the middle of the range. This is exactly what animals did. Moreover, recording from neurons in the frontal cortex revealed a simple mechanism for Bayesian integration: Prior experience warped the representation of time in the brain so that patterns of neural activity associated with different intervals were biased toward those that were within the expected range.

MIT postdoc Hansem Sohn, former postdoc Devika Narain, and graduate student Nicolas Meirhaeghe are the lead authors of the study, which appears in the July 15 issue of Neuron.

Ready, set, go

Statisticians have known for centuries that Bayesian integration is the optimal strategy for handling uncertain information. When we are uncertain about something, we automatically rely on our prior experiences to optimize behavior.

“If you can’t quite tell what something is, but from your prior experience you have some expectation of what it ought to be, then you will use that information to guide your judgment,” Jazayeri says. “We do this all the time.”

In this new study, Jazayeri and his team wanted to understand how the brain encodes prior beliefs, and put those beliefs to use in the control of behavior. To that end, the researchers trained animals to reproduce a time interval, using a task called “ready-set-go.” In this task, animals measure the time between two flashes of light (“ready” and “set”) and then generate a “go” signal by making a delayed response after the same amount of time has elapsed.

They trained the animals to perform this task in two contexts. In the “Short” scenario, intervals varied between 480 and 800 milliseconds, and in the “Long” context, intervals were between 800 and 1,200 milliseconds. At the beginning of the task, the animals were given the information about the context (via a visual cue), and therefore knew to expect intervals from either the shorter or longer range.

Jazayeri had previously shown that humans performing this task tend to bias their responses toward the middle of the range. Here, they found that animals do the same. For example, if animals believed the interval would be short, and were given an interval of 800 milliseconds, the interval they produced was a little shorter than 800 milliseconds. Conversely, if they believed it would be long, and were given the same 800-millisecond interval, they produced an interval a bit longer than 800 milliseconds.
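This bias is exactly what a Bayesian observer would produce: combine a noisy measurement of the interval with the prior for the current context and report the posterior mean, which is pulled toward the interior of the range. A minimal numerical sketch, assuming Gaussian measurement noise and a uniform prior (both simplifications):

```python
# Why responses are biased toward the middle of the range: a Bayesian
# observer combines a noisy measurement with a prior that is uniform over
# the context's range and reports the posterior mean. The noise level is
# illustrative; the 800 ms measurement is the example from the text.
import numpy as np

def bayes_estimate(measured_ms, lo, hi, noise_sd=80.0):
    grid = np.linspace(lo, hi, 1000)               # possible true intervals
    prior = np.ones_like(grid)                     # uniform over the range
    likelihood = np.exp(-0.5 * ((measured_ms - grid) / noise_sd) ** 2)
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return float(np.sum(grid * posterior))         # posterior mean

print(bayes_estimate(800, 480, 800))   # "Short" context: estimate < 800 ms
print(bayes_estimate(800, 800, 1200))  # "Long" context: estimate > 800 ms
```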

“Trials that were identical in almost every possible way, except for the animal’s belief, led to different behaviors,” Jazayeri says. “That was compelling experimental evidence that the animal is relying on its own belief.”

Once they had established that the animals relied on their prior beliefs, the researchers set out to find how the brain encodes prior beliefs to guide behavior. They recorded activity from about 1,400 neurons in a region of the frontal cortex, which they have previously shown is involved in timing.

During the “ready-set” epoch, the activity profile of each neuron evolved in its own way, and about 60 percent of the neurons had different activity patterns depending on the context (Short versus Long). To make sense of these signals, the researchers analyzed the evolution of neural activity across the entire population over time, and found that prior beliefs bias behavioral responses by warping the neural representation of time toward the middle of the expected range.

“We have never seen such a concrete example of how the brain uses prior experience to modify the neural dynamics by which it generates sequences of neural activities, to correct for its own imprecision. This is the unique strength of this paper: bringing together perception, neural dynamics, and Bayesian computation into a coherent framework, supported by both theory and measurements of behavior and neural activities,” says Mate Lengyel, a professor of computational neuroscience at Cambridge University, who was not involved in the study.

Embedded knowledge

Researchers believe that prior experiences change the strength of connections between neurons. The strength of these connections, also known as synapses, determines how neurons act upon one another and constrains the patterns of activity that a network of interconnected neurons can generate. The finding that prior experiences warp the patterns of neural activity provides a window onto how experience alters synaptic connections. “The brain seems to embed prior experiences into synaptic connections so that patterns of brain activity are appropriately biased,” Jazayeri says.

As an independent test of these ideas, the researchers developed a computer model consisting of a network of neurons that could perform the same ready-set-go task. Using techniques borrowed from machine learning, they were able to modify the synaptic connections and create a model that behaved like the animals.

These models are extremely valuable as they provide a substrate for the detailed analysis of the underlying mechanisms, a procedure that is known as “reverse-engineering.” Remarkably, reverse-engineering the model revealed that it solved the task the same way the monkeys’ brain did. The model also had a warped representation of time according to prior experience.

The researchers used the computer model to further dissect the underlying mechanisms using perturbation experiments that are currently impossible to do in the brain. Using this approach, they were able to show that unwarping the neural representations removes the bias in the behavior. This important finding validated the critical role of warping in Bayesian integration of prior knowledge.

The researchers now plan to study how the brain builds up and slowly fine-tunes the synaptic connections that encode prior beliefs as an animal is learning to perform the timing task.

The research was funded by the Center for Sensorimotor Neural Engineering, the Netherlands Scientific Organization, the Marie Sklodowska Curie Reintegration Grant, the National Institutes of Health, the Sloan Foundation, the Klingenstein Foundation, the Simons Foundation, the McKnight Foundation, and the McGovern Institute.

Evelina Fedorenko

Exploring Language

Evelina (Ev) Fedorenko aims to understand how the language system works in the brain. Her lab is unpacking the internal architecture of the brain’s language system and exploring the relationship between language and various cognitive, perceptual, and motor systems. To do this, her lab employs a range of approaches – from brain imaging to computational modeling – and works with diverse populations, including polyglots and individuals with atypical brains. Language is a quintessential human ability, but the function that language serves has been debated for centuries. Fedorenko argues that language serves primarily as a tool for communication, contrary to a prominent view that language is essential for thinking.

Ultimately, this cutting-edge work is uncovering the computations and representations that fuel language processing in the brain.

Antenna-like inputs unexpectedly active in neural computation

Most neurons have many branching extensions called dendrites that receive input from thousands of other neurons. Dendrites aren’t just passive information-carriers, however. According to a new study from MIT, they appear to play a surprisingly large role in neurons’ ability to translate incoming signals into electrical activity.

Neuroscientists had previously suspected that dendrites might be active only rarely, under specific circumstances, but the MIT team found that dendrites are nearly always active when the main cell body of the neuron is active.

“It seems like dendritic spikes are an intrinsic feature of how neurons in our brain can compute information. They’re not a rare event,” says Lou Beaulieu-Laroche, an MIT graduate student and the lead author of the study. “All the neurons that we looked at had these dendritic spikes, and they had dendritic spikes very frequently.”

The findings suggest that the role of dendrites in the brain’s computational ability is much larger than had previously been thought, says Mark Harnett, who is the Fred and Carole Middleton Career Development Assistant Professor of Brain and Cognitive Sciences, a member of the McGovern Institute for Brain Research, and the senior author of the paper.

“It’s really quite different than how the field had been thinking about this,” he says. “This is evidence that dendrites are actively engaged in producing and shaping the outputs of neurons.”

Graduate student Enrique Toloza and technical associate Norma Brown are also authors of the paper, which appears in Neuron on June 6.

“A far-flung antenna”

Dendrites receive input from many other neurons and carry those signals to the cell body, also called the soma. If stimulated enough, a neuron fires an action potential — an electrical impulse that spreads to other neurons. Large networks of these neurons communicate with each other to perform complex cognitive tasks such as producing speech.
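A classic caricature of this summing-and-thresholding is the leaky integrate-and-fire model, sketched below with purely illustrative constants: inputs accumulate at the soma, and a spike fires when the summed drive crosses threshold.

```python
# Minimal leaky integrate-and-fire sketch of the summing described above.
# All constants are illustrative, not fitted to real neurons.
import numpy as np

rng = np.random.default_rng(0)
dt, tau, v_rest, v_thresh = 1.0, 20.0, -70.0, -55.0   # ms, ms, mV, mV
v = v_rest
spikes = []
inputs = rng.normal(0.6, 1.5, 1000)   # net synaptic drive per ms, in mV

for t, drive in enumerate(inputs):
    v += dt / tau * (v_rest - v) + drive
    if v >= v_thresh:                 # threshold crossed: action potential
        spikes.append(t)
        v = v_rest                    # reset after the spike
print("fired", len(spikes), "spikes in 1 second")
```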

Through imaging and electrical recording, neuroscientists have learned a great deal about the anatomical and functional differences between different types of neurons in the brain’s cortex, but little is known about how they incorporate dendritic inputs and decide whether to fire an action potential. Dendrites give neurons their characteristic branching tree shape, and the size of the “dendritic arbor” far exceeds the size of the soma.

“It’s an enormous, far-flung antenna that’s listening to thousands of synaptic inputs distributed in space along that branching structure from all the other neurons in the network,” Harnett says.

Some neuroscientists have hypothesized that dendrites are active only rarely, while others thought it possible that dendrites play a more central role in neurons’ overall activity. Until now, it has been difficult to test which of these ideas is more accurate, Harnett says.

To explore dendrites’ role in neural computation, the MIT team used calcium imaging to simultaneously measure activity in both the soma and dendrites of individual neurons in the visual cortex of the brain. Calcium flows into neurons when they are electrically active, so this measurement allowed the researchers to compare the activity of dendrites and soma of the same neuron. The imaging was done while mice performed simple tasks such as running on a treadmill or watching a movie.

Unexpectedly, the researchers found that activity in the soma was highly correlated with dendrite activity. That is, when the soma of a particular neuron was active, the dendrites of that neuron were also active most of the time. This was particularly surprising because the animals weren’t performing any kind of cognitively demanding task, Harnett says.

“They weren’t engaged in a task where they had to really perform and call upon cognitive processes or memory. This is pretty simple, low-level processing, and already we have evidence for active dendritic processing in almost all the neurons,” he says. “We were really surprised to see that.”
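The quantity at the heart of that claim is simple: the correlation between soma and dendrite signals. A toy version with simulated calcium fluorescence traces (all numbers invented for illustration):

```python
# Toy version of the soma-dendrite comparison: simulate two calcium traces
# that share underlying events, then measure their correlation.
import numpy as np

rng = np.random.default_rng(0)
events = (rng.random(3000) < 0.01).astype(float)   # shared spiking events

def calcium_trace(events, noise_sd):
    # Convolve events with a slow decay kernel to mimic a calcium indicator.
    kernel = np.exp(-np.arange(100) / 30.0)
    trace = np.convolve(events, kernel)[: len(events)]
    return trace + rng.normal(0, noise_sd, len(events))

soma = calcium_trace(events, noise_sd=0.1)
dendrite = calcium_trace(events, noise_sd=0.2)     # noisier compartment
print("soma-dendrite correlation:", np.corrcoef(soma, dendrite)[0, 1])
```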

Evolving patterns

The researchers don’t yet know precisely how dendritic input contributes to neurons’ overall activity, or what exactly the neurons they studied are doing.

“We know that some of those neurons respond to some visual stimuli, but we don’t necessarily know what those individual neurons are representing. All we can say is that whatever the neuron is representing, the dendrites are actively participating in that,” Beaulieu-Laroche says.

While more work remains to determine exactly how the activity in the dendrites and the soma are linked, “it is these tour-de-force in vivo measurements that are critical for explicitly testing hypotheses regarding electrical signaling in neurons,” says Marla Feller, a professor of neurobiology at the University of California at Berkeley, who was not involved in the research.

The MIT team now plans to investigate how dendritic activity contributes to overall neuronal function by manipulating dendrite activity and then measuring how it affects the activity of the cell body, Harnett says. They also plan to study whether the activity patterns they observed evolve as animals learn a new task.

“One hypothesis is that dendritic activity will actually sharpen up for representing features of a task you taught the animals, and all the other dendritic activity, and all the other somatic activity, is going to get dampened down in the rest of the cortical cells that are not involved,” Harnett says.

The research was funded by the Natural Sciences and Engineering Research Council of Canada and the U.S. National Institutes of Health.

How we make complex decisions

When making a complex decision, we often break the problem down into a series of smaller decisions. For example, when deciding how to treat a patient, a doctor may go through a hierarchy of steps — choosing a diagnostic test, interpreting the results, and then prescribing a medication.

Making hierarchical decisions is straightforward when the sequence of choices leads to the desired outcome. But when the result is unfavorable, it can be tough to decipher what went wrong. For example, if a patient doesn’t improve after treatment, there are many possible reasons why: Maybe the diagnostic test is accurate only 75 percent of the time, or perhaps the medication works for only 50 percent of patients. To decide what to do next, the doctor must take these probabilities into account.

In a new study, MIT neuroscientists explored how the brain reasons about probable causes of failure after a hierarchy of decisions. They discovered that the brain performs two computations using a distributed network of areas in the frontal cortex. First, the brain computes confidence over the outcome of each decision to figure out the most likely cause of a failure, and second, when it is not easy to discern the cause, the brain makes additional attempts to gain more confidence.

“Creating a hierarchy in one’s mind and navigating that hierarchy while reasoning about outcomes is one of the exciting frontiers of cognitive neuroscience,” says Mehrdad Jazayeri, the Robert A. Swanson Career Development Professor of Life Sciences, a member of MIT’s McGovern Institute for Brain Research, and the senior author of the study.

MIT graduate student Morteza Sarafyazd is the lead author of the paper, which appears in Science on May 16.

Hierarchical reasoning

Previous studies of decision-making in animal models have focused on relatively simple tasks. One line of research has focused on how the brain makes rapid decisions by evaluating momentary evidence. For example, a large body of work has characterized the neural substrates and mechanisms that allow animals to categorize unreliable stimuli on a trial-by-trial basis. Other research has focused on how the brain chooses among multiple options by relying on previous outcomes across multiple trials.

“These have been very fruitful lines of work,” Jazayeri says. “However, they really are the tip of the iceberg of what humans do when they make decisions. As soon as you put yourself in any real decision-making situation, be it choosing a partner, choosing a car, deciding whether to take this drug or not, these become really complicated decisions. Oftentimes there are many factors that influence the decision, and those factors can operate at different timescales.”

The MIT team devised a behavioral task that allowed them to study how the brain processes information at multiple timescales to make decisions. The basic design was that animals would make one of two eye movements depending on whether the time interval between two flashes of light was shorter or longer than 850 milliseconds.

A twist required the animals to solve the task through hierarchical reasoning: The rule that determined which of the two eye movements had to be made switched covertly after 10 to 28 trials. Therefore, to receive reward, the animals had to choose the correct rule, and then make the correct eye movement depending on the rule and interval. However, because the animals were not instructed about the rule switches, they could not straightforwardly determine whether an error was caused because they chose the wrong rule or because they misjudged the interval.

The researchers used this experimental design to probe the computational principles and neural mechanisms that support hierarchical reasoning. Theory and behavioral experiments in humans suggest that reasoning about the potential causes of errors depends in large part on the brain’s ability to measure the degree of confidence in each step of the process. “One of the things that is thought to be critical for hierarchical reasoning is to have some level of confidence about how likely it is that different nodes [of a hierarchy] could have led to the negative outcome,” Jazayeri says.

The researchers were able to study the effect of confidence by adjusting the difficulty of the task. In some trials, the interval between the two flashes was much shorter or longer than 850 milliseconds. These trials were relatively easy and afforded a high degree of confidence. In other trials, the animals were less confident in their judgments because the interval was closer to the boundary and difficult to discriminate.

As they had hypothesized, the researchers found that the animals’ behavior was influenced by their confidence in their performance. When the interval was easy to judge, the animals were much quicker to switch to the other rule when they found out they were wrong. When the interval was harder to judge, the animals were less confident in their performance and applied the same rule a few more times before switching.

“They know that they’re not confident, and they know that if they’re not confident, it’s not necessarily the case that the rule has changed. They know they might have made a mistake [in their interval judgment],” Jazayeri says.
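That reasoning can be written down as a toy probability calculation: given an error, how likely is it that the rule switched rather than that the interval was misjudged? The sketch below assumes Gaussian interval noise and a fixed switch probability, both illustrative numbers rather than fitted parameters.

```python
# After an error, infer whether the rule switched or the interval was
# misjudged. An error arises either because the rule switched while the
# judgment was correct, or because the judgment itself was wrong.
import math

def p_rule_switched(interval_ms, boundary_ms=850.0, noise_sd=100.0,
                    p_switch=0.05):
    # Confidence in the interval judgment: probability the perceived
    # category (shorter/longer than the boundary) was correct.
    z = abs(interval_ms - boundary_ms) / (noise_sd * math.sqrt(2))
    p_correct_judgment = 0.5 * (1 + math.erf(z))
    p_error_from_switch = p_switch * p_correct_judgment
    p_error_from_misjudgment = (1 - p_switch) * (1 - p_correct_judgment)
    return p_error_from_switch / (p_error_from_switch + p_error_from_misjudgment)

print(p_rule_switched(600))  # easy trial: an error strongly implies a switch
print(p_rule_switched(840))  # hard trial: the error is likely a misjudgment
```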

Decision-making circuit

By recording neural activity in the frontal cortex just after each trial was finished, the researchers were able to identify two regions that are key to hierarchical decision-making. They found that both of these regions, known as the anterior cingulate cortex (ACC) and dorsomedial frontal cortex (DMFC), became active after the animals were informed about an incorrect response. When the researchers analyzed the neural activity in relation to the animals’ behavior, it became clear that neurons in both areas signaled the animals’ belief about a possible rule switch. Notably, the activity related to animals’ belief was “louder” when animals made a mistake after an easy trial, and after consecutive mistakes.

The researchers also found that while these areas showed similar patterns of activity, it was activity in the ACC in particular that predicted when the animal would switch rules, suggesting that ACC plays a central role in switching decision strategies. Indeed, the researchers found that direct manipulation of neural activity in ACC was sufficient to interfere with the animals’ rational behavior.

“There exists a distributed circuit in the frontal cortex involving these two areas, and they seem to be hierarchically organized, just like the task would demand,” Jazayeri says.

Daeyeol Lee, a professor of neuroscience, psychology, and psychiatry at Yale School of Medicine, says the study overcomes what has been a major obstacle in studying this kind of decision-making, namely, a lack of animal models to study the dynamics of brain activity at single-neuron resolution.

“Sarafyazd and Jazayeri have developed an elegant decision-making task that required animals to evaluate multiple types of evidence, and identified how the two separate regions in the medial frontal cortex are critically involved in handling different sources of errors in decision making,” says Lee, who was not involved in the research. “This study is a tour de force in both rigor and creativity, and peels off another layer of mystery about the prefrontal cortex.”

Algorithms of intelligence

The following post is adapted from a story featured in a recent Brain Scan newsletter.

Machine vision systems are more and more common in everyday life, from social media to self-driving cars, but training artificial neural networks to “see” the world as we do—distinguishing cyclists from signposts—remains challenging. Will artificial neural networks ever decode the world as exquisitely as humans? Can we refine these models and influence perception in a person’s brain just by activating individual, selected neurons? The DiCarlo lab, including CBMM postdocs Kohitij Kar and Pouya Bashivan, is finding that we are surprisingly close to answering “yes” to such questions, all in the context of accelerated insights into artificial intelligence at the McGovern Institute for Brain Research, CBMM, and the Quest for Intelligence at MIT.

Precision Modeling

Beyond light hitting the retina, the recognition process that unfolds in the visual cortex is key to truly “seeing” the surrounding world. Information is decoded through the ventral visual stream, cortical brain regions that progressively build a more accurate, fine-grained, and accessible representation of the objects around us. Artificial neural networks have been modeled on these elegant cortical systems, and the most successful models, deep convolutional neural networks (DCNNs), can now decode objects at levels comparable to the primate brain. However, even leading DCNNs have problems with certain challenging images, presumably due to shadows, clutter, and other visual noise. While there’s no simple feature that unites all challenging images, the quest is on to tackle such images to attain precise recognition at a level commensurate with human object recognition.

“One next step is to couple this new precision tool with our emerging understanding of how neural patterns underlie object perception. This might allow us to create arrangements of pixels that look nothing like, for example, a cat, but that can fool the brain into thinking it’s seeing a cat.” – James DiCarlo

In a recent push, Kar and DiCarlo demonstrated that adding feedback connections, currently missing in most DCNNs, allows the system to better recognize objects in challenging situations, even those where a human can’t articulate why recognition is an issue for feedforward DCNNs. They also found that this recurrent circuit seems critical to primate success rates in performing this task. This is incredibly important for systems like self-driving cars, where the stakes for artificial visual systems are high, and faithful recognition is a must.

Now you see it

As artificial object recognition systems have become more precise in predicting neural activity, the DiCarlo lab wondered what such precision might allow: could they use their system to not only predict, but to control specific neuronal activity?

To demonstrate the power of their models, Bashivan, Kar, and colleagues zeroed in on targeted neurons in the brain. In a paper published in Science, they used an artificial neural network to generate a random-looking group of pixels that, when shown to an animal, activated the team’s target, which they called a “one hot neuron.” In other words, they showed the brain a synthetic pattern, and the pixels in the pattern precisely activated targeted neurons while other neurons remained relatively silent.
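The flavor of that optimization can be sketched by letting a unit in a pretrained network stand in for the recorded neuron: adjust an image to drive one unit while penalizing the rest. The model, layer, and unit below are illustrative stand-ins, not the study’s actual setup, which coupled synthesized images to recorded neural responses.

```python
# Sketch of the "one hot neuron" idea with a model unit standing in for a
# recorded neuron: optimize an image to drive one channel of a chosen layer
# while quieting the others. Model, layer, and unit index are illustrative.
import torch
import torchvision.models as models

net = models.alexnet(pretrained=True).eval()
for p in net.parameters():
    p.requires_grad_(False)

acts = {}
net.features[8].register_forward_hook(lambda m, i, o: acts.update(feat=o))

img = torch.rand(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
target_unit = 7   # channel index standing in for the target neuron

for _ in range(300):
    opt.zero_grad()
    net(img)
    feat = acts["feat"].mean(dim=(0, 2, 3))   # mean activation per channel
    others = (feat.sum() - feat[target_unit]) / (feat.numel() - 1)
    loss = others - feat[target_unit]         # drive target, quiet the rest
    loss.backward()
    opt.step()
    img.data.clamp_(0, 1)
# "img" ends up as a random-looking pattern that strongly drives the target
# unit while leaving its neighbors relatively silent.
```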

These findings show how the knowledge in today’s artificial neural network models might one day be used to noninvasively influence brain states with neural resolution. Such precise systems would be useful as we look to the future, toward visual prosthetics for the blind. Such a precise model of the ventral visual stream would have been inconceivable not so long ago, and all eyes are on where McGovern researchers will take these technologies in the coming years.