Nancy Kanwisher

Architecture of the Mind

What is the nature of the human mind? Philosophers have debated this question for centuries, but Nancy Kanwisher approaches it empirically, using brain imaging to look for components of the human mind that reside in particular regions of the brain. Her lab has identified cortical regions that are selectively engaged in the perception of faces, places, and bodies, and other regions specialized for uniquely human functions including music, language, and thinking about other people’s thoughts. More recently, her lab has begun using artificial neural networks to unpack these findings and examine why, from a computational standpoint, the brain exhibits functional specialization in the first place.

Tomaso Poggio

Engineering Intelligence

Tomaso Poggio is one of the founders of computational neuroscience. He pioneered a model of the fly’s visual system as well as of human stereovision. His research has always been interdisciplinary, bridging brains and computers. It is now focused on the mathematics of deep learning and on the computational neuroscience of the visual cortex. Poggio also introduced regularization theory to computational vision, made key contributions to the biophysics of computation and to learning theory, and developed an influential model of recognition in the visual cortex. Research in the Poggio lab is guided by the belief that understanding learning is at the heart of understanding both biological and artificial intelligence. Learning is therefore the route both to understanding how the human brain works and to making intelligent machines.

Mark Harnett

Listening to Neurons

Mark Harnett studies how the biophysical features of individual neurons, including ion channels, receptors, and membrane electrical properties, endow neural circuits with the ability to process information and perform the complex computations that underlie behavior. As part of this work, the Harnett lab was the first to describe the physiological properties of human dendrites, the elaborate tree-like structures through which neurons receive the vast majority of their synaptic inputs. Harnett also examines how computations are instantiated in neural circuits to produce complex behaviors such as spatial navigation.

Satrajit Ghosh

Personalized Medicine

A fundamental problem in psychiatry is that there are no biological markers for diagnosing mental illness or for indicating how best to treat it. Treatment decisions are based entirely on symptoms, and doctors and their patients will typically try one treatment, then if it does not work, try another, and perhaps another. Satrajit Ghosh hopes to change this picture, and his research suggests that individual brain scans and speaking patterns can hold valuable information for guiding psychiatrists and patients. His research group develops novel analytic platforms that use such information to create robust, predictive models around human health. Current areas include depression, suicide, anxiety disorders, autism, Parkinson’s disease, and brain tumors.

James DiCarlo

Rapid Recognition

DiCarlo’s research goal is to reverse engineer the brain mechanisms that underlie human visual intelligence. He and his collaborators have revealed how population image transformations carried out by a deep stack of interconnected neocortical brain areas — called the primate ventral visual stream — are effortlessly able to extract object identity from visual images. His team uses a combination of large-scale neurophysiology, brain imaging, direct neural perturbation methods, and machine learning methods to build and test neurally-mechanistic computational models of the ventral visual stream and its support of cognition and behavior. Such an engineering-based understanding is likely to lead to new artificial vision and artificial intelligence approaches, new brain-machine interfaces to restore or augment lost senses, and a new foundation to ameliorate disorders of the mind.

Machines that learn language more like kids do

Children learn language by observing their environment, listening to the people around them, and connecting the dots between what they see and hear. Among other things, this helps children establish their language’s word order, such as where subjects and verbs fall in a sentence.

In computing, learning language is the task of syntactic and semantic parsers. These systems are trained on sentences annotated by humans that describe the structure and meaning behind words. Parsers are becoming increasingly important for web searches, natural-language database querying, and voice-recognition systems such as Alexa and Siri. Soon, they may also be used for home robotics.

But gathering the annotation data can be time-consuming and difficult for less common languages. Additionally, humans don’t always agree on the annotations, and the annotations themselves may not accurately reflect how people naturally speak.

In a paper being presented at this week’s Empirical Methods in Natural Language Processing conference, MIT researchers describe a parser that learns through observation to more closely mimic a child’s language-acquisition process, which could greatly extend the parser’s capabilities. To learn the structure of language, the parser observes captioned videos, with no other information, and associates the words with recorded objects and actions. Given a new sentence, the parser can then use what it’s learned about the structure of the language to accurately predict a sentence’s meaning, without the video.

This “weakly supervised” approach — meaning it requires limited training data — mimics how children can observe the world around them and learn language, without anyone providing direct context. The approach could expand the types of data and reduce the effort needed for training parsers, according to the researchers. A few directly annotated sentences, for instance, could be combined with many captioned videos, which are easier to come by, to improve performance.

In the future, the parser could be used to improve natural interaction between humans and personal robots. A robot equipped with the parser, for instance, could constantly observe its environment to reinforce its understanding of spoken commands, including when the spoken sentences aren’t fully grammatical or clear. “People talk to each other in partial sentences, run-on thoughts, and jumbled language. You want a robot in your home that will adapt to their particular way of speaking … and still figure out what they mean,” says co-author Andrei Barbu, a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Center for Brains, Minds, and Machines (CBMM) within MIT’s McGovern Institute.

The parser could also help researchers better understand how young children learn language. “A child has access to redundant, complementary information from different modalities, including hearing parents and siblings talk about the world, as well as tactile information and visual information, [which help him or her] to understand the world,” says co-author Boris Katz, a principal research scientist and head of the InfoLab Group at CSAIL. “It’s an amazing puzzle, to process all this simultaneous sensory input. This work is part of bigger piece to understand how this kind of learning happens in the world.”

Co-authors on the paper are: first author Candace Ross, a graduate student in the Department of Electrical Engineering and Computer Science and CSAIL, and a researcher in CBMM; Yevgeni Berzak PhD ’17, a postdoc in the Computational Psycholinguistics Group in the Department of Brain and Cognitive Sciences; and CSAIL graduate student Battushig Myanganbayar.

Visual learner

For their work, the researchers combined a semantic parser with a computer-vision component trained in object, human, and activity recognition in video. Semantic parsers are generally trained on sentences annotated with code that ascribes meaning to each word and the relationships between the words. Some have been trained on still images or computer simulations.

The new parser is the first to be trained using video, Ross says. In part, videos are more useful in reducing ambiguity. If the parser is unsure about, say, an action or object in a sentence, it can reference the video to clear things up. “There are temporal components — objects interacting with each other and with people — and high-level properties you wouldn’t see in a still image or just in language,” Ross says.

The researchers compiled a dataset of about 400 videos depicting people carrying out a number of actions, including picking up an object or putting it down, and walking toward an object. Participants on the crowdsourcing platform Mechanical Turk then provided 1,200 captions for those videos. They set aside 840 video-caption examples for training and tuning, and used 360 for testing. One advantage of using vision-based parsing is “you don’t need nearly as much data — although if you had [the data], you could scale up to huge datasets,” Barbu says.

In training, the researchers gave the parser the objective of determining whether a sentence accurately describes a given video. They fed the parser a video and matching caption. The parser extracts possible meanings of the caption as logical mathematical expressions. The sentence, “The woman is picking up an apple,” for instance, may be expressed as: λxy. woman x, pick_up x y, apple y.
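As a rough illustration of what it means for such an expression to be "true of" a video, the sketch below checks the conjunction against a hand-written symbolic scene. The scene format, the entity labels, and the `is_true_of` helper are all assumptions made for this example; the system described here scores expressions against tracked video frames rather than a ready-made list of facts.

```python
from itertools import product

# Hypothetical symbolic summary of one video: each fact is a predicate
# name followed by the entities it applies to.
SCENE = {
    ("woman", "p1"),
    ("apple", "o1"),
    ("table", "o2"),
    ("pick_up", "p1", "o1"),
}

# "The woman is picking up an apple" as a conjunction over variables x and y:
# woman x, pick_up x y, apple y
EXPRESSION = [("woman", "x"), ("pick_up", "x", "y"), ("apple", "y")]


def is_true_of(expression, scene):
    """Return True if some assignment of scene entities to the variables
    turns every predicate in the expression into a fact of the scene."""
    variables = sorted({arg for _, *args in expression for arg in args})
    entities = sorted({arg for _, *args in scene for arg in args})
    for values in product(entities, repeat=len(variables)):
        binding = dict(zip(variables, values))
        grounded = {
            (name, *(binding[arg] for arg in args))
            for name, *args in expression
        }
        if grounded <= scene:  # every grounded predicate is a known fact
            return True
    return False
```

Calling `is_true_of(EXPRESSION, SCENE)` succeeds with x bound to `p1` and y bound to `o1`, while an expression that swaps `apple` for `table` fails because nothing in the scene picks up `o2`.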

Those expressions and the video are fed to the computer-vision algorithm, called “Sentence Tracker,” developed by Barbu and other researchers. The algorithm looks at each video frame to track how objects and people transform over time, to determine if actions are playing out as described. In this way, it determines if the meaning is possibly true of the video.

Connecting the dots

The expression with the most closely matching representations for objects, humans, and actions becomes the most likely meaning of the caption. The expression, initially, may refer to many different objects and actions in the video, but the set of possible meanings serves as a training signal that helps the parser continuously winnow down possibilities. “By assuming that all of the sentences must follow the same rules, that they all come from the same language, and seeing many captioned videos, you can narrow down the meanings further,” Barbu says.
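The winnowing idea can be illustrated with a much simpler stand-in: keep, for each word, only the concepts that appear in every video whose caption uses that word. This is a toy form of cross-situational learning, not the training procedure from the paper, and the captions, concept labels, and `winnow` function are invented for the example.

```python
def winnow(examples):
    """For each word, intersect the concepts present in every video
    whose caption contains that word."""
    candidates = {}
    for caption, concepts in examples:
        for word in caption.split():
            if word in candidates:
                candidates[word] &= concepts  # drop concepts absent here
            else:
                candidates[word] = set(concepts)
    return candidates


# Hypothetical data: (caption, concepts detected in the matching video).
EXAMPLES = [
    ("woman picks_up apple", {"WOMAN", "PICK_UP", "APPLE"}),
    ("woman puts_down cup", {"WOMAN", "PUT_DOWN", "CUP"}),
    ("man picks_up cup", {"MAN", "PICK_UP", "CUP"}),
    ("man puts_down apple", {"MAN", "PUT_DOWN", "APPLE"}),
]
```

After the first caption alone, "woman" could mean any of three concepts; once the second caption is seen, only `WOMAN` survives the intersection. The same narrowing happens for every other word as more captioned videos arrive.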

In short, the parser learns through passive observation: To determine if a caption is true of a video, the parser by necessity must identify the highest probability meaning of the caption. “The only way to figure out if the sentence is true of a video [is] to go through this intermediate step of, ‘What does the sentence mean?’ Otherwise, you have no idea how to connect the two,” Barbu explains. “We don’t give the system the meaning for the sentence. We say, ‘There’s a sentence and a video. The sentence has to be true of the video. Figure out some intermediate representation that makes it true of the video.’”

The training produces a syntactic and semantic grammar for the words it’s learned. Given a new sentence, the parser no longer requires videos, but leverages its grammar and lexicon to determine sentence structure and meaning.

Ultimately, this process is learning “as if you’re a kid,” Barbu says. “You see world around you and hear people speaking to learn meaning. One day, I can give you a sentence and ask what it means and, even without a visual, you know the meaning.”

“This research is exactly the right direction for natural language processing,” says Stefanie Tellex, a professor of computer science at Brown University who focuses on helping robots use natural language to communicate with humans. “To interpret grounded language, we need semantic representations, but it is not practicable to make it available at training time. Instead, this work captures representations of compositional structure using context from captioned videos. This is the paper I have been waiting for!”

In future work, the researchers are interested in modeling interactions, not just passive observations. “Children interact with the environment as they’re learning. Our idea is to have a model that would also use perception to learn,” Ross says.

This work was supported, in part, by the CBMM, the National Science Foundation, a Ford Foundation Graduate Research Fellowship, the Toyota Research Institute, and the MIT-IBM Brain-Inspired Multimedia Comprehension project.

Electrical properties of dendrites help explain our brain’s unique computing power

Neurons in the human brain receive electrical signals from thousands of other cells, and long neural extensions called dendrites play a critical role in incorporating all of that information so the cells can respond appropriately.

Using hard-to-obtain samples of human brain tissue, MIT neuroscientists have now discovered that human dendrites have different electrical properties from those of other species. Their studies reveal that electrical signals weaken more as they flow along human dendrites, resulting in a higher degree of electrical compartmentalization, meaning that small sections of dendrites can behave independently from the rest of the neuron.

These differences may contribute to the enhanced computing power of the human brain, the researchers say.

“It’s not just that humans are smart because we have more neurons and a larger cortex. From the bottom up, neurons behave differently,” says Mark Harnett, the Fred and Carole Middleton Career Development Assistant Professor of Brain and Cognitive Sciences. “In human neurons, there is more electrical compartmentalization, and that allows these units to be a little bit more independent, potentially leading to increased computational capabilities of single neurons.”

Harnett, who is also a member of MIT’s McGovern Institute for Brain Research, and Sydney Cash, an assistant professor of neurology at Harvard Medical School and Massachusetts General Hospital, are the senior authors of the study, which appears in the Oct. 18 issue of Cell. The paper’s lead author is Lou Beaulieu-Laroche, a graduate student in MIT’s Department of Brain and Cognitive Sciences.

Neural computation

Dendrites can be thought of as analogous to transistors in a computer, performing simple operations using electrical signals. Dendrites receive input from many other neurons and carry those signals to the cell body. If stimulated enough, a neuron fires an action potential — an electrical impulse that then stimulates other neurons. Large networks of these neurons communicate with each other to generate thoughts and behavior.

The structure of a single neuron often resembles a tree, with many branches bringing in information that arrives far from the cell body. Previous research has found that the strength of electrical signals arriving at the cell body depends, in part, on how far they travel along the dendrite to get there. As the signals propagate, they become weaker, so a signal that arrives far from the cell body has less of an impact than one that arrives near the cell body.
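A common textbook way to picture this weakening is steady-state decay in a passive cable, where amplitude falls off exponentially with distance. The sketch below uses that approximation; it is not the biophysical model from the study, and the length constant and distances are made-up values chosen only to show the trend.

```python
import math


def attenuated_amplitude(v0_mv, distance_um, length_constant_um):
    """Steady-state passive decay: V(x) = V0 * exp(-x / lambda)."""
    return v0_mv * math.exp(-distance_um / length_constant_um)


LENGTH_CONSTANT_UM = 500.0  # assumed value, for illustration only

# The same 10 mV input arriving close to and far from the cell body.
near = attenuated_amplitude(10.0, 100.0, LENGTH_CONSTANT_UM)
far = attenuated_amplitude(10.0, 1000.0, LENGTH_CONSTANT_UM)
```

Under these assumptions the distant input reaches the cell body with a much smaller amplitude than the nearby one, which is the qualitative effect described above.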

Dendrites in the cortex of the human brain are much longer than those in rats and most other species, because the human cortex has evolved to be much thicker than that of other species. In humans, the cortex makes up about 75 percent of the total brain volume, compared to about 30 percent in the rat brain.

Although the human cortex is two to three times thicker than that of rats, it maintains the same overall organization, consisting of six distinctive layers of neurons. Neurons from layer 5 have dendrites long enough to reach all the way to layer 1, meaning that human dendrites have had to elongate as the human brain has evolved, and electrical signals have to travel that much farther.

In the new study, the MIT team wanted to investigate how these length differences might affect dendrites’ electrical properties. They were able to compare electrical activity in rat and human dendrites, using small pieces of brain tissue removed from epilepsy patients undergoing surgical removal of part of the temporal lobe. In order to reach the diseased part of the brain, surgeons also have to take out a small chunk of the anterior temporal lobe.

With the help of MGH collaborators Cash, Matthew Frosch, Ziv Williams, and Emad Eskandar, Harnett’s lab was able to obtain samples of the anterior temporal lobe, each about the size of a fingernail.

Evidence suggests that the anterior temporal lobe is not affected by epilepsy, and the tissue appears normal when examined with neuropathological techniques, Harnett says. This part of the brain appears to be involved in a variety of functions, including language and visual processing, but is not critical to any one function; patients are able to function normally after it is removed.

Once the tissue was removed, the researchers placed it in a solution very similar to cerebrospinal fluid, with oxygen flowing through it. This allowed them to keep the tissue alive for up to 48 hours. During that time, they used a technique known as patch-clamp electrophysiology to measure how electrical signals travel along dendrites of pyramidal neurons, which are the most common type of excitatory neurons in the cortex.

These experiments were performed primarily by Beaulieu-Laroche. Harnett’s lab (and others) have previously done this kind of experiment in rodent dendrites, but his team is the first to analyze electrical properties of human dendrites.

Unique features

The researchers found that because human dendrites cover longer distances, a signal flowing along a human dendrite from layer 1 to the cell body in layer 5 is much weaker when it arrives than a signal flowing along a rat dendrite from layer 1 to layer 5.

They also showed that human and rat dendrites have the same number of ion channels, which regulate the current flow, but these channels occur at a lower density in human dendrites as a result of the dendrite elongation. In addition, the researchers developed a detailed biophysical model showing that this density change can account for some of the differences in electrical activity seen between human and rat dendrites, Harnett says.

Nelson Spruston, senior director of scientific programs at the Howard Hughes Medical Institute Janelia Research Campus, described the researchers’ analysis of human dendrites as “a remarkable accomplishment.”

“These are the most carefully detailed measurements to date of the physiological properties of human neurons,” says Spruston, who was not involved in the research. “These kinds of experiments are very technically demanding, even in mice and rats, so from a technical perspective, it’s pretty amazing that they’ve done this in humans.”

The question remains, how do these differences affect human brainpower? Harnett’s hypothesis is that because of these differences, which allow more regions of a dendrite to influence the strength of an incoming signal, individual neurons can perform more complex computations on the information.

“If you have a cortical column that has a chunk of human or rodent cortex, you’re going to be able to accomplish more computations faster with the human architecture versus the rodent architecture,” he says.

There are many other differences between human neurons and those of other species, Harnett adds, making it difficult to tease out the effects of dendritic electrical properties. In future studies, he hopes to explore further the precise impact of these electrical properties, and how they interact with other unique features of human neurons to produce more computing power.

The research was funded by the Natural Sciences and Engineering Research Council of Canada, the Dana Foundation David Mahoney Neuroimaging Grant Program, and the National Institutes of Health.

Fujitsu Laboratories and MIT’s Center for Brains, Minds and Machines broaden partnership

Fujitsu Laboratories Ltd. and MIT’s Center for Brains, Minds and Machines (CBMM) have announced a multi-year philanthropic partnership focused on advancing the science and engineering of intelligence while supporting the next generation of researchers in this emerging field. The new commitment follows several years of collaborative research among scientists at the two organizations.

Founded in 1968, Fujitsu Laboratories has conducted a wide range of basic and applied research in the areas of next-generation services, computer servers, networks, electronic devices, and advanced materials. CBMM, a multi-institutional, National Science Foundation-funded science and technology center focusing on the interdisciplinary study of intelligence, was established in 2013 and is headquartered at MIT’s McGovern Institute for Brain Research. CBMM is also the foundation of “The Core” of the MIT Quest for Intelligence, launched earlier this year. The partnership between the two organizations started in March 2017 when Fujitsu Laboratories sent a visiting scientist to CBMM.

“A fundamental understanding of how humans think, feel, and make decisions is critical to developing revolutionary technologies that will have a real impact on societal problems,” said Shigeru Sasaki, CEO of Fujitsu Laboratories. “The partnership between MIT’s Center for Brains, Minds and Machines and Fujitsu Laboratories will help advance critical R&D efforts in both human intelligence and the creation of next-generation technologies that will shape our lives,” he added.

The new Fujitsu Laboratories Co-Creation Research Fund, established with a philanthropic gift from Fujitsu Laboratories, will fuel new, innovative and challenging projects in areas of interest to both Fujitsu and CBMM, including the basic study of computations underlying visual recognition and language processing, creation of new machine learning methods, and development of the theory of deep learning. Alongside funding for research projects, Fujitsu Laboratories will also fund fellowships, starting in 2019, for graduate students attending CBMM’s summer course, with the aim of contributing to the future of research and society on a long-term basis. The intensive three-week course gives advanced students from universities worldwide a “deep end” introduction to the problem of intelligence. These students will later have the opportunity to travel to Fujitsu Laboratories in Japan or its overseas locations in the U.S., Canada, U.K., Spain, and China to meet with Fujitsu researchers.

“CBMM faculty, students, and fellows are excited for the opportunity to work alongside scientists from Fujitsu to make advances in complex problems of intelligence, both real and artificial,” said CBMM’s director Tomaso Poggio, who is also an investigator at the McGovern Institute and the Eugene McDermott Professor in MIT’s Department of Brain and Cognitive Sciences. “Both Fujitsu Laboratories and MIT are committed to creating revolutionary tools and systems that will transform many industries, and to do that we are first looking to the extraordinary computations made by the human mind in everyday life.”

As part of the partnership, Poggio will be a featured keynote speaker at the Fujitsu Laboratories Advanced Technology Symposium on Oct. 9. In addition, Tomotake Sasaki, a former visiting scientist and current research affiliate in the Poggio Lab, will continue to collaborate with CBMM scientists and engineers on reinforcement learning and deep learning research projects. Moyuru Yamada, a visiting scientist in the Lab of Professor Josh Tenenbaum, is also studying the computational model of human cognition and exploring its industrial applications. Moreover, Fujitsu Laboratories is planning to invite CBMM researchers to Japan or overseas offices and arrange internships for interested students.

Model helps robots navigate more like humans do

When moving through a crowd to reach some end goal, humans can usually navigate the space safely without thinking too much. They can learn from the behavior of others and note any obstacles to avoid. Robots, on the other hand, struggle with such navigational concepts.

MIT researchers have now devised a way to help robots navigate environments more like humans do. Their novel motion-planning model lets robots determine how to reach a goal by exploring the environment, observing other agents, and exploiting what they’ve learned before in similar situations. A paper describing the model was presented at this week’s IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Popular motion-planning algorithms create a tree of possible decisions that branches out until it finds good paths for navigation. A robot that needs to navigate a room to reach a door, for instance, will create a step-by-step search tree of possible movements and then execute the best path to the door, considering various constraints. One drawback, however, is that these algorithms rarely learn: Robots can’t leverage information about how they or other agents acted previously in similar environments.

“Just like when playing chess, these decisions branch out until [the robots] find a good way to navigate. But unlike chess players, [the robots] explore what the future looks like without learning much about their environment and other agents,” says co-author Andrei Barbu, a researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Center for Brains, Minds, and Machines (CBMM) within MIT’s McGovern Institute. “The thousandth time they go through the same crowd is as complicated as the first time. They’re always exploring, rarely observing, and never using what’s happened in the past.”

The researchers developed a model that combines a planning algorithm with a neural network that learns to recognize paths that could lead to the best outcome, and uses that knowledge to guide the robot’s movement in an environment.

In their paper, “Deep sequential models for sampling-based planning,” the researchers demonstrate the advantages of their model in two settings: navigating through challenging rooms with traps and narrow passages, and navigating areas while avoiding collisions with other agents. A promising real-world application is helping autonomous cars navigate intersections, where they have to quickly evaluate what others will do before merging into traffic. The researchers are currently pursuing such applications through the Toyota-CSAIL Joint Research Center.

“When humans interact with the world, we see an object we’ve interacted with before, or are in some location we’ve been to before, so we know how we’re going to act,” says Yen-Ling Kuo, a PhD student in CSAIL and first author on the paper. “The idea behind this work is to add to the search space a machine-learning model that knows from past experience how to make planning more efficient.”

Boris Katz, a principal research scientist and head of the InfoLab Group at CSAIL, is also a co-author on the paper.

Trading off exploration and exploitation

Traditional motion planners explore an environment by rapidly expanding a tree of decisions that eventually blankets an entire space. The robot then looks at the tree to find a way to reach the goal, such as a door. The researchers’ model, however, offers “a tradeoff between exploring the world and exploiting past knowledge,” Kuo says.
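To make the tree-expansion idea concrete, here is a minimal sketch of a plain rapidly-exploring random tree in the unit square. It is a generic textbook-style version, not the researchers' planner: the function names, step size, goal radius, and the `is_free` collision callback are assumptions, and for brevity it checks only the new point for collisions rather than the whole edge.

```python
import math
import random


def nearest(tree, point):
    """Tree node closest to the sampled point."""
    return min(tree, key=lambda node: math.dist(node, point))


def steer(from_pt, to_pt, step):
    """Move from from_pt toward to_pt by at most `step`."""
    d = math.dist(from_pt, to_pt)
    if d <= step:
        return to_pt
    t = step / d
    return (from_pt[0] + t * (to_pt[0] - from_pt[0]),
            from_pt[1] + t * (to_pt[1] - from_pt[1]))


def rrt(start, goal, is_free, rng, step=0.05, goal_radius=0.05, max_iters=3000):
    """Grow a tree from start by uniform sampling; return a path of points
    ending within goal_radius of goal, or None if none is found in time."""
    parents = {start: None}
    for _ in range(max_iters):
        sample = (rng.random(), rng.random())  # pure exploration
        near = nearest(parents, sample)
        new = steer(near, sample, step)
        if new in parents or not is_free(new):
            continue
        parents[new] = near
        if math.dist(new, goal) <= goal_radius:
            path = [new]
            while parents[path[-1]] is not None:  # walk back to the start
                path.append(parents[path[-1]])
            return path[::-1]
    return None
```

A call such as `rrt((0.1, 0.1), (0.9, 0.9), lambda p: True, random.Random(0))` searches an obstacle-free square. Every sample is drawn uniformly, so nothing learned from one run carries over to the next, which is the limitation the model above is meant to address.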

The learning process starts with a few examples. A robot using the model is trained on a few ways to navigate similar environments. The neural network learns what makes these examples succeed by interpreting the environment around the robot, such as the shape of the walls, the actions of other agents, and features of the goals. In short, the model “learns that when you’re stuck in an environment, and you see a doorway, it’s probably a good idea to go through the door to get out,” Barbu says.

The model combines the exploration behavior from earlier methods with this learned information. The underlying planner, called RRT*, was developed by MIT professors Sertac Karaman and Emilio Frazzoli. (It’s a variant of a widely used motion-planning algorithm known as Rapidly-exploring Random Trees, or RRT.) The planner creates a search tree while the neural network mirrors each step and makes probabilistic predictions about where the robot should go next. When the network makes a prediction with high confidence, based on learned information, it guides the robot on a new path. If the network doesn’t have high confidence, it lets the robot explore the environment instead, like a traditional planner.
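The switch between following the network and exploring can be sketched as a simple confidence gate. Everything below is a hypothetical stand-in: `toy_network` replaces the learned model with a hand-written rule, and the threshold value is an assumption rather than a figure from the paper.

```python
import random

CONFIDENCE_THRESHOLD = 0.8  # assumed value, not taken from the paper


def toy_network(position, doorway):
    """Stand-in for the learned model: proposes heading for the doorway
    and is confident only when the doorway is close by."""
    distance = ((position[0] - doorway[0]) ** 2
                + (position[1] - doorway[1]) ** 2) ** 0.5
    confidence = 1.0 if distance < 0.3 else 0.1
    return doorway, confidence


def next_target(position, doorway, rng):
    """Follow the learned proposal when confident; otherwise fall back
    to a uniform random sample, as a traditional planner would."""
    proposal, confidence = toy_network(position, doorway)
    if confidence >= CONFIDENCE_THRESHOLD:
        return proposal, "learned"
    return (rng.random(), rng.random()), "explore"
```

With the doorway nearby, `next_target` returns the doorway itself and the label `"learned"`. With the doorway far away, it returns a random point in the unit square and the label `"explore"`.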

For example, the researchers demonstrated the model in a simulation known as a “bug trap,” where a 2-D robot must escape from an inner chamber through a central narrow channel and reach a location in a surrounding larger room. Blind alleys on either side of the channel can get robots stuck. In this simulation, the robot was trained on a few examples of how to escape different bug traps. When faced with a new trap, it recognizes features of the trap, escapes, and continues to search for its goal in the larger room. The neural network helps the robot find the exit to the trap and identify the dead ends, and gives the robot a sense of its surroundings so it can quickly find the goal.

Results in the paper are based on the chances that a path is found after some time, the total length of the path that reached a given goal, and how consistent the paths were. In both simulations, the researchers’ model more quickly plotted far shorter and more consistent paths than a traditional planner.
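One plausible way to compute measures of this kind over repeated planning runs is sketched below. The exact definitions used in the paper are not given here, so treating consistency as the spread (standard deviation) of path lengths is an assumption, as are the function names and the input format.

```python
import math
import statistics


def path_length(path):
    """Total Euclidean length of a path given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def summarize(runs):
    """Summarize planning runs, where each run is a path or None on failure."""
    lengths = [path_length(p) for p in runs if p is not None]
    return {
        "success_rate": len(lengths) / len(runs),
        "mean_length": statistics.mean(lengths) if lengths else None,
        # Assumed consistency measure: lower spread means more consistent.
        "length_stdev": statistics.pstdev(lengths) if lengths else None,
    }
```

Passing in the paths from many runs of two planners gives a side-by-side view of how often each succeeds, how long its paths are, and how much they vary from run to run.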

“This model is interesting because it allows a motion planner to adapt to what it sees in the environment,” says Stefanie Tellex, an assistant professor of computer science at Brown University, who was not involved in the research. “This can enable dramatic improvements in planning speed by customizing the planner to what the robot knows. Most planners don’t adapt to the environment at all. Being able to traverse long, narrow passages is notoriously difficult for a conventional planner, but they can solve it. We need more ways that bridge this gap.”

Working with multiple agents

In another experiment, the researchers trained and tested the model in navigating environments with multiple moving agents, which is a useful test for autonomous cars, especially when navigating intersections and roundabouts. In the simulation, several agents are circling an obstacle. A robot agent must successfully navigate around the other agents, avoid collisions, and reach a goal location, such as an exit on a roundabout.

“Situations like roundabouts are hard, because they require reasoning about how others will respond to your actions, how you will then respond to theirs, what they will do next, and so on,” Barbu says. “You eventually discover your first action was wrong, because later on it will lead to a likely accident. This problem gets exponentially worse the more cars you have to contend with.”

Results indicate that the researchers’ model can capture enough information about the future behavior of the other agents (cars) to cut off the process early, while still making good decisions in navigation. This makes planning more efficient. Moreover, they only needed to train the model on a few examples of roundabouts with only a few cars. “The plans the robots make take into account what the other cars are going to do, as any human would,” Barbu says.

Going through intersections or roundabouts is one of the most challenging scenarios facing autonomous cars. This work might one day let cars learn how humans behave and how to adapt to drivers in different environments, according to the researchers. This is the focus of the Toyota-CSAIL Joint Research Center work.

“Not everybody behaves the same way, but people are very stereotypical. There are people who are shy, people who are aggressive. The model recognizes that quickly and that’s why it can plan efficiently,” Barbu says.

More recently, the researchers have been applying this work to robots with manipulators that face similarly daunting challenges when reaching for objects in ever-changing environments.

Recognizing the partially seen

When we open our eyes in the morning and take in that first scene of the day, we don’t give much thought to the fact that our brain is processing the objects within our field of view with great efficiency and that it is compensating for a lack of information about our surroundings — all in order to allow us to go about our daily functions. The glass of water you left on the nightstand when preparing for bed is now partially blocked from your line of sight by your alarm clock, yet you know that it is a glass.

This seemingly simple ability for humans to recognize partially occluded objects (occlusion being defined here as one object in a 3-D space blocking another object from view) has been a complicated problem for the computer vision community. Martin Schrimpf, a graduate student in the DiCarlo lab in the Department of Brain and Cognitive Sciences at MIT, explains that machines have become increasingly adept at recognizing whole items quickly and confidently, but when something covers part of an item from view, it becomes increasingly difficult for the models to recognize that item accurately.

“For models from computer vision to function in everyday life, they need to be able to digest occluded objects just as well as whole ones — after all, when you look around, most objects are partially hidden behind another object,” says Schrimpf, co-author of a paper on the subject that was recently published in the Proceedings of the National Academy of Sciences (PNAS).

In the new study, he says, “we dug into the underlying computations in the brain and then used our findings to build computational models. By recapitulating visual processing in the human brain, we are thus hoping to also improve models in computer vision.”

How are we as humans able to repeatedly do this everyday task without putting much thought and energy into it, identifying whole scenes quickly and accurately after ingesting just pieces? Researchers in the study started with the human visual cortex as a model for how to improve the performance of machines in this setting, says Gabriel Kreiman, an affiliate of the MIT Center for Brains, Minds, and Machines. Kreiman is a professor of ophthalmology at Boston Children’s Hospital and Harvard Medical School and was lead principal investigator for the study.

In their paper, “Recurrent computations for visual pattern completion,” the team showed how they developed a computational model, inspired by physiological and anatomical constraints, that was able to capture the behavioral and neurophysiological observations during pattern completion. In the end, the model provided useful insights towards understanding how to make inferences from minimal information.

Work for this study was conducted at the Center for Brains, Minds and Machines within the McGovern Institute for Brain Research at MIT.