James DiCarlo

Investigator, McGovern Institute
Peter de Florez Professor and Department Head, Brain and Cognitive Sciences

Making sense of the visual world

James DiCarlo examines the complex network of brain regions that allow us to recognize vast numbers of objects rapidly and effortlessly. His current focus is on a series of successive brain areas, known as the ventral visual processing stream, that is of special importance for object recognition. DiCarlo also develops computational models of the brain with the ultimate goal of building a computer simulation of the brain's capacity for object recognition. Such models could provide insights into the sensory deficits that occur after stroke or brain injury.

Rapid recognition

We take for granted our ability to recognize vast numbers of objects rapidly and effortlessly, but this ability is based on a complex network of brain regions. DiCarlo is interested in how this remarkable system works. Our visual system enables us to tell within a fraction of a second whether, for example, a visual scene contains a dog, despite the fact that no two dogs are exactly alike and that the dog's image on the retina is constantly changing depending on its location, size, pose, and illumination. Somehow, our brains create a representation of 'dog-ness' that allows us to recognize an unfamiliar dog based on prior experiences with other dogs. We learn thousands of such categories in early childhood, and we continue to acquire them throughout life.

Using electrophysiological recordings from animals and neuroimaging techniques with animal and human subjects, DiCarlo is studying the patterns of brain activity that underlie our ability to recognize visual objects. In collaboration with McGovern colleague Nancy Kanwisher, DiCarlo has shown that the highest stage of the ventral visual stream, the inferior temporal (IT) cortex, contains clusters of neurons that respond to similar types of objects. He has also shown that the brain's ability to recognize objects under different conditions is altered by experience. As we gain experience with visual objects, the activity of IT neurons and our perception of objects change, pointing to how the ventral stream might 'learn' to represent objects in the first place. DiCarlo believes that the ventral stream transforms pixel-based images of the world into patterns of neural activity that emphasize object identity and discount potentially confusing variables such as the object's position and size.

DiCarlo uses computer-generated objects called smoothies, spikies, and cubies to test theories of visual object recognition. They are designed to be unfamiliar (they do not resemble anything in the real world) and to vary in ways that are easily measured. After practicing on one specific class, humans and monkeys become more proficient at distinguishing individual examples of that class relative to the two other classes.

Modeling the brain

In addition to his experimental studies with humans and animals, DiCarlo develops computational models of the brain in collaboration with McGovern colleague Tomaso Poggio. Together, they aim to generate predictions that can be tested by further experiments, with the ultimate goal of building a computer simulation of the brain's capacity for object recognition. DiCarlo expects that this work will also provide new insights into the sensory deficits that occur after stroke or injury, and may even point the way to new strategies, such as neural prostheses, to restore lost senses.


Jim DiCarlo joined the McGovern Institute in 2002, and he is currently the Peter de Florez Professor and head of the MIT Department of Brain and Cognitive Sciences. He received his M.D. and Ph.D. in Biomedical Engineering from Johns Hopkins University in 1998 and did his postdoctoral work at Baylor College of Medicine from 1998 to 2002. He is a past recipient of a Sloan fellowship, a Pew Scholar Award, and a McKnight Scholar Award.