DiCarlo’s research goal is to reverse engineer the brain mechanisms that underlie human visual intelligence. He and his collaborators have revealed how population image transformations carried out by a deep stack of interconnected neocortical brain areas, called the primate ventral visual stream, are effortlessly able to extract object identity from visual images. His team uses a combination of large-scale neurophysiology, brain imaging, direct neural perturbation methods, and machine learning methods to build and test neurally mechanistic computational models of the ventral visual stream and its support of cognition and behavior. Such an engineering-based understanding is likely to lead to new artificial vision and artificial intelligence approaches, new brain-machine interfaces to restore or augment lost senses, and a new foundation to ameliorate disorders of the mind.
We take for granted our ability to recognize vast numbers of objects rapidly and effortlessly, but this ability is based on a complex network of brain regions. DiCarlo is interested in how this remarkable system works. Our visual system enables us to tell within a fraction of a second whether, for example, a visual scene contains a dog, despite the fact that no two dogs are exactly alike and that the dog’s image on the retina is constantly changing depending on its location, size, pose, and illumination. Somehow, our brains create a representation of “dog-ness” that allows us to recognize an unfamiliar dog based on prior experiences with other dogs. We learn thousands of such categories in early childhood, and we continue to acquire them throughout life.
Using electrophysiological recordings from animals and neuroimaging techniques with animal and human subjects, DiCarlo is studying the patterns of brain activity that underlie our ability to recognize visual objects. In collaboration with McGovern colleague Nancy Kanwisher, DiCarlo has shown that the highest stage of the ventral visual stream, the inferior temporal (IT) cortex, contains clusters of neurons that respond to similar types of objects. He has also shown that the brain’s ability to recognize objects under different conditions is altered by experience. As we gain experience with visual objects, the activity of IT neurons and our perception of objects change, pointing to how the ventral stream might “learn” to represent objects in the first place. DiCarlo believes that the ventral stream transforms pixel-based images of the world into patterns of nerve activity that emphasize object identity and discount potentially confusing variables like the object’s position and size.
Jim DiCarlo joined the McGovern Institute in 2002, and he is currently the Peter de Florez Professor and chair of the MIT Department of Brain and Cognitive Sciences. He received his MD and PhD in Biomedical Engineering from Johns Hopkins University in 1998 and did his postdoctoral work at Baylor College of Medicine from 1998 to 2002. He is a past recipient of a Sloan fellowship, a Pew Scholar Award, and a McKnight Scholar Award.
Honors and Awards
MIT School of Science Prize for Excellence in Teaching, 2005
Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. Rajalingham, R., Issa, E.B., Bashivan, P., Kar, K., Schmidt, K., DiCarlo, J.J. (2018).
Journal of Neuroscience 38, 7255-7269.
Optogenetic and pharmacological suppression of spatial clusters of face neurons reveal their causal role in face gender discrimination. Afraz, A., Boyden, E.S., DiCarlo, J.J. (2015).
Proc Natl Acad Sci USA 112, 6730-6735.
Zhuang, C, Yan, S, Nayebi, A, Schrimpf, M, Frank, MC, DiCarlo, JJ et al. Unsupervised neural network models of the ventral visual stream. Proc Natl Acad Sci U S A. 2021;118(3). doi: 10.1073/pnas.2014196118. PubMed PMID: 33431673; PubMed Central PMCID: PMC7826371.