Ila Fiete studies how the brain performs complex computations

While doing a postdoc about 15 years ago, Ila Fiete began searching for faculty jobs in computational neuroscience — a field that uses mathematical tools to investigate brain function. However, there were no advertised positions in theoretical or computational neuroscience at that time in the United States.

“It wasn’t really a field,” she recalls. “That has changed completely, and [now] there are 15 to 20 openings advertised per year.” She ended up finding a position in the Center for Learning and Memory at the University of Texas at Austin, which, along with a small handful of universities including MIT, was open to neurobiologists with a computational background.

Computation is the cornerstone of Fiete’s research at MIT’s McGovern Institute for Brain Research, where she has been a faculty member since 2018. Using computational and mathematical techniques, she studies how the brain encodes information in ways that enable cognitive tasks such as learning, memory, and reasoning about our surroundings.

One major research area in Fiete’s lab is how the brain is able to continuously compute the body’s position in space and make constant adjustments to that estimate as we move about.

“When we walk through the world, we can close our eyes and still have a pretty good estimate of where we are,” she says. “This involves being able to update our estimate based on our sense of self-motion. There are also many computations in the brain that involve moving through abstract or mental rather than physical space, and integrating velocity signals of some variety or another. Some of the same ideas and even circuits for spatial navigation might be involved in navigating through these mental spaces.”
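
The updating Fiete describes is known as path integration, or dead reckoning: the brain accumulates self-motion signals to keep a running position estimate. Here is a minimal sketch of the idea (a toy illustration, not a model from her lab):

```python
import numpy as np

# Path integration in miniature: update a position estimate by
# accumulating noisy velocity (self-motion) signals, with no direct
# observation of position.
rng = np.random.default_rng(0)
dt = 0.01                                        # time step (s)
velocity = rng.normal(0.0, 1.0, size=(1000, 2))  # true 2D self-motion

estimate = np.zeros(2)
for v in velocity:
    sensed = v + rng.normal(0.0, 0.05, size=2)   # noisy velocity sense
    estimate += sensed * dt                      # integrate velocity

print("true position:     ", velocity.sum(axis=0) * dt)
print("estimated position:", estimate)
# Small sensing errors accumulate, which is why a purely integrated
# estimate drifts with the eyes closed and must be corrected by
# landmarks when they are available.
```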

No better fit

Fiete spent her childhood between Mumbai, India, and the United States, where her mathematician father held a series of visiting or permanent appointments at the Institute for Advanced Study in Princeton, NJ, the University of California at Berkeley, and the University of Michigan at Ann Arbor.

In India, Fiete’s father did research at the Tata Institute of Fundamental Research, and she grew up spending time with many other children of academics. She was always interested in biology, but also enjoyed math, following in her father’s footsteps.

“My father was not a hands-on parent, wanting to teach me a lot of mathematics, or even asking me about how my math schoolwork was going, but the influence was definitely there. There’s a certain aesthetic to thinking mathematically, which I absorbed very indirectly,” she says. “My parents did not push me into academics, but I couldn’t help but be influenced by the environment.”

She spent her last two years of high school in Ann Arbor and then went to the University of Michigan, where she majored in math and physics. While there, she worked on undergraduate research projects, including two summer stints at Indiana University and the University of Virginia, which gave her firsthand experience in physics research. Those projects covered a range of topics, including proton radiation therapy, the magnetic properties of single crystal materials, and low-temperature physics.

“Those three experiences are what really made me sure that I wanted to go into academics,” Fiete says. “It definitely seemed like the path that I knew the best, and I think it also best suited my temperament. Even now, with more exposure to other fields, I cannot think of a better fit.”

Although she was still interested in biology, she took only one course in the subject in college, holding back because she did not know how to marry quantitative approaches with biological sciences. She began her graduate studies at Harvard University planning to study low-temperature physics, but while there, she decided to explore quantitative classes in biology. One of those was a systems biology course taught by then-MIT professor Sebastian Seung, which transformed her career trajectory.

“It was truly inspiring,” she recalls. “Thinking mathematically about interacting systems in biology was really exciting. It was really my first introduction to systems biology, and it had me hooked immediately.”

She ended up doing most of her PhD research in Seung’s lab at MIT, where she studied how the brain uses incoming signals of the velocity of head movement to control eye position. For example, if we want to keep our gaze fixed on a particular location while our head is moving, the brain must continuously calculate and adjust the amount of tension needed in the muscles surrounding the eyes, to compensate for the movement of the head.

“Bizarre” cells

After earning her PhD, Fiete and her husband, a theoretical physicist, went to the Kavli Institute for Theoretical Physics at the University of California at Santa Barbara, where they each held fellowships for independent research. While there, Fiete began working on a research topic that she still studies today — grid cells. These cells, located in the entorhinal cortex of the brain, enable us to navigate our surroundings by helping the brain to create a neural representation of space.

Midway through her position there, she learned of a new discovery: when a rat moves across an open room, a grid cell in its brain fires at many different locations, arranged geometrically in a regular pattern of repeating triangles. Together, a population of grid cells forms a lattice of triangles representing the entire room. These cells have also been found in the brains of various other mammals, including humans.

“It’s amazing. It’s this very crystalline response,” Fiete says. “When I read about that, I fell out of my chair. At that point I knew this was something bizarre that would generate so many questions about development, function, and brain circuitry that could be studied computationally.”

One question Fiete and others have investigated is why the brain needs grid cells at all, since it also has so-called place cells that each fire in one specific location in the environment. A possible explanation that Fiete has explored is that grid cells of different scales, working together, can represent a vast number of possible positions in space and also multiple dimensions of space.

“If you have a few cells that can parsimoniously generate a very large coding space, then you can afford to not use most of that coding space,” she says. “You can afford to waste most of it, which means you can separate things out very well, in which case it becomes not so susceptible to noise.”
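
The combinatorial power Fiete alludes to resembles a residue number system: each grid module reports position only modulo its own spatial period, yet jointly the modules specify position over a range far larger than any single period. A toy sketch with invented periods:

```python
periods = (3, 4, 5, 7)          # illustrative grid-module periods (arbitrary units)

def grid_code(x):
    """Each module reports position modulo its period (its firing phase)."""
    return tuple(x % p for p in periods)

# With pairwise coprime periods, the joint code is unique up to their
# product, 3 * 4 * 5 * 7 = 420 positions, from just four phase readouts.
n_unique = len({grid_code(x) for x in range(3 * 4 * 5 * 7)})
print(n_unique)                 # 420 distinct codes before any repeat
```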

Since returning to MIT, she has also pursued a research theme related to what she explored in her PhD thesis — how the brain maintains neural representations of the direction the head is pointing. In a paper published last year, she and her colleagues showed that the brain generates a one-dimensional ring of neural activity that acts as a compass, allowing the brain to calculate the current direction of the head relative to the external world.

Her lab also studies cognitive flexibility — the brain’s ability to perform so many different types of cognitive tasks.

“How it is that we can repurpose the same circuits and flexibly use them to solve many different problems, and what are the neural codes that are amenable to that kind of reuse?” she says. “We’re also investigating the principles that allow the brain to hook multiple circuits together to solve new problems without a lot of reconfiguration.”

Looking into the black box of deep learning networks

Deep learning systems are revolutionizing technology around us, from the voice recognition on your phone to autonomous vehicles that are increasingly able to see and recognize obstacles ahead. But much of this success involves trial and error when it comes to the deep learning networks themselves. A group of MIT researchers recently reviewed their contributions to a better theoretical understanding of deep learning networks, providing direction for the field moving forward.

“Deep learning was in some ways an accidental discovery,” explains Tomaso Poggio, investigator at the McGovern Institute for Brain Research, director of the Center for Brains, Minds, and Machines (CBMM), and the Eugene McDermott Professor in Brain and Cognitive Sciences. “We still do not understand why it works. A theoretical framework is taking form, and I believe that we are now close to a satisfactory theory. It is time to stand back and review recent insights.”

Climbing data mountains

Our current era is marked by a superabundance of data — data from inexpensive sensors of all types, text, the internet, and large amounts of genomic data being generated in the life sciences. Computers nowadays ingest these multidimensional datasets, creating a set of problems dubbed the “curse of dimensionality” by the late mathematician Richard Bellman.

One of these problems is that representing a smooth, high-dimensional function requires an astronomically large number of parameters. We know that deep neural networks are particularly good at learning how to represent, or approximate, such complex data, but why? Understanding why could potentially help advance deep learning applications.
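
A quick back-of-the-envelope calculation shows how fast the problem grows. Grid-sampling a function on the unit cube [0, 1]^d with a mere 10 points per axis requires 10^d samples, and generic approximation guarantees for smooth functions scale the same way:

```python
# The "curse of dimensionality" in one loop: grid-sampling a function
# with 10 points per axis needs 10**d samples in d dimensions.
for d in (1, 2, 10, 100):
    print(f"d = {d:>3}: 10**{d} = {10**d:.3g} grid samples")
# d =   1: 10**1 = 10 grid samples
# d =   2: 10**2 = 100 grid samples
# d =  10: 10**10 = 1e+10 grid samples
# d = 100: 10**100 = 1e+100 grid samples
```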

“Deep learning is like electricity after Volta discovered the battery, but before Maxwell,” explains Poggio.

“Useful applications were certainly possible after Volta, but it was Maxwell’s theory of electromagnetism, this deeper understanding that then opened the way to the radio, the TV, the radar, the transistor, the computers, and the internet,” says Poggio, who is the founding scientific advisor of The Core, MIT Quest for Intelligence, and an investigator in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

The theoretical treatment by Poggio, Andrzej Banburski, and Qianli Liao points to why deep learning might overcome data problems such as “the curse of dimensionality.” Their approach starts with the observation that many natural structures are hierarchical. To model the growth and development of a tree doesn’t require that we specify the location of every twig. Instead, a model can use local rules to drive branching hierarchically. The primate visual system appears to do something similar when processing complex data. When we look at natural images — including trees, cats, and faces — the brain successively integrates local image patches, then small collections of patches, and then collections of collections of patches.

“The physical world is compositional — in other words, composed of many local physical interactions,” explains Qianli Liao, an author of the study, and a graduate student in the Department of Electrical Engineering and Computer Science and a member of the CBMM. “This goes beyond images. Language and our thoughts are compositional, and even our nervous system is compositional in terms of how neurons connect with each other. Our review explains theoretically why deep networks are so good at representing this complexity.”

The intuition is that a hierarchical neural network should be better at approximating a compositional function than a single “layer” of neurons, even if the total number of neurons is the same. The technical part of their work identifies what “better at approximating” means and proves that the intuition is correct.
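
A compositional function is one built from a hierarchy of local constituent functions, each touching only a few variables. The toy example below (illustrative, not drawn from the paper) composes a function of eight inputs from two-argument pieces; a deep network can mirror this tree with one unit per piece, whereas a shallow one must treat it as an unstructured function of all eight inputs at once:

```python
def h(a, b):
    """A generic local constituent function of just two variables."""
    return (a * b + a + b) % 1.0

def f(x):
    """Hierarchical composition: pair up values and apply h level by level,
    e.g. f(x1..x8) = h(h(h(x1,x2), h(x3,x4)), h(h(x5,x6), h(x7,x8)))."""
    layer = list(x)
    while len(layer) > 1:
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

print(f([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]))
```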

Generalization puzzle

There is a second puzzle about what is sometimes called the unreasonable effectiveness of deep networks. Deep network models often have far more parameters than data to fit them, despite the mountains of data we produce these days. This situation ought to lead to what is called “overfitting,” where your current data fit the model well, but any new data fit the model terribly. This is dubbed poor generalization in conventional models. The conventional solution is to constrain some aspect of the fitting procedure. However, deep networks do not seem to require this constraint. Poggio and his colleagues prove that, in many cases, the process of training a deep network implicitly “regularizes” the solution, providing constraints.
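
The flavor of such implicit regularization can be seen even in a linear stand-in for a deep network. With more parameters than data points, plain gradient descent started from zero does not converge to an arbitrary interpolating solution but to the minimum-norm one, a constraint nobody imposed explicitly (a hedged illustration of the phenomenon, not the paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))        # 10 data points, 50 parameters
y = rng.normal(size=10)

w = np.zeros(50)                     # start at zero
for _ in range(50000):               # plain gradient descent on least squares
    w -= 0.005 * X.T @ (X @ w - y)

w_min_norm = np.linalg.pinv(X) @ y   # the minimum-norm interpolator
print(np.allclose(w, w_min_norm, atol=1e-4))   # True: an implicit constraint
```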

The work has a number of implications going forward. Though deep learning is actively being applied in the world, this has so far occurred without a comprehensive underlying theory. A theory of deep learning that explains why and how deep networks work, and what their limitations are, will likely allow the development of even more powerful learning approaches.

“In the long term, the ability to develop and build better intelligent machines will be essential to any technology-based economy,” explains Poggio. “After all, even in its current — still highly imperfect — state, deep learning is impacting, or about to impact, just about every aspect of our society and life.”

Nine MIT School of Science professors receive tenure for 2020

Beginning July 1, nine faculty members in the MIT School of Science have been granted tenure by MIT. They are appointed in the departments of Brain and Cognitive Sciences, Chemistry, Mathematics, and Physics.

Physicist Ibrahim Cisse investigates living cells to reveal and study collective behaviors and biomolecular phase transitions at the resolution of single molecules. The results of his work help determine how disruptions in genes can cause diseases like cancer. Cisse joined the Department of Physics in 2014 and now holds a joint appointment with the Department of Biology. He earned a bachelor’s degree in physics from North Carolina Central University in 2004 and a doctoral degree in physics from the University of Illinois at Urbana-Champaign in 2009. He followed his PhD with a postdoc at the École Normale Supérieure of Paris and a research specialist appointment at the Howard Hughes Medical Institute’s Janelia Research Campus.

Jörn Dunkel is a physical applied mathematician. His research focuses on the mathematical description of complex nonlinear phenomena in a variety of fields, especially biophysics. The models he develops help predict dynamical behaviors and structure formation processes in developmental biology, fluid dynamics, and even knot strengths for sailing, rock climbing and construction. He joined the Department of Mathematics in 2013 after completing postdoctoral appointments at Oxford University and Cambridge University. He received diplomas in physics and mathematics from Humboldt University of Berlin in 2004 and 2005, respectively. The University of Augsburg awarded Dunkel a PhD in statistical physics in 2008.

A cognitive neuroscientist, Mehrdad Jazayeri studies the neurobiological underpinnings of mental functions such as planning, inference, and learning by analyzing brain signals in the lab and using theoretical and computational models, including artificial neural networks. He joined the Department of Brain and Cognitive Sciences in 2013. He achieved a BS in electrical engineering from the Sharif University of Technology in 1994, an MS in physiology at the University of Toronto in 2001, and a PhD in neuroscience from New York University in 2007. Prior to joining MIT, he was a postdoc at the University of Washington. Jazayeri is also an investigator at the McGovern Institute for Brain Research.

Yen-Jie Lee is an experimental particle physicist in the field of proton-proton and heavy-ion physics. Utilizing the Large Hadron Collider, Lee explores matter in extreme conditions, providing new insight into strong interactions and what might have existed and occurred at the beginning of the universe and in distant star cores. His work on jets and heavy-flavor particle production in collisions of nuclei improves understanding of the quark-gluon plasma, predicted by quantum chromodynamics (QCD) calculations, and the structure of heavy nuclei. He also pioneered studies of high-density QCD with electron-positron annihilation data. Lee joined the Department of Physics in 2013 after a fellowship at CERN and postdoc research at the Laboratory for Nuclear Science at MIT. His bachelor’s and master’s degrees were awarded by the National Taiwan University in 2002 and 2004, respectively, and his doctoral degree by MIT in 2011. Lee is a member of the Laboratory for Nuclear Science.

Josh McDermott investigates the sense of hearing. His research addresses both human and machine audition using tools from experimental psychology, engineering, and neuroscience. McDermott hopes to better understand the neural computation underlying human hearing, to improve devices that assist the hearing impaired, and to enhance machine interpretation of sounds. Prior to joining MIT’s Department of Brain and Cognitive Sciences, he was awarded a BA in 1998 in brain and cognitive sciences by Harvard University, a master’s degree in computational neuroscience in 2000 by University College London, and a PhD in brain and cognitive sciences in 2006 by MIT. Between his doctoral time at MIT and returning as a faculty member, he was a postdoc at the University of Minnesota and New York University, and a visiting scientist at Oxford University. McDermott is also an associate investigator at the McGovern Institute for Brain Research and an investigator in the Center for Brains, Minds and Machines.

Solving environmental challenges by studying and manipulating chemical reactions is the focus of Yogesh Surendranath’s research. Using chemistry, he works at the molecular level to understand how to efficiently interconvert chemical and electrical energy. His fundamental studies aim to improve energy storage technologies, such as batteries, fuel cells, and electrolyzers, that can be used to meet future energy demand with reduced carbon emissions. Surendranath joined the Department of Chemistry in 2013 after a postdoc at the University of California at Berkeley. He completed his PhD at MIT in 2011 and his BS at the University of Virginia in 2006. Surendranath is also a collaborator in the MIT Energy Initiative.

A theoretical astrophysicist, Mark Vogelsberger is interested in the large-scale structure of the universe, including galaxy formation. He combines observational data, theoretical models, and simulations that require high-performance supercomputers to improve and develop detailed models that simulate galaxy diversity, clustering, and their properties, including a plethora of physical effects like magnetic fields, cosmic dust, and thermal conduction. Vogelsberger also uses simulations to generate scenarios involving alternative forms of dark matter. He joined the Department of Physics in 2014 after a postdoc at the Harvard-Smithsonian Center for Astrophysics. Vogelsberger is a 2006 graduate of the University of Mainz undergraduate program in physics, and a 2010 doctoral graduate of the University of Munich and the Max Planck Institute for Astrophysics. He is also a principal investigator in the MIT Kavli Institute for Astrophysics and Space Research.

Adam Willard is a theoretical chemist with research interests that fall across molecular biology, renewable energy, and material science. He uses theory, modeling, and molecular simulation to study the disorder that is inherent to systems over nanometer-length scales. His recent work has highlighted the fundamental and unexpected role that such disorder plays in phenomena such as microscopic energy transport in semiconducting plastics, ion transport in batteries, and protein hydration. Joining the Department of Chemistry in 2013, Willard was formerly a postdoc at Lawrence Berkeley National Laboratory and then the University of Texas at Austin. He holds a PhD in chemistry from the University of California at Berkeley, achieved in 2009, and a BS in chemistry and mathematics from the University of Puget Sound, granted in 2003.

Lindley Winslow seeks to understand the fundamental particles that shaped the evolution of our universe. As an experimental particle and nuclear physicist, she develops novel detection technology to search for axion dark matter and a proposed nuclear decay that makes more matter than antimatter. She started her faculty position in the Department of Physics in 2015 following a postdoc at MIT and a subsequent faculty position at the University of California at Los Angeles. Winslow earned her BA in physics and astronomy in 2001 and her PhD in physics in 2008, both at the University of California at Berkeley. She is also a member of the Laboratory for Nuclear Science.

Family members unite to fight COVID-19

Even before MIT sent out its first official announcement about the COVID-19 crisis, I had already asked permission from my supervisor and taken my computer home so that I could start working from home.

My first and foremost concern was my family and friends. I was born and brought up in India, and then immigrated to Canada, so I have a big and wonderful family spread across both those countries. These countries had a lower number of COVID-19 cases at the time, but I could see what would be coming their way. I was anxious, very anxious. In India, my dad, an anesthetist, could be exposed while working in the hospital. In Canada, my uncle, who is a physician, could be exposed, and on top of that he lives in the same house as my grandparents, who are even more vulnerable due to their age. I knew I had to do something.

We started having regular video calls as a family. My mom even led daily online yoga sessions, and the discussions that followed those sessions ensured that we didn’t feel lonely and gave us a sense of purpose. Together, we looked at the statistics in the data from China and Italy, and learned that we needed to flatten the curve due to the lack of medical resources required to meet the need of the hour. We could foresee that more infections would lead to more patients, thus raising the demand for medical resources beyond the amount we had available.

We had several discussions around developing products for helping medical professionals and the general public during this pandemic.

We learned that since no government has enough resources to cope during a pandemic, we have to be innovative in trying to make the best use of the limited resources available to us.

Through our discussions, and the experiences some of us have had in the field, we came to the conclusion that the only way to effectively fight COVID-19 is prevention at the source. Hence, we started working on a mobile app that uses AI and advanced data analytics to trace contacts, determine the risk of infection, and thereby suggest precautions. Luckily we have engineers and computer scientists in our family (my own background is in electrical engineering), so it was easy for us to divide the work. In our prototype, when people sign up, they are asked to fill out a short self-assessment form that can be used to identify any symptoms of COVID-19. This data is then used to predict vulnerable areas and to give recommendations to people who might have taken a certain route, as shown below.

Sharma’s mobile app showing heatmap of the vulnerable areas in a locality in Toronto, ON (left) and personalized recommendations based on the most recent route taken by an individual (right).
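
To give a flavor of the idea, a symptom self-assessment can be reduced to a simple weighted score. The weights below are invented for illustration; they are not the app's actual model, which is not described in detail here:

```python
# Hypothetical self-assessment scoring; the weights are made up for
# illustration and are not from the actual app.
SYMPTOM_WEIGHTS = {"fever": 2.0, "dry_cough": 1.5,
                   "loss_of_smell": 2.5, "fatigue": 0.5}

def risk_score(symptoms, contact_with_case=False):
    """Crude additive risk score from a self-assessment form."""
    score = sum(SYMPTOM_WEIGHTS[s] for s in symptoms)
    if contact_with_case:
        score += 3.0          # recent contact outweighs any single symptom
    return score

print(risk_score({"fever", "loss_of_smell"}, contact_with_case=True))  # 7.5
```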

We ended up submitting our proposal and prototype to the COVID-19 challenge launched by Vale (a global mining company) and the winners will be announced in May.

Personally, to be completely honest, I had moments when I broke down because of everything that was going on in the world around me. It’s not easy to see people dying and losing jobs. My way of staying strong was to make sure that I was doing my best to contribute.

I have set up a beautiful home office for myself and I am focusing on my PhD research, being grateful that I can still continue to do it from home. I have also restarted the joint MIT-Harvard computational neuroscience journal club meetings online, so that members can get access to this wonderful community once again! It was amazing to see from a poll we conducted that 92% of the members of the club wanted the meetings to be restarted online.

These times are unprecedented for my generation, my mom’s generation and even for my grandmother’s generation. I have never seen the world come together in a way I have seen during this pandemic. The kind of response we have seen from our societies and governments across the globe shows that we can make intelligent decisions for the collective good of humanity. For once, we’re all on the same side!


Sugandha (Su) Sharma is a graduate student in the labs of Ila Fiete and Josh Tenenbaum. When she’s not developing a mobile app to fight COVID-19, Su explores the computational and theoretical principles underlying higher level cognition and intelligence in the human brain.

#WeAreMcGovern

McGovern lab manager creates art inspired by science

Michal De-Medonsa, technical associate and manager of the Jazayeri lab, created a large wood mosaic for her lab. We asked Michal to tell us a bit about the mosaic, her inspiration, and how in the world she found the time to create such an exquisitely detailed piece of art.

______

Jazayeri lab manager Michal De-Medonsa holds her wood mosaic entitled “JazLab.” Photo: Caitlin Cunningham

Describe this piece of art for us.

To make a piece this big (63″ x 15″), I needed several boards of padauk wood. I could have just etched each board as a whole unit and glued the 13 or so boards to each other, but I didn’t like the aesthetic. The grain and color within each board would look beautiful, but the line between each board would become obvious, segmented, and jarring when contrasted with the uniformity within each board. Instead, I cut about 18 separate squares out of each board, shuffled all 217 pieces around, and glued them to one another in a mosaic style with a larger pattern (inspired by my grandfather’s work in granite mosaics).

What does this mosaic mean to you?

Once every piece was shuffled, the lines between single squares were certainly visible, but as a feature, they were far less salient than if the full boards had been glued to one another. As I was working on the piece, I was thinking about how the same concept holds true in society. Even if there is diversity within a larger piece (an institution, for example), there is a tendency for groups to form within the larger piece (like a full board), and diversity becomes separated. This isn’t a criticism of any institution; it is human nature to form in-groups. It’s subconscious (so perhaps the criticism is that we, as a society, don’t give that behavior enough thought and try to ameliorate our reflex to group with those who are “like us”). The grain of the wood is uniform and oriented in the same direction, the two different cutting patterns create a larger pattern within the piece, and there are smaller patterns between and within single pieces. I love creating and finding patterns in my art (and life). Alfred North Whitehead wrote that “understanding is the apperception of pattern as such.” True, I believe, in science, art, and the humanities. What a great goal – to understand.

Tell us about the name of this piece.

Every large piece I make is inspired by the people I make it for, and is therefore named after them. This piece is called JazLab. Having lived around the world, and being a descendant of a nomadic people, I don’t consider any one place home, but am inspired by every place I’ve lived. In all of my work, you can see elements of my Jewish heritage, antiquity, the Middle East, Africa, and now MIT.

How has MIT influenced your art?

MIT has influenced me in the most obvious way MIT could influence anyone – technology. Before this series, I made very small versions of this type of work, designing everything on a piece of paper with a pencil and a ruler, and making every cut by hand. Each of those small squares would take ~2 hours (depending on the design), and I was limited to softer woods.

Since coming to MIT, I learned that I had access to the Hobby Shop, with its huge array of power tools and software. I began designing my patterns on the computer and using power tools to make the cuts. I actually struggled a lot with using the tech – not because it was hard (which it really is when you’re just starting out), but rather because it felt like I was somehow “cheating.” How is this still art? And although this is something I still think about often, I’ve tried to look at it this way: every generation, in its time, used the most advanced technology. The beauty and value of a piece doesn’t come from how many bruises, cuts, and blisters your machinery gave you, or whether you scraped the wood out with your nails, but rather, once you were given a tool, what did you decide to do with it? My pieces still involve a huge amount of hand-on-material work, but I am working on accepting that using technology in no way devalues the work.

Given your busy schedule with the Jazayeri lab, how did you find the time to create this piece of art?

I took advantage of any free hour I could. Two days a week, the Hobby Shop is open until 9pm, and I would additionally go every Saturday. For the parts that didn’t require the shop (adjusting each piece individually with a carving knife, assembling them, even most of the gluing) I would just work at home – often very late into the night.

______

JazLab is on display in the Jazayeri lab in MIT Bldg 46.

Tidying up deep neural networks

Visual art has found many ways of representing objects, from the ornate Baroque period to modernist simplicity. Artificial visual systems are somewhat analogous: from relatively simple beginnings inspired by key regions in the visual cortex, recent advances in performance have seen increasing complexity.

“Our overall goal has been to build an accurate, engineering-level model of the visual system, to ‘reverse engineer’ visual intelligence,” explains James DiCarlo, the head of MIT’s Department of Brain and Cognitive Sciences, an investigator in the McGovern Institute for Brain Research and the Center for Brains, Minds, and Machines (CBMM). “But very high-performing ANNs [artificial neural networks] have started to drift away from brain architecture, with complex branching architectures that have no clear parallel in the brain.”

A new model from the DiCarlo lab has re-imposed a brain-like architecture on an object recognition network. The result is a shallow-network architecture with surprisingly high performance, indicating that we can simplify deeper, more baroque networks yet retain high performance in artificial learning systems.

“We’ve made two major advances,” explains graduate student Martin Schrimpf, who led the work with Jonas Kubilius at CBMM. “We’ve found a way of checking how well models match the brain, called Brain-Score, and developed a model, CORnet, that moves artificial object recognition, as well as machine learning architectures, forward.”

DiCarlo lab graduate student Martin Schrimpf in the lab. Photo: Kris Brewer

Back to the brain

Deep convolutional artificial neural networks were initially inspired by brain anatomy, and are the leading models in artificial object recognition. Training these feedforward systems to recognize objects in ImageNet, a large database of images, has allowed the performance of ANNs to vastly improve, but at the same time the networks have literally branched out, becoming increasingly complex with hundreds of layers. In contrast, the visual ventral stream, a series of cortical brain regions that unpack object identity, contains a relatively minuscule four key regions. In addition, these ANNs are entirely feedforward, while the primate cortical visual system has densely interconnected wiring, in other words, recurrent connectivity. While primate-like object recognition capabilities can be captured through feedforward-only networks, recurrent wiring in the brain has long been suspected, and was recently shown to be important in two DiCarlo lab papers led by Kar and Tang, respectively.

DiCarlo and colleagues have now developed CORnet-S, inspired by very complex, state-of-the-art neural networks. CORnet-S has four computational areas, analogous to cortical visual areas (V1, V2, V4, and IT). In addition, CORnet-S contains repeated, or recurrent, connections.

“We really pre-defined layers in the ANN, defining V1, V2, and so on, and introduced feedback and repeated connections,” explains Schrimpf. “As a result, we ended up with fewer layers, and less ‘dead space’ that cannot be mapped to the brain. In short, a simpler network.”
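
In code, the idea reads roughly like the sketch below: four convolutional “areas” standing in for V1, V2, V4, and IT, each re-applying its own weights several times as unrolled recurrence. The layer sizes and repeat counts here are illustrative, not the published CORnet-S configuration (the real model is released by the DiCarlo lab):

```python
import torch
import torch.nn as nn

class RecurrentArea(nn.Module):
    """One cortical 'area': a conv whose weights are re-applied `times`
    times (unrolled recurrence), followed by spatial downsampling."""
    def __init__(self, in_ch, out_ch, times):
        super().__init__()
        self.input = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.recur = nn.Conv2d(out_ch, out_ch, 3, padding=1)  # shared weights
        self.pool = nn.MaxPool2d(2)
        self.times = times

    def forward(self, x):
        x = torch.relu(self.input(x))
        for _ in range(self.times):             # recurrence, unrolled in time
            x = torch.relu(self.recur(x) + x)
        return self.pool(x)

class CORnetLike(nn.Module):
    """A four-area, brain-mapped network in the spirit of CORnet-S."""
    def __init__(self, n_classes=1000):
        super().__init__()
        self.V1 = RecurrentArea(3, 64, times=1)
        self.V2 = RecurrentArea(64, 128, times=2)
        self.V4 = RecurrentArea(128, 256, times=4)
        self.IT = RecurrentArea(256, 512, times=2)
        self.decoder = nn.Linear(512, n_classes)

    def forward(self, x):
        x = self.IT(self.V4(self.V2(self.V1(x))))
        return self.decoder(x.mean(dim=(2, 3)))  # global average pool, then read out

model = CORnetLike()
print(model(torch.randn(1, 3, 64, 64)).shape)    # torch.Size([1, 1000])
```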

Keeping score

To optimize the system, the researchers incorporated quantitative assessment through a new system, Brain-Score.

“Until now, we’ve needed to qualitatively eyeball model performance relative to the brain,” says Schrimpf. “Brain-Score allows us to actually quantitatively evaluate and benchmark models.”
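
One core ingredient of such a benchmark is neural predictivity: fit a linear mapping from model activations to recorded neural responses, then score how well the mapping predicts held-out responses. Below is a hedged, simplified sketch on synthetic data (the actual benchmark uses cross-validated regression over multiple neural and behavioral datasets):

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 50))                # model activations: 200 images x 50 units
weights_true = rng.normal(size=(50, 10))
responses = features @ weights_true                  # synthetic "recordings": 10 neurons
responses += 2.0 * rng.normal(size=responses.shape)  # measurement noise

train, test = slice(0, 150), slice(150, 200)
W, *_ = np.linalg.lstsq(features[train], responses[train], rcond=None)
predicted = features[test] @ W                       # predict held-out responses

score = np.mean([np.corrcoef(predicted[:, i], responses[test][:, i])[0, 1]
                 for i in range(responses.shape[1])])
print(f"neural predictivity (mean r): {score:.2f}")
```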

They found that CORnet-S ranks highly on Brain-Score, and is the best performer of all shallow ANNs. Indeed, the system, shallow as it is, rivals the complex, ultra-deep ANNs that currently perform at the highest level.

CORnet was also benchmarked against human performance. To test, for example, whether the system can predict human behavior, 1,472 humans were shown images for 100 milliseconds and then asked to identify objects in them. CORnet-S was able to predict the general accuracy of humans in making calls about what they had briefly glimpsed (bear vs. dog, etc.). Indeed, CORnet-S is able to predict the behavior, as well as the neural dynamics, of the visual ventral stream, indicating that it is modeling primate-like behavior.

“We thought we’d lose performance by going to a wide, shallow network, but with recurrence, we hardly lost any,” says Schrimpf. “The message for machine learning more broadly is that you can get away without really deep networks.”

Such models of brain processing have benefits for both neuroscience and artificial systems, helping us to understand the elements of image processing by the brain. Neuroscience in turn informs us that features such as recurrence can be used to improve performance in shallow networks, an important message for artificial intelligence systems more broadly.

“There are clear advantages to the high performing, complex deep networks,” explains DiCarlo, “but it’s possible to rein the network in, using the elegance of the primate brain as a model, and we think this will ultimately lead to other kinds of advantages.”

Differences between deep neural networks and human perception

When your mother calls your name, you know it’s her voice — no matter the volume, even over a poor cell phone connection. And when you see her face, you know it’s hers — if she is far away, if the lighting is poor, or if you are on a bad FaceTime call. This robustness to variation is a hallmark of human perception. On the other hand, we are susceptible to illusions: We might fail to distinguish between sounds or images that are, in fact, different. Scientists have explained many of these illusions, but we lack a full understanding of the invariances in our auditory and visual systems.

Deep neural networks also have performed speech recognition and image classification tasks with impressive robustness to variations in the auditory or visual stimuli. But are the invariances learned by these models similar to the invariances learned by human perceptual systems? A group of MIT researchers has discovered that they are different. They presented their findings yesterday at the 2019 Conference on Neural Information Processing Systems.

The researchers made a novel generalization of a classical concept: “metamers” — physically distinct stimuli that generate the same perceptual effect. The most famous examples of metamer stimuli arise because most people have three different types of cones in their retinae, which are responsible for color vision. The perceived color of any single wavelength of light can be matched exactly by a particular combination of three lights of different colors — for example, red, green, and blue lights. Nineteenth-century scientists inferred from this observation that humans have three different types of bright-light detectors in our eyes. This is the basis for electronic color displays on all of the screens we stare at every day. Another example in the visual system is that when we fix our gaze on an object, we may perceive surrounding visual scenes that differ at the periphery as identical. In the auditory domain, something analogous can be observed. For example, the “textural” sound of two swarms of insects might be indistinguishable, despite differing in the acoustic details that compose them, because they have similar aggregate statistical properties. In each case, the metamers provide insight into the mechanisms of perception, and constrain models of the human visual or auditory systems.

In the current work, the researchers randomly chose natural images and sound clips of spoken words from standard databases, and then synthesized sounds and images so that deep neural networks would sort them into the same classes as their natural counterparts. That is, they generated physically distinct stimuli that are classified identically by models, rather than by humans. This is a new way to think about metamers, generalizing the concept to swap the role of computer models for human perceivers. They therefore called these synthesized stimuli “model metamers” of the paired natural stimuli. The researchers then tested whether humans could identify the words and images.
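
Concretely, a model metamer can be synthesized by gradient descent: start from noise and optimize the input until a chosen stage of the network responds to it as it does to a natural reference stimulus. The sketch below uses a tiny untrained network as a stand-in for the trained speech and image models in the study:

```python
import torch
import torch.nn as nn

# A stand-in network; the study used trained audio and image networks.
net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU())
for p in net.parameters():
    p.requires_grad_(False)                      # only the input is optimized

reference = torch.rand(1, 1, 32, 32)             # stand-in "natural" stimulus
target = net(reference)                          # activations to reproduce

metamer = torch.randn(1, 1, 32, 32, requires_grad=True)
opt = torch.optim.Adam([metamer], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(metamer) - target) ** 2).mean() # match the layer's activations
    loss.backward()
    opt.step()

# "metamer" now drives this stage of the model the way "reference" does,
# yet can look nothing like it: the match exists only inside the model.
print(f"final activation mismatch: {loss.item():.2e}")
```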

“Participants heard a short segment of speech and had to identify from a list of words which word was in the middle of the clip. For the natural audio this task is easy, but for many of the model metamers humans had a hard time recognizing the sound,” explains first author Jenelle Feather, a graduate student in the MIT Department of Brain and Cognitive Sciences (BCS) and a member of the Center for Brains, Minds, and Machines (CBMM). That is, humans would not put the synthetic stimuli in the same class as the spoken word “bird” or the image of a bird. In fact, model metamers generated to match the responses of the deepest layers of the model were generally unrecognizable as words or images by human subjects.

Josh McDermott, associate professor in BCS and investigator in CBMM, makes the following case: “The basic logic is that if we have a good model of human perception, say of speech recognition, then if we pick two sounds that the model says are the same and present these two sounds to a human listener, that human should also say that the two sounds are the same. If the human listener instead perceives the stimuli to be different, this is a clear indication that the representations in our model do not match those of human perception.”

Joining Feather and McDermott on the paper are Alex Durango, a post-baccalaureate student, and Ray Gonzalez, a research assistant, both in BCS.

There is another type of failure of deep networks that has received a lot of attention in the media: adversarial examples (see, for example, “Why did my classifier just mistake a turtle for a rifle?“). These are stimuli that appear similar to humans but are misclassified by a model network (by design — they are constructed to be misclassified). They are complementary to the stimuli generated by Feather’s group, which sound or appear different to humans but are designed to be co-classified by the model network. The vulnerabilities of model networks exposed to adversarial attacks are well-known — face-recognition software might mistake identities; automated vehicles might not recognize pedestrians.

The importance of this work lies in improving models of perception beyond deep networks. Although the standard adversarial examples indicate differences between deep networks and human perceptual systems, the new stimuli generated by the McDermott group arguably represent a more fundamental model failure — they show that generic examples of stimuli classified as the same by a deep network produce wildly different percepts for humans.

The team also figured out ways to modify the model networks to yield metamers that were more plausible sounds and images to humans. As McDermott says, “This gives us hope that we may be able to eventually develop models that pass the metamer test and better capture human invariances.”

“Model metamers demonstrate a significant failure of present-day neural networks to match the invariances in the human visual and auditory systems,” says Feather. “We hope that this work will provide a useful behavioral measuring stick to improve model representations and create better models of human sensory systems.”

Brain science in the Bolivian rainforest

Graduate student Malinda McPherson. Photo: Caitlin Cunningham

Malinda McPherson is a graduate student in Josh McDermott‘s lab, studying how people hear pitch (how high or low a sound is) in both speech and music.

To test the extent to which human audition varies across cultures, McPherson travels with the McDermott lab to Bolivia to study the Tsimane’ — a native Amazonian society with minimal exposure to Western culture.

Their most recent study, published in the journal Current Biology, found a striking variation in perception of musical pitch across cultures.

In this Q&A, we ask McPherson what motivates her research and to describe some of the challenges she has experienced working in the Bolivian rainforest. 

What are you working on now?

Right now, I’m particularly excited about a project that involves working with children; we are trying to better understand how the ability to hear pitch develops with age and experience. Difficulty hearing pitch is one of the first issues that most people with poor or corrected hearing find discouraging, so in addition to simply being an interesting basic component of audition, understanding how pitch perception develops may be useful in engineering assistive hearing devices.

How has your personal background inspired your research?

I’ve been an avid violist for over twenty years and still perform with the Chamber Music Society at MIT. When I was an undergraduate and deciding between a career as a professional musician and a career in science, I found a way to merge the two by working as a research assistant in a lab studying musical creativity. I worked in that lab for three years and was completely hooked. My musical training has definitely helped me design a few experiments!

What was your most challenging experience in Bolivia?  Most rewarding?

The most challenging aspect of our fieldwork in Bolivia is sustaining our intensity over a period of 4-5 weeks.  Every moment is precious, and the pace of work is both exhilarating and exhausting. Despite the long hours of work and travel (by canoe or by truck over very bumpy roads), it is an incredible privilege to meet with and to learn from the Tsimane’. I’ve been picking up some Tsimane’ phrases from the translators with whom we work, and can now have basic conversations with participants and make kids laugh, so that’s a lot of fun. A few children I met my first year greeted me by name when we went back this past year. That was a very special moment!

Translator Manuel Roca Moye (left) with Malinda McPherson and Josh McDermott in a fully loaded canoe. Photo: McDermott lab

What single scientific question do you hope to answer?

I’d be curious to figure out the overlaps and distinctions between how we perceive music versus speech, but I think one of the best aspects of science is that many of the important future questions haven’t been thought of yet!

Perception of musical pitch varies across cultures

People who are accustomed to listening to Western music, which is based on a system of notes organized in octaves, can usually perceive the similarity between notes that are the same but played in different registers — say, high C and middle C. However, a longstanding question is whether this is a universal phenomenon or one that has been ingrained by musical exposure.

This question has been hard to answer, in part because of the difficulty in finding people who have not been exposed to Western music. Now, a new study led by researchers from MIT and the Max Planck Institute for Empirical Aesthetics has found that unlike residents of the United States, people living in a remote area of the Bolivian rainforest usually do not perceive the similarities between two versions of the same note played at different registers (high or low).

The findings suggest that although there is a natural mathematical relationship between the frequencies of every “C,” no matter what octave it’s played in, the brain only becomes attuned to those similarities after hearing music based on octaves, says Josh McDermott, an associate professor in MIT’s Department of Brain and Cognitive Sciences.

“It may well be that there is a biological predisposition to favor octave relationships, but it doesn’t seem to be realized unless you are exposed to music in an octave-based system,” says McDermott, who is also a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds and Machines.

The study also found that members of the Bolivian tribe, known as the Tsimane’, and Westerners do have a very similar upper limit on the frequency of notes that they can accurately distinguish, suggesting that that aspect of pitch perception may be independent of musical experience and biologically determined.

McDermott is the senior author of the study, which appears in the journal Current Biology on Sept. 19. Nori Jacoby, a former MIT postdoc who is now a group leader at the Max Planck Institute for Empirical Aesthetics, is the paper’s lead author. Other authors are Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile; Malinda McPherson, a graduate student in the Harvard/MIT Program in Speech and Hearing Bioscience and Technology; Joaquin Valdes, a graduate student at the Pontifical Catholic University of Chile; and Tomas Ossandon, an assistant professor at the Pontifical Catholic University of Chile.

Octaves apart

Cross-cultural studies of how music is perceived can shed light on the interplay between biological constraints and cultural influences that shape human perception. McDermott’s lab has performed several such studies with the participation of Tsimane’ tribe members, who live in relative isolation from Western culture and have had little exposure to Western music.

In a study published in 2016, McDermott and his colleagues found that Westerners and Tsimane’ had different aesthetic reactions to chords, or combinations of notes. To Western ears, the combination of C and F# is very grating, but Tsimane’ listeners rated this chord just as likeable as other chords that Westerners would interpret as more pleasant, such as C and G.

Later, Jacoby and McDermott found that both Westerners and Tsimane’ are drawn to musical rhythms composed of simple integer ratios, but the ratios they favor are different, based on which rhythms are more common in the music they listen to.

In their new study, the researchers studied pitch perception using an experimental design in which they play a very simple tune, only two or three notes, and then ask the listener to sing it back. The notes that were played could come from any octave within the range of human hearing, but listeners sang their responses within their vocal range, usually restricted to a single octave.

Eduardo Undurraga, an assistant professor at the Pontifical Catholic University of Chile, runs a musical pitch perception experiment with a member of the Tsimane’ tribe of the Bolivian rainforest. Photo: Josh McDermott

Western listeners, especially those who were trained musicians, tended to reproduce the tune an exact number of octaves above or below what they heard, though they were not specifically instructed to do so. In Western music, the frequency of the same note doubles with each ascending octave, so tones with frequencies of 27.5 hertz, 55 hertz, 110 hertz, 220 hertz, and so on, are all heard as the note A.

Western listeners in the study, all of whom lived in New York or Boston, accurately reproduced sequences such as A-C-A, but in a different register, as though they heard the similarity of notes separated by octaves. However, the Tsimane’ did not.

“The relative pitch was preserved (between notes in the series), but the absolute pitch produced by the Tsimane’ didn’t have any relationship to the absolute pitch of the stimulus,” Jacoby says. “That’s consistent with the idea that perceptual similarity is something that we acquire from exposure to Western music, where the octave is structurally very important.”
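
The analysis logic is easy to state in code: convert frequencies to a log scale (semitones), check that the melodic intervals match (relative pitch), and ask whether the overall transposition is a whole number of octaves, i.e., a multiple of 12 semitones. The frequencies below are invented for illustration:

```python
import numpy as np

def semitones(freqs, ref=440.0):
    """Frequencies (Hz) on a log scale, in semitones relative to A440."""
    return 12 * np.log2(np.asarray(freqs) / ref)

stimulus = [1760.0, 2093.0, 1760.0]   # a high-register A-C-A
response = [220.0, 261.6, 220.0]      # the same tune sung three octaves lower

rel_stim = np.diff(semitones(stimulus))          # melodic intervals heard
rel_resp = np.diff(semitones(response))          # melodic intervals sung
shift = np.mean(semitones(response) - semitones(stimulus))

print("intervals preserved:", np.allclose(rel_stim, rel_resp, atol=0.1))
print(f"transposition: {shift:.1f} semitones")   # -36.0, exactly 3 octaves
```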

The ability to reproduce the same note in different octaves may be honed by singing along with others whose natural registers are different, or singing along with an instrument being played in a different pitch range, Jacoby says.

Limits of perception

The study findings also shed light on the upper limits of pitch perception for humans. It has been known for a long time that Western listeners cannot accurately distinguish pitches above about 4,000 hertz, although they can still hear frequencies up to nearly 20,000 hertz. In a traditional 88-key piano, the highest note is about 4,100 hertz.

People have speculated that the piano was designed to go only that high because of a fundamental limit on pitch perception, but McDermott thought it was possible that the opposite was true: the perceptual limit could have been culturally influenced by the fact that few musical instruments produce frequencies higher than 4,000 hertz.

The researchers found that although Tsimane’ musical instruments usually have upper limits much lower than 4,000 hertz, Tsimane’ listeners could distinguish pitches very well up to about 4,000 hertz, as evidenced by accurate sung reproductions of those pitch intervals. Above that threshold, their perceptions broke down, very similarly to Western listeners.

“It looks almost exactly the same across groups, so we have some evidence for biological constraints on the limits of pitch,” Jacoby says.

One possible explanation for this limit is that once frequencies reach about 4,000 hertz, the firing rates of the neurons of our inner ear can’t keep up and we lose a critical cue with which to distinguish different frequencies.

“The new study contributes to the age-long debate about the interplays between culture and biological constraints in music,” says Daniel Pressnitzer, a senior research scientist at Paris Descartes University, who was not involved in the research. “This unique, precious, and extensive dataset demonstrates both striking similarities and unexpected differences in how Tsimane’ and Western listeners perceive or conceive musical pitch.”

Jacoby and McDermott now hope to expand their cross-cultural studies to other groups who have had little exposure to Western music, and to perform more detailed studies of pitch perception among the Tsimane’.

Such studies have already shown the value of including research participants other than the Western-educated, relatively wealthy college undergraduates who are the subjects of most academic studies on perception, McDermott says. These broader studies allow researchers to tease out different elements of perception that cannot be seen when examining only a single, homogenous group.

“We’re finding that there are some cross-cultural similarities, but there also seems to be really striking variation in things that a lot of people would have presumed would be common across cultures and listeners,” McDermott says. “These differences in experience can lead to dissociations of different aspects of perception, giving you clues to what the parts of the perceptual system are.”

The research was funded by the James S. McDonnell Foundation, the National Institutes of Health, and the Presidential Scholar in Society and Neuroscience Program at Columbia University.

Finding the brain’s compass

The world is constantly bombarding our senses with information, but the ways in which our brain extracts meaning from this information remain elusive. How do neurons transform raw visual input into a mental representation of an object – like a chair or a dog?

In work published today in Nature Neuroscience, MIT neuroscientists have identified a brain circuit in mice that distills “high-dimensional” complex information about the environment into a simple abstract object in the brain.

“There are no degree markings in the external world; our current head direction has to be extracted, computed, and estimated by the brain,” explains Ila Fiete, an associate member of the McGovern Institute and senior author of the paper. “The approaches we used allowed us to demonstrate the emergence of a low-dimensional concept, essentially an abstract compass in the brain.”

This abstract compass, according to the researchers, is a one-dimensional ring that represents the current direction of the head relative to the external world.

Schooling fish

Trying to show that a data cloud has a simple shape, like a ring, is a bit like watching a school of fish. By tracking one or two sardines, you might not see a pattern. But if you could map all of the sardines, and transform the noisy dataset into points representing the positions of the whole school over time, and where each fish is relative to its neighbors, a pattern would emerge. This model would reveal a ring shape, a simple shape formed by the collective positions of hundreds of individual fish.

Fiete, who is also an associate professor in MIT’s Department of Brain and Cognitive Sciences, used a similar approach, called topological modeling, to transform the activity of large populations of noisy neurons into a data cloud the shape of a ring.
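
A simplified version of the approach can be sketched on synthetic data: simulate a population of noisy head-direction-tuned cells, then project the population activity into its top principal components and check that the cloud traces a ring. (The study applied topological tools to real recordings; PCA on toy tuning curves is used here just to convey the idea.)

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_samples = 100, 2000
preferred = rng.uniform(0, 2 * np.pi, n_cells)   # preferred head directions
theta = rng.uniform(0, 2 * np.pi, n_samples)     # head direction over time

# Cosine-tuned firing rates plus noise: each cell fires most when the
# head points at its preferred direction.
rates = np.exp(2 * np.cos(theta[:, None] - preferred[None, :]))
rates += rng.normal(0, 0.5, rates.shape)

# PCA via SVD of the mean-centered population activity.
centered = rates - rates.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ Vt[:2].T                  # top-2 principal components

radius = np.hypot(projected[:, 0], projected[:, 1])
print(f"radius spread / mean: {radius.std() / radius.mean():.2f}")  # small => a ring
```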

Simple and persistent ring

Previous work in fly brains revealed a physical ring of neurons, in a structure called the ellipsoid body, that represents changes in the direction of the fly’s head, and researchers suspected that such a system might also exist in mammals.

In this new mouse study, Fiete and her colleagues measured hours of neural activity from scores of neurons in the anterodorsal thalamic nucleus (ADN) – a region believed to play a role in spatial navigation – as the animals moved freely around their environment. They mapped how the neurons in the ADN circuit fired as the animal’s head changed direction.

Together these data points formed a cloud in the shape of a simple and persistent ring.

“This tells us a lot about how neural networks are organized in the brain,” explains Edvard Moser, Director of the Kavli Institute of Systems Neuroscience in Norway, who was not involved in the study. “Past data have indirectly pointed towards such a ring-like organization, but only now has it been possible, with the right cell numbers and methods, to demonstrate it convincingly.”

Their method for characterizing the shape of the data cloud allowed Fiete and colleagues to determine which variable the circuit was devoted to representing, and to decode this variable over time, using only the neural responses.

“The animal’s doing really complicated stuff,” explains Fiete, “but this circuit is devoted to integrating the animal’s speed along a one-dimensional compass that encodes head direction. Without a manifold approach, which captures the whole state space, you wouldn’t know that this circuit of thousands of neurons is encoding only this one aspect of the complex behavior, and not encoding any other variables at the same time.”

Even during sleep, when the circuit is not being bombarded with external information, this circuit robustly traces out the same one-dimensional ring, as if dreaming of past head direction trajectories.

Further analysis revealed that the ring acts as an attractor. If neurons stray off trajectory, they are drawn back to it, quickly correcting the system. This attractor property of the ring means that the representation of head direction in abstract space is reliably stable over time, a key requirement if we are to understand and maintain a stable sense of where our head is relative to the world around us.
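
This error-correcting behavior is the signature of a ring attractor, and a minimal rate model reproduces it: neurons arranged on a circle excite near neighbors and inhibit the rest, so a localized bump of activity forms and, after a perturbation, relaxes back to a clean bump. (An illustrative textbook-style model, not the network analyzed in the paper.)

```python
import numpy as np

n = 120
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
# Connectivity: local excitation (cosine profile) plus global inhibition.
W = 6 * np.cos(angles[:, None] - angles[None, :]) - 2.0

rng = np.random.default_rng(0)
r = np.exp(np.cos(angles - np.pi))     # a bump of activity at 180 degrees
r += rng.normal(0, 0.3, n)             # knock the neurons off the trajectory

dt = 0.1
for _ in range(500):                   # relax under threshold-linear dynamics
    r = r + dt * (-r + np.maximum(W @ r / n + 1.0, 0.0))

peak = np.degrees(angles[np.argmax(r)])
print(f"bump re-formed near {peak:.0f} degrees")  # drawn back to the ring
```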

“In the absence of this ring,” Fiete explains, “we would be lost in the world.”

Shaping the future

Fiete’s work provides a first glimpse into how complex sensory information is distilled into a simple concept in the mind, and how that representation autonomously corrects errors, making it exquisitely stable.

But the implications of this study go beyond coding of head direction.

“Similar organization is probably present for other cognitive functions so the paper is likely to inspire numerous new studies,” says Moser.

Fiete sees these analyses and related studies carried out by colleagues at the Norwegian University of Science and Technology, Princeton University, the Weizmann Institute, and elsewhere as fundamental to the future of neural decoding studies.

With this approach, she explains, it is possible to extract abstract representations of the mind from the brain, potentially even thoughts and dreams.

“We’ve found that the brain deconstructs and represents complex things in the world with simple shapes,” explains Fiete. “Manifold-level analysis can help us to find those shapes, and they almost certainly exist beyond head direction circuits.”