Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations





Current state-of-the-art object recognition models are largely based on convolutional neural network (CNN) architectures, which are loosely inspired by the primate visual system. However, these CNNs can be fooled by imperceptibly small, explicitly crafted perturbations, and struggle to recognize objects in corrupted images that are easily recognized by humans. Here, by making comparisons with primate neural data, we first observed that CNN models with a neural hidden layer that better matches primate primary visual cortex (V1) are also more robust to adversarial attacks. Inspired by this observation, we developed VOneNets, a new class of hybrid CNN vision models. Each VOneNet contains a fixed weight neural network front-end that simulates primate V1, called the VOneBlock, followed by a neural network back-end adapted from current CNN vision models. The VOneBlock is based on a classical neuroscientific model of V1: the linear-nonlinear-Poisson model, consisting of a biologically-constrained Gabor filter bank, simple and complex cell nonlinearities, and a V1 neuronal stochasticity generator. After training, VOneNets retain high ImageNet performance, but each is substantially more robust, outperforming the base CNNs and state-of-the-art methods by 18% and 3%, respectively, on a conglomerate benchmark of perturbations comprised of white box adversarial attacks and common image corruptions. Finally, we show that all components of the VOneBlock work in synergy to improve robustness. While current CNN architectures are arguably brain-inspired, the results presented here demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in ImageNet-level computer vision applications.
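The VOneBlock described in the abstract follows the classical linear-nonlinear-Poisson (LNP) cascade. The toy NumPy sketch below illustrates that cascade; it is not the authors' implementation, and the filter parameters (kernel size, frequency, bandwidth) are arbitrary placeholders rather than the biologically-constrained values used in the paper:

```python
import numpy as np

def gabor_kernel(size, theta, freq, sigma):
    """One Gabor filter: a sinusoidal grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    envelope = np.exp(-(x * x + y * y) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * x_r)

def vone_block_sketch(image, n_orientations=4, rng=None):
    """Linear-nonlinear-Poisson cascade on a grayscale image (toy version).

    1. Linear: convolve with a small Gabor filter bank (via FFT here).
    2. Nonlinear: rectification for 'simple' cells, a crude phase-invariant
       magnitude for 'complex' cells.
    3. Poisson: sample spike counts whose rate is the rectified response.
    """
    rng = np.random.default_rng(rng)
    responses = []
    for i in range(n_orientations):
        k = gabor_kernel(9, theta=i * np.pi / n_orientations, freq=0.25, sigma=2.0)
        # circular 'same'-size convolution via zero-padded FFT (illustrative only)
        pad = np.zeros_like(image)
        pad[:k.shape[0], :k.shape[1]] = k
        linear = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(pad)))
        simple = np.maximum(linear, 0.0)    # simple-cell rectification
        complex_ = np.abs(linear)           # crude phase-invariant response
        responses.append(simple + complex_)
    rates = np.stack(responses)             # (orientations, H, W) firing rates
    return rng.poisson(rates)               # stochastic spike counts

# Usage: random 32x32 'image' -> 4 noisy orientation channels
out = vone_block_sketch(np.random.default_rng(0).random((32, 32)), rng=0)
print(out.shape)  # (4, 32, 32)
```

In the paper, the simple- and complex-cell channels are kept separate and the Gabor parameters are fit to primate V1 data; here they are merely summed to keep the sketch short.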

This paper has been published as a spotlight at the 2020 Neural Information Processing Systems (NeurIPS) conference.

IBM and MIT researchers find a new way to prevent deep learning hacks

Deep learning may have revolutionized AI – boosting progress in computer vision and natural language processing and impacting nearly every industry. But even deep learning isn’t immune to hacking.

Specifically, it’s vulnerable to a curious form of hacking dubbed ‘adversarial examples.’ In such an attack, a hacker very subtly changes an input in a specific way – such as imperceptibly altering the pixels of an image or the words in a sentence – forcing the deep learning system to fail catastrophically.

AI has to be robust to withstand such attacks. Adversarial robustness also covers defenses against ‘natural’ adversaries, be it white noise, black-outs, image corruptions, text typos or unseen data. While computer vision models are advancing rapidly, it’s possible to make them more robust by exposing them to subtly altered images through adversarial training. But this process is computationally expensive and imperfect; there will always be outlier images that trip the model up.

And this is what recent research described in a paper presented at this year’s NeurIPS conference aims to change.

In the study, a team of neuroscientists from MIT and the MIT-IBM Watson AI Lab investigated how neuroscience and AI can inform one another. They’ve explored whether the human brain can offer clues on how to make deep neural networks (DNNs) even more powerful and secure. Turns out it can.

The paper describes a new biology-inspired model, dubbed VOneNet (for V1, a specific region of the brain), that can help defend AI models against malicious adversarial attacks.

The research was led by Harvard graduate student Joel Dapello, James DiCarlo, head of MIT’s Department of Brain and Cognitive Sciences, and MIT postdoc Tiago Marques. They worked together with MIT graduate student Martin Schrimpf, MIT visiting student Franziska Geiger, and MIT-IBM Watson AI Lab Co-director David Cox to gain insight from the brain’s truly mysterious ways.

Understanding the brain

By its very nature, deep learning is loosely based on the functioning of the brain: deep neural networks (DNNs) are inspired by the structure of biological nervous systems. They are composed of individual ‘cells’ – neurons – connected to each other by ‘synapses’. “Like in the brain, organizing these elements in a ‘deep’ hierarchy of successive processing stages gives the artificial deep neural networks much of their power,” says IBM researcher David Cox.

However, adversarial attacks highlight a big difference in how deep neural networks and our brains perceive the world. Humans are not fooled at all by the subtle alterations that are able to trick deep neural networks, and our visual systems seem to be substantially more robust. Animal camouflage and optical illusions are probably the closest equivalent to adversarial examples against our brains.

But with a machine, it’s possible to carefully perturb the pixels in the image of a stop sign to trick a deep learning-based computer vision system into misclassifying it as a speed limit sign or anything else the adversary chooses, even though the image looks unchanged to the human eye. It is even possible to create physical objects that will trick AI-based systems, irrespective of the direction the object is viewed from, or how it is lit.
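In its simplest form, such a perturbation can be generated with the fast gradient sign method (FGSM): nudge every pixel by a small amount ε in the direction of the sign of the loss gradient. The sketch below applies it to a toy logistic-regression ‘classifier’ rather than a deep network; all names and values are illustrative:

```python
import numpy as np

def fgsm_attack(x, w, b, y_true, epsilon):
    """Fast gradient sign method on a logistic-regression 'classifier'.

    For a sigmoid model p = sigma(w.x + b) with cross-entropy loss, the
    gradient of the loss w.r.t. the input x is (p - y_true) * w.
    Each pixel moves by at most epsilon, so the change is tiny -- yet
    the model's confidence can collapse.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad = (p - y_true) * w             # dL/dx for sigmoid + cross-entropy
    return x + epsilon * np.sign(grad)  # bounded, worst-case perturbation

rng = np.random.default_rng(0)
w = rng.normal(size=100)                # toy 'model' weights
x = rng.normal(size=100)                # toy 'image' (flattened)
b = 0.0
label = 1.0
before = 1.0 / (1.0 + np.exp(-(w @ x + b)))
x_adv = fgsm_attack(x, w, b, label, epsilon=0.1)
after = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
print(before, after)  # confidence in the true class drops after the attack
```

Against a deep network the gradient is obtained by backpropagation instead of this closed form, but the principle is identical: a perturbation bounded by ε per pixel, aimed exactly where the model is most sensitive.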

While researchers have made some progress in defending against these kinds of attacks, first discovered in 2013, they remain a serious barrier to the wide deployment of deep learning-based systems. The current approach, called adversarial training, is also extremely computationally expensive. And this is exactly what the new research paper is trying to address.

Learning from biology

The MIT-IBM collaboration has been uncovering useful tricks from neuroscience to infuse into our AI systems for years. Recently, the DiCarlo Lab has developed metrics for comparing data collected from the human brain with artificial neural networks, to understand which systems are closer or further away from biology.

In the latest study, the team explored the adversarial robustness of different models and studied if that was related to how similar they were to the brain. “To our surprise, we have found a strong relationship,” says Cox. “The more adversarially robust a model was, the more closely it seemed to match a particular brain area—V1, the first processing stage of visual information in the cerebral cortex.”

So the team decided to add some well-known elements of V1 processing at the input stage of a standard DNN. They found that this addition made any model substantially more robust. On top of that, including this block adds little complexity or training cost to the models. It’s computationally much cheaper than typical adversarial training, and surprisingly effective. It also confers robustness against other kinds of image degradation, like added noise.
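Architecturally, the recipe amounts to composing a fixed, untrained front-end with an ordinary trainable back-end; during training, gradient updates touch only the back-end. A framework-agnostic sketch (class and parameter names are illustrative, not from the paper):

```python
import numpy as np

class FixedFrontEnd:
    """Stands in for the V1-like block: weights are set once, never trained."""
    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(out_dim, in_dim)) / np.sqrt(in_dim)
    def __call__(self, x):
        return np.maximum(self.w @ x, 0.0)  # fixed filters + rectification

class TrainableBackEnd:
    """Stands in for the CNN back-end; only these weights would be updated."""
    def __init__(self, in_dim, out_dim, seed=1):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(out_dim, in_dim)) / np.sqrt(in_dim)
    def __call__(self, x):
        return self.w @ x

front = FixedFrontEnd(64, 32)    # frozen: receives no gradient updates
back = TrainableBackEnd(32, 10)  # trained as usual
x = np.random.default_rng(2).random(64)
logits = back(front(x))          # hybrid model: fixed stage -> trainable stage
print(logits.shape)  # (10,)
```

Because the front-end is fixed, there are no extra parameters to learn, which is why the approach adds essentially no training cost compared to the base CNN.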

Their brain-inspired model, VOneNet, outperforms state-of-the-art defenses under white-box attacks, where the attacker has access to the model architecture, as well as under black-box attacks, where the attacker has no visibility inside the model. And it does so with little added cost.

While impressive, “there’s certainly more work to be done to ensure models are invulnerable to adversarial attacks,” says Cox. And it’s not just a problem for computer vision. What’s clear, Cox adds, is that this research shows the need to keep learning from neuroscience to further boost adversarial robustness – and vice versa, to understand why something works in an artificial system, and how it can possibly help improve our still limited understanding of the human brain.

This post originally appeared on the IBM Research Blog.

Please cite our work using the BibTeX below.

@article{Dapello2020.06.16.154542,
	author = {Dapello, Joel and Marques, Tiago and Schrimpf, Martin and Geiger, Franziska and Cox, David D. and DiCarlo, James J.},
	title = {Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations},
	elocation-id = {2020.06.16.154542},
	year = {2020},
	doi = {10.1101/2020.06.16.154542},
	publisher = {Cold Spring Harbor Laboratory}
}