Research

An Adversarial Framework for Generating Unseen Images by Activation Maximization

Paper

Authors

Published on

05/01/2022

Activation maximization (AM) refers to the task of generating input examples that maximize the activation of a target class of a classifier, which can be used for class-conditional image generation and model interpretation. A popular class of AM method, GAN-based AM, introduces a GAN pre-trained on a large image set, and performs AM over its input random seed or style embeddings, so that the generated images are natural and adversarial attacks are prevented. Most of these methods would require the image set to contain some images of the target class to be visualized. Otherwise they tend to generate other seen class images that most maximizes the target class activation. In this paper, we aim to tackle the case where information about the target class is completely removed from the image set. This would ensure that the generated images truly reflect the target class information residing in the classifier, not the target class information in the image set, which contributes to a more faithful interpretation technique. To this end, we propose PROBEGAN, a GANbased AM algorithm capable of generating image classes unseen in the image set. Rather than using a pre-trained GAN, PROBEGAN trains a new GAN with AM explicitly included in its training objective. PROBEGAN consists of a classconditional generator, a seen-class discriminator, and an allclass unconditional discriminator. It can be shown that such a framework can generate images with the features of the unseen target class, while retaining the naturalness as depicted in the image set. Experiments have shown that PROBEGAN can generate unseen-class images with much higher quality than the baselines. We also explore using PROBEGAN as a model interpretation tool.1