Article

Are Perceptually-Aligned Gradients a General Property of Robust Classifiers?

(2019). arXiv:1910.08640. Comment: To appear in the "Science Meets Engineering of Deep Learning" Workshop at NeurIPS 2019.

Abstract

For a standard convolutional neural network, optimizing over the input pixels to maximize the score of some target class will generally produce a grainy-looking version of the original image. However, researchers have demonstrated that for adversarially-trained neural networks, this optimization produces images that uncannily resemble the target class. In this paper, we show that these "perceptually-aligned gradients" also occur under randomized smoothing, an alternative means of constructing adversarially-robust classifiers. Our finding suggests that perceptually-aligned gradients may be a general property of robust classifiers, rather than a specific property of adversarially-trained neural networks. We hope that our results will inspire research aimed at explaining this link between perceptually-aligned gradients and adversarial robustness.
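The optimization described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a toy linear softmax classifier in place of a convolutional network, it takes the smoothed score to be the expected target-class probability under Gaussian input noise, and every name and parameter (`smoothed_target_grad`, `maximize_target`, `sigma`, `n_samples`, `lr`) is hypothetical. A linear model cannot produce perceptually meaningful images, so the sketch shows only the mechanics of gradient ascent on the input of a smoothed classifier.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def smoothed_target_grad(W, b, x, target, sigma, n_samples, rng):
    """Monte Carlo estimate of the gradient, with respect to x, of
    E_{eps ~ N(0, sigma^2 I)}[ softmax(W (x + eps) + b)[target] ].
    The linear classifier (W, b) is an assumption made for illustration."""
    noise = rng.normal(0.0, sigma, size=(n_samples, x.size))
    probs = softmax((x + noise) @ W.T + b)      # shape (n_samples, n_classes)
    p_t = probs[:, target]                      # target-class probabilities
    # d p_t / d logit_j = p_t * (1[j == target] - p_j)
    dlogits = -probs * p_t[:, None]
    dlogits[:, target] += p_t
    # Chain rule through logits = W x + b, then average over noise samples.
    return (dlogits @ W).mean(axis=0)

def maximize_target(W, b, x0, target, sigma=0.5, steps=100, lr=0.5,
                    n_samples=64, seed=0):
    """Plain gradient ascent on the input to raise the smoothed target score."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x += lr * smoothed_target_grad(W, b, x, target, sigma, n_samples, rng)
    return x

# Toy usage: three classes over a two-dimensional "image".
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x0 = np.zeros(2)
x_opt = maximize_target(W, b, x0, target=1)
print(softmax(W @ x0 + b)[1], softmax(W @ x_opt + b)[1])
```

Setting `sigma` to zero in this sketch would recover ordinary gradient ascent on an unsmoothed classifier, which is the baseline the abstract contrasts against.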
