Abstract
Adversarial examples are a pervasive phenomenon of machine learning models
where seemingly imperceptible perturbations to the input lead to
misclassifications for otherwise statistically accurate models. We propose a
geometric framework, drawing on tools from the manifold reconstruction
literature, to analyze the high-dimensional geometry of adversarial examples.
In particular, we highlight the importance of codimension: for low-dimensional
data manifolds embedded in high-dimensional space there are many directions off
the manifold in which to construct adversarial examples. Adversarial examples
are a natural consequence of learning a decision boundary that classifies the
low-dimensional data manifold well, but classifies points near the manifold
incorrectly. Using our geometric framework we prove (1) a tradeoff between
robustness under different norms, (2) that adversarial training in balls around
the data is sample inefficient, and (3) sufficient sampling conditions under
which nearest neighbor classifiers and ball-based adversarial training are
robust.
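The codimension point above can be illustrated with a small numeric sketch (not the paper's construction, just an assumption-laden toy): take a 1-dimensional data manifold embedded in R^d — here simply the first coordinate axis — and draw a random unit-norm perturbation in the ambient space. Almost all of the perturbation's mass falls in the d-1 directions off the manifold, so a fixed perturbation budget buys a large off-manifold displacement.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100                                    # ambient dimension
tangent = np.zeros(d)
tangent[0] = 1.0                           # toy 1-D data manifold: the first coordinate axis

# Random unit-norm perturbation in the ambient space
delta = rng.normal(size=d)
delta /= np.linalg.norm(delta)

on_manifold = abs(delta @ tangent)                 # component along the manifold
off_manifold = np.sqrt(1.0 - on_manifold**2)       # component in the d-1 normal directions

print(f"on-manifold component:  {on_manifold:.3f}")
print(f"off-manifold component: {off_manifold:.3f}")
```

For a random unit vector the on-manifold component concentrates around 1/sqrt(d) (about 0.1 for d = 100), while the off-manifold component is near 1 — a classifier need only be slightly wrong in the many normal directions for adversarial examples to exist close to the manifold.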