Abstract
Adversarial examples are a pervasive phenomenon of machine learning models
where seemingly imperceptible perturbations to the input lead to
misclassifications for otherwise statistically accurate models. We propose a
geometric framework, drawing on tools from the manifold reconstruction
literature, to analyze the high-dimensional geometry of adversarial examples.
In particular, we highlight the importance of codimension: for low-dimensional
data manifolds embedded in high-dimensional space there are many directions off
the manifold in which to construct adversarial examples. Adversarial examples
are a natural consequence of learning a decision boundary that classifies the
low-dimensional data manifold well, but classifies points near the manifold
incorrectly. Using our geometric framework we prove (1) a tradeoff between
robustness under different norms, (2) that adversarial training in balls around
the data is sample inefficient, and (3) sufficient sampling conditions under
which nearest neighbor classifiers and ball-based adversarial training are
robust.
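The codimension point above can be illustrated with a small numeric sketch (not the paper's construction, just an assumption-laden toy): take a 1-dimensional data manifold embedded in R^d — here simply the first coordinate axis — and draw a random unit-norm perturbation in the ambient space. Almost all of the perturbation's mass falls in the d-1 directions off the manifold, so a fixed perturbation budget buys a large off-manifold displacement.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100                                    # ambient dimension
tangent = np.zeros(d)
tangent[0] = 1.0                           # toy 1-D data manifold: the first coordinate axis

# Random unit-norm perturbation in the ambient space
delta = rng.normal(size=d)
delta /= np.linalg.norm(delta)

on_manifold = abs(delta @ tangent)                 # component along the manifold
off_manifold = np.sqrt(1.0 - on_manifold**2)       # component in the d-1 normal directions

print(f"on-manifold component:  {on_manifold:.3f}")
print(f"off-manifold component: {off_manifold:.3f}")
```

For a random unit vector the on-manifold component concentrates around 1/sqrt(d) (about 0.1 for d = 100), while the off-manifold component is near 1 — a classifier need only be slightly wrong in the many normal directions for adversarial examples to exist close to the manifold.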