Abstract
Adversarial examples are carefully perturbed inputs designed to fool machine learning models. A well-acknowledged defense against such examples is adversarial training, in which adversarial examples are injected into the training data to increase robustness. In this paper, we propose a new attack that unveils an undesired property of state-of-the-art adversarial training: it fails to obtain robustness against perturbations in the $\ell_2$ and $\ell_\infty$ norms simultaneously. We also discuss a possible solution to this issue and its limitations.
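For context, the sketch below shows what a single adversarial-training step typically looks like: adversarial examples are generated on the fly (here with an $\ell_\infty$-bounded PGD attack) and the model is updated on them instead of the clean inputs. This is an illustrative sketch only, not the paper's specific attack or training recipe; the function names (`pgd_linf`, `adversarial_training_step`) and the values of `eps`, `alpha`, and `steps` are assumptions chosen for the example.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft l_inf-bounded adversarial examples with projected gradient descent (illustrative)."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)   # random start inside the l_inf ball
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()                # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)           # project back into the l_inf ball
        x_adv = x_adv.clamp(0, 1)                          # keep inputs in a valid pixel range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One adversarial-training step: fit the model on adversarial rather than clean inputs."""
    model.eval()
    x_adv = pgd_linf(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training against a single perturbation model in this way is exactly the setting the paper probes: a network hardened against one norm ball (here $\ell_\infty$) need not be robust to perturbations bounded in the other ($\ell_2$).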
Original language | Undefined/Unknown |
---|---|
Journal | arXiv preprint |
State | Published - May 15, 2019 |
Externally published | Yes |
Bibliographical note
4 pages, 2 figures, presented at the ICML 2019 Workshop on Uncertainty and Robustness in Deep Learning. arXiv admin note: text overlap with arXiv:1809.03113
Keywords
- cs.LG
- cs.CR
- stat.ML