TY - JOUR
T1 - Wild Patterns Reloaded
T2 - A Survey of Machine Learning Security against Training Data Poisoning
AU - Cinà, Antonio Emanuele
AU - Grosse, Kathrin
AU - Demontis, Ambra
AU - Vascon, Sebastiano
AU - Zellinger, Werner
AU - Moser, Bernhard A.
AU - Oprea, Alina
AU - Biggio, Battista
AU - Pelillo, Marcello
AU - Roli, Fabio
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023/12/31
Y1 - 2023/12/31
AB - The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model’s performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the past 15 years. We start by categorizing the current threat models and attacks and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning and shed light on the current limitations and open research questions in this research field.
KW - backdoor attacks
KW - computer security
KW - computer vision
KW - machine learning
KW - poisoning attacks
UR - https://www.scopus.com/pages/publications/85165650976
U2 - 10.1145/3585385
DO - 10.1145/3585385
M3 - Article
AN - SCOPUS:85165650976
SN - 0360-0300
VL - 55
JO - ACM Computing Surveys
JF - ACM Computing Surveys
IS - 13s
M1 - 294
ER -